I may be able to shed some light on what’s known as “data calibration” because I read a lot about climate change in general.
Typically, data calibration simply means correcting known anomalies or biases within the data. Satellite temperature readings, for example, are affected by the satellite’s orbit and whether that orbit is steady or decaying, so these effects have to be accounted for by calibrating the data.
For ground-based thermometers, corrections are made for Time of Day (TOD) bias, since the time of collection used to be different than it is now. The collection time is now a global standard, so modern temps should have no TOD bias, but old recorded temps will.
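As a rough sketch of what a TOD correction could look like, assuming the old collection time introduced one constant bias that stopped at a known changeover year. The function name, the 1980 changeover, and the 0.5 °F bias are all made up for illustration and are not taken from any agency’s actual method.

```python
def remove_tod_bias(readings, changeover_year, old_bias_f):
    """Subtract an assumed constant bias from readings taken before the changeover.

    readings: list of (year, temp_f) tuples.
    changeover_year and old_bias_f are hypothetical inputs for this sketch.
    """
    adjusted = []
    for year, temp_f in readings:
        if year < changeover_year:
            # Old collection time: remove the assumed bias.
            temp_f = temp_f - old_bias_f
        adjusted.append((year, temp_f))
    return adjusted


history = [(1950, 60.5), (1960, 61.5), (2000, 62.0)]
adjusted = remove_tod_bias(history, changeover_year=1980, old_bias_f=0.5)
```

Only the two older readings change here; the 2000 reading is left alone because it was taken under the modern standard.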
Another correction is for the Urban Heat Island (UHI) effect. Each “bureau of meteorology” pretty much chooses how much to correct certain thermometers based on a chosen UHI number. Urban readings can be up to 7 °F higher than the surrounding rural temperatures, but not always that much, depending on the urban setting and where the thermometer sits, so the real effect is highly variable. Even so, they tend to correct with a single choice of temperature for every heat island instead of a variable number based on real local factors.
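To put numbers on the single-value problem, here is a small sketch. The three “true” UHI offsets and the flat 4 °F correction are invented for illustration; the point is only that one flat number leaves a different leftover error at each station.

```python
def residual_after_flat_correction(true_uhi_offsets_f, flat_correction_f):
    """Return the leftover error at each station after one flat UHI correction.

    Positive means the station is still too warm (under-corrected);
    negative means it was cooled too much (over-corrected).
    """
    return [true - flat_correction_f for true in true_uhi_offsets_f]


# Hypothetical heat islands whose real offsets are 2, 4 and 7 °F.
residuals = residual_after_flat_correction([2.0, 4.0, 7.0], flat_correction_f=4.0)
```

With these made-up inputs, the first station ends up 2 °F too cool, the second is corrected exactly, and the third is still 3 °F too warm.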
The third one I’m familiar with is extreme anomalies. These are temperatures that are way outside the norm for a station, and I believe they are verified by checking the temperatures at other nearby stations before homogenizing or removing them (Australia’s BOM recently got caught removing less than 10C temps from some thermometers in real time). Of course, such readings are usually brushed aside as a problem with the thermometer. However, given the spacing between thermometers, it could just be that some local factor actually caused the anomaly. Still, since these anomalies show up at both extremes, homogenizing them out probably has no overall effect on global temperatures if done correctly.
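A minimal sketch of the neighbor check described above, assuming a reading is flagged when it differs from the median of nearby stations by more than some tolerance. The 5 °C tolerance and the function name are my own placeholders; real homogenization procedures are more involved than this.

```python
from statistics import median


def is_extreme_anomaly(reading_c, neighbor_readings_c, tolerance_c=5.0):
    """Flag a reading that sits far from the median of nearby stations.

    tolerance_c is an assumed threshold chosen for this sketch.
    With no neighbors there is nothing to compare against, so nothing is flagged.
    """
    if not neighbor_readings_c:
        return False
    return abs(reading_c - median(neighbor_readings_c)) > tolerance_c


# A cold reading that nearby stations back up is kept...
cold_but_plausible = is_extreme_anomaly(-12.0, [-9.5, -10.0, -11.0])
# ...while a reading far from all its neighbors is flagged.
far_from_neighbors = is_extreme_anomaly(25.0, [10.0, 11.0, 12.0])
```

Note the weakness mentioned above: if the nearest stations are far away, a genuine local event would be flagged just like a faulty sensor.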
The last correction that I’m aware of is what I’d call “spreading the love.” This essentially means extrapolating temperature data to areas where we don’t really have any data. This pretty much only applies to Africa, South America (only the Amazon rainforest areas; the rest of it is pretty well covered), and the oceans. Africa has hardly any thermometers, and many are thousands of km apart, whereas first world countries are well covered, with small spacing between thermometers. Taking accurate temperatures in the Amazon is pretty hard: a thermometer that isn’t under the canopy would only represent temperatures increased by land use rather than local temperatures, but one under the canopy would read slightly cooler because the canopy blocks a lot of sunlight. Lastly, with the oceans we have a number of floats and ships taking temperatures, but it’s not even close to global coverage, and a lot of the temperatures are extrapolated to far larger areas than is reasonable. Those extrapolations are also likely to include land-based thermometers, even though land and water react very differently to solar heating when it comes to air temperatures.
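To show what this kind of gap filling can look like, here is a sketch using inverse-distance weighting, where nearer stations count for more. I picked that technique only because it is simple; I am not claiming it is what any bureau actually uses. The station coordinates and temperatures are invented.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def idw_estimate(lat, lon, stations, power=2.0):
    """Estimate a temperature at (lat, lon) from (lat, lon, temp) stations.

    Each station is weighted by 1 / distance**power.
    """
    if not stations:
        raise ValueError("need at least one station")
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lat, s_lon, temp in stations:
        d = haversine_km(lat, lon, s_lat, s_lon)
        if d == 0.0:
            return temp  # Sitting exactly on a station: use its reading.
        w = 1.0 / d ** power
        weighted_sum += w * temp
        weight_total += w
    return weighted_sum / weight_total


# Two hypothetical stations on the equator, 20 degrees of longitude apart.
stations = [(0.0, -10.0, 20.0), (0.0, 10.0, 30.0)]
midpoint_estimate = idw_estimate(0.0, 0.0, stations)
gap_km = haversine_km(0.0, -10.0, 0.0, 10.0)
```

The midpoint gets a value halfway between the two stations even though they are over 2,000 km apart, and nothing in the calculation knows whether the ground in between is ocean, forest, or desert.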
Those are what I believe mostly go into what’s known as data calibration, and I hope it helps. I’m also happy to be proven wrong if I’m incorrect.