Histogram Matching

Earth Engine by Example

Noel Gorelick
Apr 15 · 4 min read

Histogram matching is a quick and easy way to “calibrate” one image to match another. In mathematical terms, it’s the process of transforming one image so that the cumulative distribution function (CDF) of values in each band matches the CDF of bands in another image.

To illustrate what this looks like and how it works, I’m going to histogram-match a high-resolution (0.8m/pixel) SkySat image to the Landsat 8 calibrated surface reflectance images taken around the same time. Below is what it looks like with the SkySat image overlaid on top of the Landsat data, before the matching. Each image’s histogram is shown as well:

SkySat image swath overlaid on Landsat 8 image (top); cumulative histogram for SkySat (left) and Landsat 8 surface reflectance (right).

To make the histograms match, we can interpolate the values from the source image (SkySat) into the range of the target image (Landsat), using a piecewise-linear function that puts the correct ratio of pixels into each bin.

Below is the code to generate the piecewise-linear function using the cumulative count histograms of each image.

This code starts by splitting the (2D Array) histograms into the pixel values (column 0) and pixel counts (column 1), and normalizes the counts by dividing by the total count (the last value).

Next, for each source bin, it finds the index of the first bin in the target histogram where targetCount ≥ srcCount[i]. To determine that, we compare each value from sourceCount to the entire array of targetCounts. This comparison generates an array of 0s where the comparison is false and 1s where the comparison is true. The index of the first non-zero value can be found using the Array.argmax() function, and using that index, we can determine the targetValue that each srcValue should be adjusted to. The output of this function is formatted as a dictionary that’s suitable for passing directly into the Image.interpolate() function.

Next, here’s the code for generating the histograms and adjusting the images.

This code runs a reduceRegion() on each image to generate a cumulative histogram, making sure that only pixels that are in both images are included when computing the histograms (just in case there might be a cloud or something else just outside of the high-res image, that might distort the results). It’s not important to generate that histogram with a really high fidelity, so the maxPixels argument is set to use less than “4 tiles” of data (256 * 256 * 4) and bestEffort is turned on, to make the computation run fast. When these arguments are set this way, the reduceRegion() function will try to figure out how many pixels it would need to process at the given scale (30 m), and if that’s greater than the maxPixels value, it computes a lower scale to keep the total number of pixels below maxPixels. That all means you need to specify a scale, but it doesn’t matter what it is as it’ll be mostly ignored.

This code then generates the lookup tables for each band in the input image, calls the interpolate() function for each, and combines the results into a single image.

The final results look like this. Until you zoom in really far, it’s nearly impossible to tell where the Landsat image ends and the Skysat image begins.

Final color-matched SkySat image overlaying Landsat 8 surface reflectance

Here’s a close up of a detailed region in each of the 3 images:

A close up showing the original Landsat data (left), the unmodified SkySat image (middle) and the color-matched result (right)

The whole script has some additional functions to find the images closest in time, but as you can see, the histogram matching is pretty easy and usually does a great job overall.

https://code.earthengine.google.com/0dd13bac8afbae60d424b65e11f69232

  • If there’s anything anomalous in your image that’s not in the reference image (or vice versa), like clouds, the CDF can end up skewed, and the histogram matching results might not look that good.
  • A little mis-registration between the source and target images is usually ok, since it is using the statistics of the whole region and doesn’t really rely on a pixel-to-pixel correspondence.

Google Earth and Earth Engine

For developers, scientists, explorers and storytellers

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store