How to Co-Register Temporal Stacks of Satellite Images

EO Research · Published in Sentinel Hub Blog · 9 min read · Jan 18, 2023

Written by Devis Peressutti. Work performed by Nejc Vesel, Matic Lubej, Nika Oman Kadunc, Matej Batič, Sara Verbič, Žiga Lukšič, Jan Geršak and Devis Peressutti.

Geometric errors cause misalignments between consecutive satellite image acquisitions, which in turn impair land monitoring and change detection analysis. In this blog post we investigate how we can robustly co-register a temporal stack of optical satellite images to reduce the impact of such misalignments.

Figure 1. Sentinel-2 L1C time lapse of a road in eastern Slovenia. Time period ranges from March 2021 to August 2021. Contains Copernicus Sentinel-2 data modified with Sentinel Hub 2022.

Have you ever seen the amazing Sentinel-2 time-lapses the users of Sentinel Hub create? They tell incredible stories about how Earth is changing, and they are the stepping stone for many monitoring and change detection systems. If you have seen the time-lapses or ever created one, you have also noticed the characteristic flickering between time frames. This is due to errors affecting the geo-location of images, typically a combination of random (aleatory) and systematic errors. These errors are well characterized and monitored by satellite manufacturers, as you can see in the Sentinel-2 data quality report. The extracts below show the geometric errors for Sentinel-2A and Sentinel-2B for unrefined products (processed before January 2022) and refined products (processed after January 2022). The report gives a 12 m mean circular shift for unrefined products and a 4 m mean circular shift for refined products at 95.5 % confidence. These numbers are relatively low if you consider the 10 m pixel size of Sentinel-2 and the complexity of the acquisition and ortho-rectification processes.

Figure 2. Distribution of mean circular error for Sentinel-2A and 2B satellites for unrefined products, i.e. processed before 1st January 2022.
Figure 3. Distribution of mean circular error for Sentinel-2A and 2B satellites for refined products, i.e. processed after 1st January 2022.

However, in some cases the misalignments can affect the analysis of temporal variations. This is particularly true for objects of a size comparable to the circular error, as you can see in the animation above. Analysis of roads, bridges and buildings is proportionally more impacted by geo-location errors than analysis of larger objects, such as forests or larger agricultural parcels. These errors are inherent to the positioning and ortho-rectification process and can only be corrected to a certain level. They are most noticeable in data collections with high temporal resolution that enable land monitoring, such as Sentinel-2, Landsat, and PlanetScope.

We have built quite a few machine learning (ML) products for land and area monitoring, change-detection, and temporal analysis of satellite imagery. We have noticed that these geometric errors are a contributing source of noise in our results. As an example, our field delineation algorithm performs a pixel-wise temporal merging of predictions to derive a more robust estimate of where the boundary is. Geometric errors introduce fictitious boundary changes that affect the final result, as shown below.

Figure 4. Changes in probability of the arrowed pixel being a field boundary pixel or not. The fluctuations in probability are due to geo-location errors.

We therefore set out to develop a workflow to compensate for geometric errors. We wanted the process to be agnostic to the data collection, to work over a large time interval and be robust to large changes in land cover. We determined that image-based co-registration was the best way to achieve all of this.

The aim of co-registration is to estimate the motion occurring between two images, here referred to as the template and the input image. The estimated motion can then be applied to align the input image to the template. By repeating this process for all time frames, we obtain a temporal stack aligned to the template image. In a nutshell, image-based registration is an optimization algorithm in which the input image is iteratively transformed with a parameterized motion model until the two images look the same, according to an objective function. Motion models range from translation, where the input image can only shift in the x and y directions, to affine transformations, where rotation, scaling and skewing are also modelled. In addition, non-rigid deformations can be modelled, adding degrees of freedom and complexity to the optimization. As for the objective function, the most common ones are based on the mean square difference, cross-correlation or mutual information, depending on the data collections being co-registered.
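To make the idea concrete, here is a minimal, self-contained sketch of the simplest case: estimating a pure translation between two frames. It uses phase correlation, a closed-form solver for the translation motion model (note this is an illustration of the concept, not the algorithm we eventually settled on):

```python
import numpy as np

def estimate_translation(template, image):
    """Estimate the (row, col) shift that aligns `image` to `template`
    via phase correlation: the normalized cross-power spectrum of two
    translated images is a pure phase ramp, whose inverse FFT is a
    delta function located at the shift."""
    f_template = np.fft.fft2(template)
    f_image = np.fft.fft2(image)
    cross_power = f_template * np.conj(f_image)
    cross_power /= np.abs(cross_power) + 1e-12  # keep phase only
    corr = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint wrap around and encode negative shifts
    return tuple(int(p) if p <= s // 2 else int(p - s)
                 for p, s in zip(peak, corr.shape))

rng = np.random.default_rng(0)
template = rng.random((64, 64))
moved = np.roll(template, (3, -5), axis=(0, 1))  # input: shifted copy
print(estimate_translation(template, moved))     # → (-3, 5)
```

Applying the recovered shift to the input image (here, `np.roll(moved, (-3, 5), axis=(0, 1))`) brings it back into alignment with the template.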

We ran many experiments and tested different algorithms and parameters, so you don’t have to! Among the things we explored were:

  • which template image should be used, e.g. a random time frame, the average of the time frames;
  • which band or band combination carries the best information to register images over a large time interval;
  • which motion model gives the best trade-off between simplicity, convergence speed and accuracy;
  • which metrics should be tracked to evaluate the performance of the algorithm, given that ground-truth information is not available;
  • which objective function is more appropriate for optical satellite imagery.

We carried out the experiments independently for Sentinel-2 and PlanetScope time series, and we found that the same workflow and parameter combination showed optimal results for both data collections. To evaluate the performance, we looked at improvements in image similarity (measured by the Mean Squared Error and the Structural Similarity Index) of the registered stack over the unregistered one with respect to the template image. We also looked at the distribution of the estimated circular shifts and at the execution time of the algorithm. In addition, as we focused on field delineation, we tracked the variation of the probability of pixels being a boundary for registered and unregistered stacks. The assumption is that the registered stack should show less variation in these probabilities.
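As a rough illustration of the image-similarity part of this evaluation, a hypothetical helper could compare the per-frame MSE against the template before and after registration (the SSIM comparison is analogous; this is not the actual evaluation code):

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images."""
    return float(np.mean((a - b) ** 2))

def mse_improvement(template, unregistered, registered):
    """Per-frame ratio of registered to unregistered MSE against the
    template. Values below 1 mean the registered frame is more similar
    to the template than the unregistered one."""
    return [mse(template, reg) / mse(template, unreg)
            for unreg, reg in zip(unregistered, registered)]

# Toy example: the "registered" frame is closer to the template
template = np.zeros((4, 4))
ratios = mse_improvement(template,
                         [np.ones((4, 4))],        # unregistered frame
                         [np.full((4, 4), 0.5)])   # registered frame
print(ratios)  # → [0.25]
```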

We considered an area-of-interest (AOI) in the eastern part of Slovenia with an abundance of small agricultural parcels.

The winning parameter combo was found to be the following:

  • use the average temporal image as the template image. This choice may seem counterintuitive, especially because over a large time interval the average image is stripped of real features. However, the average image represents the best, although noisy, estimate of the true location, and has some degree of similarity to all frames;
  • use the gradient of the averaged RGB channels. The co-registration algorithm uses a single band for its optimization, so we investigated which single band or combination of bands would lead to robust results. An intensity image created by averaging the R, G and B bands outperformed any single band considered separately. In addition, we tested using the gradient of the intensity values, to reduce the impact of different acquisition conditions, such as lighting and contrast. Using the gradient further boosted the performance, as it also focuses the registration on aligning features of interest like edges and contours;
  • use a translation-only motion model. We have seen that adding rotation or scale slows down the optimization without bringing additional improvements in the tracked metrics. Naturally, residual non-rigid misalignments can still be found in the registered stack. Note that the most suitable model might depend on the size of the scene being co-registered;
  • after testing a number of implementations using different algorithms and objective functions, we settled on the `findTransformECC` method from the OpenCV library, as it allows tuning many arguments and supports a mask defining valid and invalid pixels. The method uses the Enhanced Correlation Coefficient (ECC) objective function.

Let’s look at some of the results! Here is the same scene as above with co-registered time frames. You can now see that small bridges and roads no longer fluctuate.

Figure 5. Co-registered stack of Sentinel-2 L1C shown in Figure 1. Fluctuations visible around smaller features like bridges, roads and houses have been corrected for. Contains Copernicus Sentinel-2 data modified with Sentinel Hub 2022.

And here are some more examples from both Sentinel-2 and PlanetScope time series. Notice how the land changes over the time interval, i.e. March 2021 to August 2021 for both data collections.

Figure 6. Example of unregistered (left) and co-registered (right) stacks of Sentinel-2 L1C images for the analysed AOI. The stacks intentionally span a large time interval, displaying large land-cover changes as well as varying atmospheric conditions, such as haze, snow and clouds. Nevertheless, the co-registration algorithm converges to plausible solutions for all time frames. Contains Copernicus Sentinel-2 data modified with Sentinel Hub 2022.
Figure 7. Example of unregistered (left) and co-registered (right) stacks of PlanetScope (© Planet Labs 2022) images for the analysed AOI. The stacks include images from different satellites, such as Doves and SuperDoves, hence the visible differences in spectrum and spatial resolution. Nevertheless, the registration algorithm is able to deal robustly with such differences.

Looking at the registered stacks for both Sentinel-2 and PlanetScope, you can appreciate that the translational displacements are correctly recovered, and that the remaining misalignments between frames mostly show non-rigid (local) components. If we look at the estimated translations, we notice that the distributions are not purely random, i.e. the values do not seem to be drawn from a normal distribution. Bear in mind that we are not looking at the real distribution of the circular error, but at the one estimated by the registration algorithm, which introduces its own biases. Nevertheless, the values are distributed around (0, 0), which justifies the use of the average image as a noisy estimate of the true location.

Figure 8. Estimated shifts in pixels for Sentinel-2 (left) and PlanetScope (right). The correlations seen in the figures might be due to biases in the algorithm used for geo-locating the tiles and by our own image-based registration algorithm.

Looking at the distribution of the norm of the estimated shifts for Sentinel-2, i.e. the estimate of the circular error, we can see that it roughly resembles the values reported in the data quality report, shown in Figures 2 and 3. We also repeated the registration process for 2022 for the same AOI to see if we could observe a clear reduction in the estimated circular shift due to the improved positioning algorithm. Results are shown in Figure 9, where indeed we see that the distribution of estimated circular errors is closer to 0 compared to the unrefined products.

Figure 9. Comparison of estimated circular shift for Sentinel-2 image for 2021 (top) and 2022 (bottom). From January 2022 the products are positioned with a more accurate processor, which improves the overall geo-location of Sentinel-2 products.

The results above show the positive effects of co-registration. What about the effect on the temporal analysis in field delineation? Looking at the variations of the probabilities denoting whether a pixel belongs to an agricultural field or not, we can see that co-registration has a positive effect there as well.

Figure 10. Time series of the probability of the arrowed pixel being inside an agricultural field for the registered stack. Compared with Figure 4, one can notice the reduction in fluctuations.

The following figure shows the sum of relative differences of the probability of each pixel belonging to a field for the unregistered and co-registered stacks. For both Sentinel-2 and PlanetScope the reduction in variability can be seen for the majority of pixels, meaning that we can reduce the variability due to geometric errors. The residual variability now better reflects the actual changes in the agricultural parcel.

Figure 11. Examples of the reduction in variability of the per-pixel probability of belonging to an agricultural field for PlanetScope (left) and Sentinel-2 (right). The values shown are the sums of the absolute differences of the probabilities for the unregistered (x-axis) and co-registered (y-axis) stacks.
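For reference, a variability measure of this kind (the exact definition behind the figure above is assumed here; this is one plausible formulation) can be computed as:

```python
import numpy as np

def temporal_variability(prob_stack):
    """Sum of absolute frame-to-frame differences of the per-pixel
    probability along the time axis: one variability score per pixel.
    A pixel whose probability jitters between frames scores higher."""
    return np.abs(np.diff(prob_stack, axis=0)).sum(axis=0)

steady = np.array([0.8, 0.8, 0.8, 0.8])    # stable pixel
jittery = np.array([0.8, 0.2, 0.9, 0.3])   # fluctuating pixel
print(temporal_variability(steady), temporal_variability(jittery))
```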

As a consequence of the research presented here, we also updated the co-registration sub-package in eo-learn. The update simplifies the co-registration module to include only the ECC co-registration task that showed the best performance during our experiments. Some methods were removed. If these changes affect you, please get in touch with us.

Thanks for reading. Get in touch with us at eoresearch@sinergise.com for any question, comment, complaint or appreciation.

Bonus content

Here is a different way of looking at the temporal series of images, which highlights the effect of co-registration better than gif animations. If you imagine the time-series as a spatio-temporal data cube, we can look at the data by fixing one spatial dimension, e.g. cutting through the cube at a given northing/easting coordinate and looking at the slice, as we do with a slice of cake.
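In NumPy terms, with a hypothetical cube of shape (time, northing, easting, bands), the "slice of cake" is just a fixed index along one spatial axis:

```python
import numpy as np

# A hypothetical spatio-temporal data cube: (time, northing, easting, bands)
cube = np.zeros((20, 100, 120, 3))

# Cut through the cube at a given northing row and keep the full
# time and easting extents: each column of the resulting image shows
# how one pixel evolves over the whole time series.
row = 50                      # hypothetical northing index
slice_ = cube[:, row, :, :]   # shape: (time, easting, bands)
print(slice_.shape)           # → (20, 120, 3)
```

In such a view, geo-location errors appear as jagged vertical discontinuities along the time axis, which is why this visualization highlights the effect of co-registration so well.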

Figure 10. Temporal view of the unregistered (top) and co-registered (bottom) stack of images at a given northing coordinate.

As you can see, the positive effect of co-registration can be easily appreciated.

The project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under the Grant Agreement 101004112, Global Earth Monitor project.
