Multi-temporal Super-Resolution on Sentinel-2 Imagery

Deep dive into enhancing the spatial resolution using deep learning, including tips, tricks and technical details.

Intro

Data preprocessing

Download

  • Sentinel-2 bands (R, G, B, NIR) at 10 m resolution for the period between May and October.
  • Deimos-2 bands (R, G, B, NIR) between July and October (the period of availability), pansharpened and resampled to 2.5 m spatial resolution. A sketch of the Sentinel-2 download follows below.
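
For the Sentinel-2 part, the download can be done with the sentinelhub Python package. The sketch below requests a single least-cloudy mosaic of the four bands for an illustrative bounding box and time range (the actual area of interest and dates used in the post may differ); building the full multi-temporal stack of acquisitions is more conveniently done with eo-learn.

```python
from sentinelhub import (
    BBox, CRS, DataCollection, MimeType, SHConfig,
    SentinelHubRequest, bbox_to_dimensions,
)

config = SHConfig()  # assumes Sentinel Hub credentials are already configured

# Hypothetical area of interest (WGS84 lon/lat) at 10 m resolution
aoi = BBox([14.40, 46.00, 14.55, 46.10], crs=CRS.WGS84)
size = bbox_to_dimensions(aoi, resolution=10)

# Return the four bands used in the post, ordered R, G, B, NIR
evalscript = """
//VERSION=3
function setup() {
  return {input: ["B02", "B03", "B04", "B08"],
          output: {bands: 4, sampleType: "FLOAT32"}};
}
function evaluatePixel(s) { return [s.B04, s.B03, s.B02, s.B08]; }
"""

request = SentinelHubRequest(
    evalscript=evalscript,
    input_data=[SentinelHubRequest.input_data(
        data_collection=DataCollection.SENTINEL2_L1C,
        time_interval=("2019-05-01", "2019-10-31"),  # illustrative dates
        mosaicking_order="leastCC",                  # least cloud cover first
    )],
    responses=[SentinelHubRequest.output_response("default", MimeType.TIFF)],
    bbox=aoi,
    size=size,
    config=config,
)

bands = request.get_data()[0]  # numpy array of shape (height, width, 4)
```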

Normalization

Mosaic of different Deimos tiles showing issues with clouds and different tile normalization.

Blow the clouds away

Cloud filtering on Deimos imagery is done by simple thresholding of the blue band.
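
A minimal sketch of such a filter, assuming reflectance-like values in [0, 1]; the threshold value and band index below are illustrative, not the exact settings used in the post.

```python
import numpy as np

def cloud_mask_from_blue(blue_band: np.ndarray, threshold: float = 0.25) -> np.ndarray:
    """Return a boolean cloud mask: True where blue reflectance exceeds the threshold.

    The threshold is illustrative; the appropriate value depends on how the
    Deimos-2 digital numbers were normalized.
    """
    return blue_band > threshold

# The fraction of cloudy pixels can then be used to discard a frame entirely,
# e.g. (assuming the blue band is at index 2 of the Deimos array):
# cloudy_fraction = cloud_mask_from_blue(deimos[..., 2]).mean()
```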

Training set curation

  • cloud shadows in Deimos imagery
  • imprecise orthorectification of Deimos images, which can cause some parts of the image to match the corresponding Sentinel-2 image well and other parts poorly
  • cloud shadows in Sentinel-2 imagery

Cleaning remaining noise

By plotting SSIM and PSNR values, we can identify problematic samples.
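
The sketch below, using scikit-image, shows one way such scores can be computed, assuming each training sample pairs a bicubically upsampled Sentinel-2 frame with its Deimos-2 target (both normalized to [0, 1]); samples in the low tail of the resulting scatter plot are candidates for removal.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity
from skimage.transform import resize

def sample_scores(s2_frame: np.ndarray, deimos_patch: np.ndarray) -> tuple[float, float]:
    """Score how well a Sentinel-2 frame matches its Deimos-2 target.

    Both inputs are assumed to be single-band float arrays in [0, 1]; the
    Sentinel-2 frame is bicubically upsampled to the Deimos patch size first.
    """
    s2_up = resize(s2_frame, deimos_patch.shape, order=3)  # order=3 -> bicubic
    ssim = structural_similarity(deimos_patch, s2_up, data_range=1.0)
    psnr = peak_signal_noise_ratio(deimos_patch, s2_up, data_range=1.0)
    return ssim, psnr
```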

Model, finally

HighRes-Net architecture (source)
  • low resolution and high resolution images were acquired using different sensors, i.e. Sentinel-2 and Deimos-2, therefore the underlying acquisition parameters are different, e.g. spectral bands, point-spread functions;
  • the underlying objects being imaged, i.e. land cover, change between consecutive low-resolution frames, as well as between the low-resolution and high-resolution frames. This is due to the revisit frequency of the Sentinel-2 images, cloud presence and the overlap with Deimos-2 acquisition times.
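
To make the fusion idea concrete, here is a heavily simplified PyTorch sketch of a HighRes-Net-style model: each low-resolution frame is encoded together with a shared reference frame, the encodings are fused recursively in pairs, and the result is decoded with a pixel-shuffle upscaling layer. The layer sizes and the choice of the median as reference are illustrative, not the exact configuration used in the post.

```python
import torch
import torch.nn as nn

class PairwiseFusion(nn.Module):
    """Fuse two latent frame encodings into one (the recursive step)."""
    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1), nn.PReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.PReLU(),
        )

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([a, b], dim=1))

class TinyHighResNet(nn.Module):
    """Simplified HighRes-Net-style model for multi-temporal super-resolution."""
    def __init__(self, in_bands: int = 4, channels: int = 64, scale: int = 4):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(2 * in_bands, channels, kernel_size=3, padding=1), nn.PReLU(),
        )
        self.fusion = PairwiseFusion(channels)
        self.decode = nn.Sequential(
            nn.Conv2d(channels, in_bands * scale ** 2, kernel_size=3, padding=1),
            nn.PixelShuffle(scale),  # sub-pixel (pixel-shuffle) upscaling
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, bands, height, width); time assumed a power of two
        reference = frames.median(dim=1).values            # shared reference frame
        latents = [self.encode(torch.cat([f, reference], dim=1))
                   for f in frames.unbind(dim=1)]
        while len(latents) > 1:                            # recursive pairwise fusion
            latents = [self.fusion(latents[i], latents[i + 1])
                       for i in range(0, len(latents), 2)]
        return self.decode(latents[0])

# sr = TinyHighResNet()(torch.rand(1, 8, 4, 32, 32))  # -> (1, 4, 128, 128)
```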

Training

  • architecture optimisation, e.g. number of Sentinel-2 images, number of encoding layers, number of convolutional filters, pixel-shuffle or deconvolution upscaling layer;
  • band distribution of super-resolved images. As mentioned above, the input and target come from different imaging sensors, i.e. Sentinel-2 and Deimos-2. The output of the model would normally match the distribution of the target image, i.e. Deimos-2, but what we actually want is an image that matches Sentinel-2, so that it can be used directly in existing downstream applications;
  • loss function. This choice is paramount for the success of the SR task, but we soon realised how unsuitable common pixel-wise losses (e.g. MAE, MSE, SSIM) are for quantifying the true quality of the super-resolved image. Furthermore, these losses can be completely unrelated to the downstream application that will make use of the super-resolved images.

Histogram matching

Green band distribution at the output of the model. The super-resolved distribution seeks to match Sentinel-2 as opposed to Deimos-2.
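
One way to obtain this behaviour, sketched below with scikit-image, is to histogram-match the Deimos-2 target to the corresponding Sentinel-2 reference frame band by band before it is used as the training label; whether this exactly mirrors the procedure used in the post is an assumption.

```python
import numpy as np
from skimage.exposure import match_histograms

def s2_matched_target(deimos_patch: np.ndarray, s2_reference: np.ndarray) -> np.ndarray:
    """Histogram-match a Deimos-2 target patch to a Sentinel-2 reference, band by band,
    so the network learns to produce Sentinel-2-like band distributions.

    Both arrays are assumed to be (height, width, bands); the reference does not
    need the same spatial size as the target, only the same number of bands.
    """
    # channel_axis=-1 treats the last axis as the band dimension (scikit-image >= 0.19)
    return match_histograms(deimos_patch, s2_reference, channel_axis=-1)
```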

Perceptual loss

Example internal layer activations of the field-delineation model used for perceptual loss.
Comparison between Sentinel-2 and super-resolved imagery.
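
In code, the feature and style terms can be computed from intermediate activations of the frozen downstream network, in the spirit of the sketch below; the specific layers and weighting used with the field-delineation model are not spelled out in the post and are left generic here.

```python
import torch
import torch.nn.functional as F

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    """Channel-by-channel correlation of activations, used for the style term."""
    b, c, h, w = features.shape
    flat = features.view(b, c, h * w)
    return flat @ flat.transpose(1, 2) / (c * h * w)

def perceptual_losses(activations_sr, activations_hr):
    """Feature and style losses between activations of the super-resolved and target
    images. The activations are assumed to be lists of feature maps extracted from a
    frozen downstream network (the field-delineation model plays this role in the post)."""
    feature_loss = sum(F.l1_loss(a_sr, a_hr)
                       for a_sr, a_hr in zip(activations_sr, activations_hr))
    style_loss = sum(F.l1_loss(gram_matrix(a_sr), gram_matrix(a_hr))
                     for a_sr, a_hr in zip(activations_sr, activations_hr))
    return feature_loss, style_loss
```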

Quantitative results

  • Our model is trained to match Sentinel-2 imagery, so its bands have different distributions than Deimos-2, which penalizes the validation scores.
  • The effect of temporal changes over the observed area and the differences in acquisition times. The closest Sentinel-2 acquisition is usually a few days away from the Deimos acquisition, and during that time vegetation can change (natural growth, agricultural activity, etc.). The model matches the features of the latest Sentinel-2 image; if these do not represent the state of the field visible in the Deimos imagery, the computed scores do not reflect the information we want them to.
Comparison of super-resolution method (HRN) with the baseline (bicubic upsampling). Higher score is better.
Comparison of super-resolution method (SR HRN) with the baseline bicubic upsampling (S2 BIC) using two perceptual losses (feature and style). Lower score is better.
  • the sharpness increase can be particularly noticed for high contrast features like roads, but less so for low-contrast features like agricultural land;
  • the model is able to learn texture, e.g. of trees in a forest. Unfortunately, it also learns to predict shadows due to low sun elevation angles, which are more prominent in Deimos-2 since those images were acquired on average 2 hours before the Sentinel-2 images. This can be noticed when looking at taller structures like buildings or trees. This is not a desirable feature, but it is tied to the differences in acquisition parameters between Sentinel-2 and Deimos-2. Ideally, for this task, one would want to minimize such differences;
  • the model takes as input a temporal sequence of cloudless Sentinel-2 images (i.e. 8 frames) that can span over multiple weeks. However, the model predicts super-resolved images that contain features from the latest Sentinel-2 image. This means that we can predict a super-resolved image for each cloudless Sentinel-2 image in a rolling-window fashion.
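
A minimal sketch of this rolling-window inference, assuming a trained model and a chronologically ordered stack of cloudless Sentinel-2 frames:

```python
import torch

def rolling_super_resolution(frames: torch.Tensor, model: torch.nn.Module,
                             window: int = 8) -> list[torch.Tensor]:
    """Produce one super-resolved image per cloudless Sentinel-2 acquisition.

    `frames` is assumed to be a (time, bands, height, width) tensor of cloudless
    Sentinel-2 frames in chronological order; each prediction uses the `window`
    most recent frames, so its features follow the latest acquisition in the window.
    """
    outputs = []
    with torch.no_grad():
        for t in range(window, frames.shape[0] + 1):
            stack = frames[t - window:t].unsqueeze(0)  # (1, window, bands, h, w)
            outputs.append(model(stack).squeeze(0))
    return outputs
```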

Impact on downstream applications

Field delineation Matthews Correlation Coefficient for extent, boundary and distance, comparing the model trained on bicubically interpolated images (S2 BIC) with the one trained on super-resolved imagery (S2 HRN).
Geometric metrics computed on the post-processed vectorized predictions for field delineation trained on bicubically up-sampled imagery (S2 Bicubic 4x) and on super-resolved imagery (SR HRN).

Do it yourself
