Sentinel-1 Technical Series Part 1 | Burst Mapping: Random Seeks into PB Scale Non Cloud-Optimized Data Archives

Published in descarteslabs-meditations, Aug 25, 2022

Welcome to Part 1 of a technical series focusing on Descartes Labs’ global SAR processing capabilities. SAR provides a valuable remote sensing tool, and this series will dive into detail about how we process SAR data globally and build SAR and InSAR-derived products. An overview can be found here.

Introduction

SAR datasets are notorious for their relatively complicated data distribution formats and projection systems, and for needing custom processing tools to transform them into analysis-ready products. Until a few years ago, SAR data processing was a specialty of niche academic and engineering groups, largely because this type of data had limited availability and came with restrictive use licenses. The Sentinel-1 SAR constellation, part of Europe’s Copernicus Earth Observation programme, has been a game changer in this regard. Data acquired by this constellation has been freely available to end users since Sep 2014, enabling users of local and global scale applications to reliably access and analyze SAR imagery over their areas of interest (AOIs).

At Descartes Labs we often build and deploy models to support applications like agriculture, deforestation and change detection that use data sourced from multiple types of sensors, including SAR. While Sentinel-1 data is open and freely accessible, there are still challenges in efficiently operationalizing the use of this dataset.

Successfully addressing those challenges means being able to access Sentinel-1 SAR datasets efficiently, repeatedly and with a granularity similar to that of Cloud Optimized GeoTIFFs (COGs) for optical imagery.

In this post, we will walk through the process of building our fast data access mechanism for Sentinel-1.

A collection of bursts

Sentinel-1 images most land masses in the Interferometric Wide Swath Terrain Observation with Progressive Scans (IW-TOPS) imaging mode. While the mission does use other imaging modes over land, those other modes represent a tiny fraction of total data volume acquired, so we will focus on IW-TOPS mode. The concepts presented here in the context of IW-TOPS mode are also relevant to the less frequently used Extended Wide Swath (EW-TOPS) imaging mode imagery.

Sentinel-1 imagery is released in two flavors by the European Space Agency (ESA):

Single Look Complex (SLC) granule

This represents imagery at the highest ground posting of ~14m along track x ~5m cross track.
A single IW-TOPS granule is a collection of images, also known as bursts in radar imaging terminology. The images themselves are on a uniform along track time by slant range grid.
Bursts are grouped by imaging swath and one TIFF file per polarization and swath is included in the granule. The image arrays are concatenated back-to-back in these TIFF files.
Images are distributed as complex numbers (I and Q bands) and can be used for both radar backscatter and interferometric applications. Imagery corresponding to each recorded polarization is distributed in a separate TIFF file.
A typical granule covers an area of 200km along track x 240km cross track, whereas a single burst covers an area of about 21km along track by 80km cross track.
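
To make the back-to-back layout concrete, here is a minimal sketch of the arithmetic for locating one burst inside a swath image. It assumes that every burst in the file has the same number of lines and samples, and that each pixel is stored as a complex pair of 16-bit integers (4 bytes per pixel). The function names and the example dimensions are illustrative and are not taken from a specific granule.

```python
BYTES_PER_PIXEL = 4  # assumption: 16-bit I + 16-bit Q per complex sample


def burst_rows(burst_index: int, lines_per_burst: int) -> range:
    """Rows of the swath image occupied by the burst at `burst_index` (0-based)."""
    first = burst_index * lines_per_burst
    return range(first, first + lines_per_burst)


def burst_nbytes(lines_per_burst: int, samples_per_burst: int) -> int:
    """Uncompressed size of one burst's pixel array in bytes."""
    return lines_per_burst * samples_per_burst * BYTES_PER_PIXEL


# Illustrative dimensions only: the fourth burst of a swath.
rows = burst_rows(burst_index=3, lines_per_burst=1500)
size_mb = burst_nbytes(lines_per_burst=1500, samples_per_burst=22000) / 1e6
```

With these example numbers a single burst is on the order of a hundred megabytes uncompressed, which is why reading one burst rather than a whole granule matters.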

Ground Range Detected (GRD) granule

This represents a simplified radar backscatter image at a posting of 10m along track x 10m cross track.
A GRD granule is generated by mosaicking individual bursts, incoherently aggregating the data in the slant-range direction (multi-looking, in radar terminology) and resampling onto a uniform along track time by ground range grid.
Images are distributed as single band TIFF files. A GRD granule contains one TIFF file per recorded polarization.
GRD products cannot be used for interferometric applications, as phase information is discarded during incoherent aggregation.
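
The loss of phase during incoherent aggregation can be illustrated with a small sketch. It averages the power |z|² of neighbouring complex samples along one line, which is a simplified stand-in for multi-looking and not ESA's actual GRD processing chain. The function name and sample values are made up for illustration.

```python
import cmath


def multilook_power(samples, looks):
    """Average |z|^2 over consecutive groups of `looks` complex samples."""
    n_out = len(samples) // looks
    return [
        sum(abs(z) ** 2 for z in samples[i * looks:(i + 1) * looks]) / looks
        for i in range(n_out)
    ]


slc_line = [complex(3, 4), complex(0, 5), complex(1, 0), complex(0, 1)]
# Same amplitudes, but every sample rotated by 0.7 rad of phase.
rotated_line = [z * cmath.exp(1j * 0.7) for z in slc_line]

power = multilook_power(slc_line, looks=2)
power_rotated = multilook_power(rotated_line, looks=2)
```

Both inputs produce the same averaged power up to floating-point rounding, so the phase difference between them cannot be recovered from the detected product.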

*Layout of Sentinel-1 SLC (left) and GRD (right) granules. A single SLC granule is a collection of individual images called bursts, which are mosaicked into a single contiguous GRD product. Images courtesy ESA.*

At Descartes Labs, we use interferometric as well as amplitude-based change detection methodologies with SAR data, and so prefer SLC data over GRD data. Moreover, many of the caveats associated with GRD data, such as single-pixel gaps and edge effects, are avoided by starting from SLC data. The rest of the discussion will focus on SLC data.

Building a global burst map

One of the important features of Sentinel-1 SLC data is that the footprint corresponding to an SLC granule is not fixed over time. However, the footprints of the contained bursts are stationary, thanks to the burst synchronization feature of the mission. These burst footprints repeat almost exactly and can be used as a basis to organize temporal stacks of Sentinel-1 imagery, similar to Landsat’s Path-Row or Sentinel-2’s tiling scheme.

*Orange and purple polygons represent individual burst boundaries in SLCs acquired from the same relative orbit on different passes. Burst boundaries line up very well despite different sets of bursts being packaged on different passes.*

Descartes Labs built the first version of its global burst map using the radar metadata in annotation files from all Sentinel-1 SLC imagery from Sep 2014 to July 2020. Following this analysis, we adopted a consistent naming scheme, which includes relative orbit number, swath and time since ascending node, for labeling the footprints as well as individual burst SLCs (see Reference below). Our burst databases are dynamic products that are updated automatically with new footprints and burst SLCs as imagery from ESA is ingested by the system. Descartes Labs uses these Sentinel-1 burst databases to orchestrate its global scale radar backscatter and InSAR analytics pipelines efficiently and cost-effectively. ESA released an official map of Sentinel-1 burst footprints in June 2022. Since then, we have mapped our footprints to these IDs, allowing us to work with our databases using our naming convention as well as ESA’s.
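
As a rough illustration, the three ingredients of such a naming scheme could be combined as follows. The relative orbit formula uses the commonly cited per-platform offsets for Sentinel-1A and Sentinel-1B with the 175-orbit repeat cycle. The label format and the rounding of time since ascending node to tenths of a second are hypothetical choices made for this sketch, and are not the convention from the referenced paper or ESA's burst IDs.

```python
ORBITS_PER_CYCLE = 175  # Sentinel-1 repeat cycle, in orbits
ORBIT_OFFSET = {"S1A": 73, "S1B": 27}  # commonly cited per-platform offsets


def relative_orbit(platform: str, absolute_orbit: int) -> int:
    """Relative orbit number (1..175) from the absolute orbit number."""
    return (absolute_orbit - ORBIT_OFFSET[platform]) % ORBITS_PER_CYCLE + 1


def burst_label(platform: str, absolute_orbit: int, swath: str,
                seconds_since_anx: float) -> str:
    """Hypothetical label: relative orbit, swath, binned time since ascending node."""
    rel = relative_orbit(platform, absolute_orbit)
    anx_bin = round(seconds_since_anx * 10)  # assumption: 0.1 s bins
    return f"{rel:03d}_{swath.lower()}_{anx_bin:05d}"


label = burst_label("S1A", 30639, "IW2", 1234.56)
```

Because burst footprints repeat almost exactly rather than exactly, a production scheme would need a tolerance when matching times near a bin edge; plain rounding is used here only to keep the sketch short.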

*Sentinel-1 burst footprints over Brazil. A single footprint roughly spans an area of 20km x 80km.*

Rapid access to a single burst

Labeling all the global Sentinel-1 data lets us organize our workflows, but doesn’t speed up extraction of bursts from within TIFF files contained in SAFE zip archives. To address data extraction costs, we adopted techniques used in the neuroimaging community to build tools to enable random access within large zip archives.
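
For context, the first step of any such scheme is finding where a member's compressed bytes sit inside the archive. The sketch below does this with Python's standard `zipfile` module on a toy in-memory archive. The member names and contents are made up, and this is not a description of the Descartes Labs tooling.

```python
import io
import struct
import zipfile
import zlib


def member_data_span(archive: bytes, name: str) -> tuple[int, int]:
    """Return (offset, length) of a member's compressed bytes inside the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        info = zf.getinfo(name)
    # The local file header is 30 fixed bytes followed by the file name and an
    # extra field; their lengths are stored at bytes 26 and 28 of the header.
    name_len, extra_len = struct.unpack_from("<HH", archive, info.header_offset + 26)
    start = info.header_offset + 30 + name_len + extra_len
    return start, info.compress_size


# Toy archive standing in for a SAFE zip (member names are hypothetical).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("annotation/iw1-vv.xml", b"<product/>" * 100)
    zf.writestr("measurement/iw1-vv.tiff", bytes(range(256)) * 200)
archive = buffer.getvalue()

start, length = member_data_span(archive, "measurement/iw1-vv.tiff")
# Zip members compressed with deflate are raw deflate streams (wbits=-15).
tiff_bytes = zlib.decompress(archive[start:start + length], wbits=-15)
```

Knowing the span of one member avoids touching the other files in the archive, but the member itself is still a single deflate stream that normally has to be decoded from its beginning.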

Note that this approach requires scanning the entire zip file once. During that scan, we store the state of the zip decoder at the locations corresponding to individual bursts, along with the radar metadata, as part of our rapid access mechanism. With this setup, we can access the radar metadata and associated imagery for any burst and any polarization while reading only the relevant portion of the zip file. These lookup mechanisms can be used with any store of Sentinel-1 SLCs, as long as the original zip files from ESA have not been modified. We are able to pull any burst (~80 MB compressed) from SLCs in Descartes Labs’ storage buckets in 2–3 seconds and from NASA’s Alaska Satellite Facility DAAC in 6–7 seconds. This rapid data access mechanism lets us process backscatter and InSAR products on the live stream of SLCs from ESA very efficiently. It also lets us test new analytics methods, especially interferometric and polarimetric methods, at scale over large regions or time spans.
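
The idea of saving decoder state during a single scan can be sketched with Python's `zlib` module. This is a simplified, in-memory illustration under stated assumptions, not the production mechanism: it snapshots the decompressor with `copy()` at regular intervals of compressed input, whereas the approach described above stores state at burst boundaries and would need a form that can be persisted. The names `build_index` and `read_range` are hypothetical.

```python
import random
import zlib
from bisect import bisect_right

CHUNK = 4096  # compressed bytes fed to the decoder per step


def build_index(raw_deflate: bytes, every: int = 16):
    """Scan the stream once, snapshotting decoder state every `every` chunks."""
    decoder = zlib.decompressobj(wbits=-15)  # raw deflate, as in zip members
    checkpoints = []  # (uncompressed_offset, compressed_offset, snapshot)
    out_pos = 0
    for n, in_pos in enumerate(range(0, len(raw_deflate), CHUNK)):
        if n % every == 0:
            checkpoints.append((out_pos, in_pos, decoder.copy()))
        out_pos += len(decoder.decompress(raw_deflate[in_pos:in_pos + CHUNK]))
    return checkpoints


def read_range(raw_deflate: bytes, checkpoints, start: int, length: int) -> bytes:
    """Decode uncompressed bytes [start, start + length) from the nearest checkpoint."""
    offsets = [c[0] for c in checkpoints]
    out_pos, in_pos, snapshot = checkpoints[bisect_right(offsets, start) - 1]
    decoder = snapshot.copy()  # keep the stored snapshot reusable
    end = start + length
    pieces = []
    while out_pos < end and in_pos < len(raw_deflate):
        block = decoder.decompress(raw_deflate[in_pos:in_pos + CHUNK])
        in_pos += CHUNK
        lo = max(start - out_pos, 0)
        hi = min(end - out_pos, len(block))
        if hi > lo:
            pieces.append(block[lo:hi])
        out_pos += len(block)
    return b"".join(pieces)


# Toy payload standing in for a compressed TIFF inside a zip archive.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(200_000))
compressor = zlib.compressobj(wbits=-15)
raw = compressor.compress(payload) + compressor.flush()

index = build_index(raw, every=4)
burst = read_range(raw, index, start=150_000, length=1_000)
```

Only the compressed bytes between the chosen checkpoint and the end of the requested range are read, which is what makes partial reads from remote storage practical once the index exists.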

Rapid SLC data access has been identified as a bottleneck by various SAR user groups, including the Copernicus Land Monitoring Service (see Appendix F), and Descartes Labs has built a scalable solution to address it. Some common use cases that our access mechanism already addresses are:

Access to co-pol imagery without having to transfer associated cross-pol imagery as well. This cuts down data transfer by half for interferometric applications.
Ability to pre-determine burst footprints that don’t contain land and completely avoid transferring that data for certain applications, further cutting down data transfer by 30%.
For wide area analysis, this eliminates expensive and time-consuming data replication into shared file systems or large scratch disks, and enables access to SLC data as part of the processing workflow without delay.
Processing tasks can be executed on smaller, cheaper nodes that do not require reading an entire SLC SAFE file. This enables finer-grained workers and increased parallelism.

Conclusion

In this blog post, we described the fast data access mechanism built and used by Descartes Labs for handling Sentinel-1 SAR data. This fast data access mechanism makes it easier to repeatedly access and work with SAR imagery, ultimately allowing more efficient analyses and model deployment. In the next post, we will dive deeper into geocoding of Sentinel-1 bursts, which is a common preprocessing step powering all our backscatter and interferometric pipelines.

Interested in using this technology?

Contact our team to discuss how we can integrate our SAR fast data access mechanisms into your processes.

Reference:
Agram PS, Warren MS, Calef MT, Arko SA. An Efficient Global Scale Sentinel-1 Radar Backscatter and Interferometric Processing System. Remote Sensing. 2022; 14(15):3524. https://doi.org/10.3390/rs14153524