sat-api: an API for SpatioTemporal Asset Catalogs
Development Seed’s sat-api is an open-source library for creating a RESTful API to search catalogs of geospatial data. Historically, imagery providers have each used their own names for the fields that describe their data (metadata) and their own APIs for accessing it. This has made software reuse difficult and analysts’ jobs harder, as each data source needs to be dealt with individually.
We created the first sat-api to facilitate access to Landsat-8 and Sentinel-2 data in a consistent way, allowing reuse of the software used to search or access it. However, without a standard to follow, we made decisions that worked for Landsat-8 and Sentinel-2, but not necessarily for other types of geospatial data.
Then, last October, after SoTM 2017 in Boulder, the first SpatioTemporal Asset Catalog (STAC) working group convened. The group has grown since then, and the latest major release of the spec is v0.5. STAC is an initiative to define:
- a set of standard metadata fields describing geospatial data
- a flat file catalog structure used to describe a group of items (e.g., images) in a catalog
- a RESTful API for querying geospatial data
Prior to STAC, users searching for geospatial data had to use a separate approach for every provider and reconcile different formats. With STAC, we hope to get data providers to make STAC flat catalogs or APIs available for their data, allowing the reuse of client tools and a better user experience. Some of the major providers of satellite imagery, such as Planet and DigitalGlobe, have already shown tremendous interest in the effort. Development Seed is also working on providing STAC flat catalogs for Earth on AWS datasets.
Sat-api now STAC flavored
The STAC initiative has come at a good time: we were already working on a new version of sat-api, and it was logical to make sat-api a reference implementation of the spec. We’ve been refactoring sat-api and are happy to announce this new version, one that is compliant with the STAC spec (plus the EO and Collections extensions).
Sat-api is a Node-based library that is deployable on AWS. We also maintain a deployed instance of it located at sat-api.developmentseed.org that contains the entire catalog of Landsat-8 and Sentinel-2 scenes on AWS. Going forward, we will keep sat-api up to date with the latest changes in STAC to make it as compliant as possible. This means there could be breaking changes, so we encourage users to deploy their own version of sat-api if they desire control over updates or need reliability. Users can also deploy their own versions to provide a STAC API to their own new or derived datasets.
The main search endpoint for a STAC API is /search/stac, and it returns a GeoJSON FeatureCollection of STAC `items`. Querying the sat-api search endpoint without any parameters will match all items in the catalog and return the first result. The assets field is collapsed and will be discussed below.
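As a rough sketch of how a client might work with this endpoint, the Python below builds a search URL and pulls the item ids out of a GeoJSON FeatureCollection. Only the host and the /search/stac path come from the text above. The https scheme, the `limit` parameter, the helper names, and the trimmed sample response are illustrative assumptions, not taken from the sat-api documentation.

```python
from urllib.parse import urlencode

# Host and path come from the post; the https scheme is an assumption.
BASE_URL = "https://sat-api.developmentseed.org"
SEARCH_PATH = "/search/stac"


def build_search_url(params=None):
    """Return the search URL, appending query parameters if any are given."""
    url = BASE_URL + SEARCH_PATH
    if params:
        # Sort the parameters so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


def item_ids(feature_collection):
    """Extract the id of every STAC item in a GeoJSON FeatureCollection."""
    if feature_collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    return [feature["id"] for feature in feature_collection.get("features", [])]


# Hypothetical, heavily trimmed response used only to exercise the helper.
sample_response = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "example-item-1", "properties": {}, "geometry": None},
    ],
}

print(build_search_url())
print(build_search_url({"limit": 1}))  # "limit" is an assumed parameter name
print(item_ids(sample_response))
```

In a real client, the URL would be fetched with an HTTP library and the decoded JSON body passed to `item_ids`; the network call is left out here so the sketch runs offline.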
A STAC catalog can become very large for most satellite imagery datasets, because the same metadata fields are often duplicated across the millions of scenes that are available. To prevent this duplication, sat-api also implements a STAC extension called Collections. A STAC `collection` can contain any core or extension metadata that might appear in a STAC `item`. The difference is that a `collection` also has a collection field as its name, and a STAC `item` can reference the collection through the collection property and link. The metadata in a STAC `collection` applies to all `items` that belong to that `collection`.
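To illustrate the idea, here is a minimal sketch of how a client could combine an `item` with the metadata of its `collection`. The layout is an assumption made for the example: collections are held in a dict keyed by name, both objects keep their metadata under `properties`, and the reference is stored under a `collection` key. The field values are placeholders, and the rule that item-level values win on conflict is a design choice of this sketch, not something the spec is quoted as requiring.

```python
def resolve_item_properties(item, collections):
    """Return the item's properties merged with those of its collection.

    Item-level values take precedence over collection-level values when
    both define the same key (an assumption of this sketch).
    """
    collection_name = item["properties"].get("collection")
    inherited = collections.get(collection_name, {}).get("properties", {})
    return {**inherited, **item["properties"]}


# Hypothetical collection: metadata shared by every item that belongs to it.
collections = {
    "landsat-8": {
        "properties": {
            "collection": "landsat-8",
            "platform": "example-platform",
        }
    }
}

# Hypothetical item: stores only what is specific to this scene.
item = {
    "id": "example-item-1",
    "properties": {
        "collection": "landsat-8",
        "datetime": "2018-01-01T00:00:00Z",
    },
}

merged = resolve_item_properties(item, collections)
print(merged)
```

The shared fields are stored once on the collection rather than on every scene, which is where the size savings come from.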
Here is the landsat-8 collection that the item above is a member of.
The assets field contains references to the actual data, in this case HTTP URLs. Each asset has a key, which is a short unique identifier for the asset. “B1” here refers to the GeoTIFF containing band 1, and “ANG” is a metadata file containing detailed ephemeris data. Some of the assets also have an eo:bands field. This is a list of references into the `eo:bands` field (shown above in the landsat-8 collection), identifying the bands contained in the file. This is how we can determine that B4 is the red band, B3 is green, and so on, without referring to outside documentation. In this example, each data file contains only one band, but other data sources may contain multiple bands within a single file.
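The lookup described above can be sketched as follows. This assumes the collection’s `eo:bands` field is a dict keyed by band identifier, with a `common_name` on each entry, and that each asset lists those identifiers; other versions of the spec may structure this differently. The URLs are placeholders, and only the B3/green and B4/red pairing comes from the text.

```python
def common_names_by_asset(item, collection_bands):
    """Map each asset key to the common names of the bands it contains.

    Assets without an eo:bands field (e.g. metadata files) are skipped.
    """
    names = {}
    for key, asset in item["assets"].items():
        band_refs = asset.get("eo:bands", [])
        if band_refs:
            names[key] = [collection_bands[ref]["common_name"] for ref in band_refs]
    return names


# Assumed shape of the collection-level eo:bands field.
collection_bands = {
    "B3": {"common_name": "green"},
    "B4": {"common_name": "red"},
}

# Hypothetical item with placeholder URLs.
item = {
    "id": "example-item-1",
    "assets": {
        "ANG": {"href": "https://example.com/scene_ANG.txt"},
        "B3": {"href": "https://example.com/scene_B3.TIF", "eo:bands": ["B3"]},
        "B4": {"href": "https://example.com/scene_B4.TIF", "eo:bands": ["B4"]},
    },
}

print(common_names_by_asset(item, collection_bands))
```

Because each asset carries a list of references, the same function would also handle a file that bundles several bands together.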
What’s next?
We are currently improving the documentation and deployment process of sat-api to make it easier to use and more configurable, so that users can easily create a sat-api for their own data sources. We are also working on some client tools like sat-search, which will be the topic of my next post.
If you have an interest in contributing to sat-api, please stop by the repo. We are actively looking for people to contribute, whether through testing, writing documentation or tutorials, or contributing to the code base.