Sat-utils: Find and use open satellite data

By: Matt Hanson

Open satellite data from NASA and ESA is some of the most valuable and extensive open data that exists. Our partners in development can get great value from this data, but they often find it hard to work with and to draw insights from. To help close this gap between potential and application, we’ve created a range of open tools such as sat-api, Label Maker, Libra, Libra Live, and landsat-util. Sat-utils is an important part of this effort, providing many of the underlying functions for searching, deploying geospatial APIs, and creating catalogs. The sat-utils are designed as modular building blocks that decision-makers can plug into their own satellite pipelines, and they are part of the technology behind some of our work mapping in Ethiopia and monitoring Special Economic Zones.

In keeping with that effort, we recently released new versions of sat-utils. A big feature of the new versions is that they use the SpatioTemporal Asset Catalog (STAC) specification. Development Seed has been a core contributor to STAC because we believe the community-driven process will lead to greater adoption by data providers and will facilitate the development of flexible tools that can be used across multiple data sources. Chris Holmes has written several posts covering the development of STAC, which are a great place to start learning about it.

This post outlines the development of sat-utils, which is a collection of tools for easier discovery and use of geospatial imagery, starting with a focus on Landsat-8 and Sentinel-2 data.


Last year we released a version of sat-api that implemented a basic STAC (v0.5) API. However, that version didn’t include all the features in STAC and wasn’t compliant in some ways, and the STAC spec has since changed (v0.6). Over the last few months, we’ve pushed hard not just to make sat-api a more fully compliant STAC API, but also to release a suite of other STAC-related utilities. For many end users, the most useful utility presented here is sat-search, a client that can be used with STAC APIs such as Development Seed’s sat-api endpoint.

Obligatory satellite image: June 30, 2018: Parco Regionale Veneto del Delta del Po, Italy (Landsat-8)

sat-api

Originally designed to stand up an API of Landsat-8 data, sat-api now serves a few purposes:

STAC API Reference Implementation — The sat-api project serves as a reference implementation of the STAC API specification. There are some differences (e.g., paging), which we aim to minimize by continuing to add features, or by proposing changes to the STAC specification.

Node Library for a STAC API — The sat-api repository contains NodeJS libraries for the API, an Elasticsearch backend, and ingestion of STAC static catalogs to the backend.

Deployment project — The sat-api-deployment repository is used for deployment to the live API, including automatic building and deployment via CircleCI. It can also be used as a project template to deploy an instance of sat-api on your own AWS account. The README includes detailed instructions on deploying your own instance as well as ingesting data into it. Since the live API can be updated with new versions at any time, you can use this project to deploy your own instance if you need a stable API for production.

Experimentation Platform — As the STAC specification evolves we have found it useful to implement new and proposed features. In fact, STAC development is mainly driven by implementations, rather than endless bikeshedding.

Live API of public datasets — Finally, it serves as a live API deployed at sat-api.developmentseed.org that currently includes the Landsat and Sentinel data in the Earth on AWS program.

sat-stac

Sat-api is the more visible project, but it is written in NodeJS, whereas most of the other sat-utils are written in Python. The sat-stac library serves as the fundamental building block for STAC in Python because it defines the classes (i.e., models) that represent the different STAC entities: Catalogs, Collections, and Items.

A simple STAC static catalog

A STAC static catalog is a series of linked JSON files. Sat-stac facilitates the use of the links between files and can be used to read and write STAC catalogs, collections, and items, as well as download assets. It can be used to open a Catalog, Collection, or Item from any source (local file, static catalog URL, API endpoint) and traverse the catalog from that point. A basic command line tool provides a way to create catalogs and add other catalogs and collections to them.
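To make the idea of "linked JSON files" concrete, here is a minimal sketch that uses only the Python standard library rather than sat-stac itself. It writes a tiny, trimmed-down catalog to a temporary directory and then walks it by following `child` and `item` links. The IDs, file layout, and helper names (`build_demo_catalog`, `walk_items`) are made up for illustration, and the Items omit fields such as geometry and assets that real STAC Items carry.

```python
import json
import tempfile
from pathlib import Path


def write_json(path, data):
    """Write a dict to disk as JSON, creating parent directories as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2))


def build_demo_catalog(root):
    """Create a tiny catalog -> collection -> item hierarchy (all IDs are made up)."""
    write_json(root / "catalog.json", {
        "id": "demo-catalog",
        "description": "Hypothetical root catalog",
        "links": [{"rel": "child", "href": "landsat-8-l1/catalog.json"}],
    })
    write_json(root / "landsat-8-l1" / "catalog.json", {
        "id": "landsat-8-l1",
        "description": "Hypothetical collection",
        "links": [
            {"rel": "parent", "href": "../catalog.json"},
            {"rel": "item", "href": "scene-a/scene-a.json"},
            {"rel": "item", "href": "scene-b/scene-b.json"},
        ],
    })
    for scene in ("scene-a", "scene-b"):
        write_json(root / "landsat-8-l1" / scene / f"{scene}.json", {
            "type": "Feature",
            "id": scene,
            "properties": {"datetime": "2018-02-27T00:00:00Z"},
            "links": [{"rel": "parent", "href": "../catalog.json"}],
        })
    return root / "catalog.json"


def walk_items(path):
    """Yield every Item reachable from a catalog file via child/item links."""
    node = json.loads(path.read_text())
    for link in node.get("links", []):
        # Links are relative to the file that contains them
        target = (path.parent / link["href"]).resolve()
        if link["rel"] == "item":
            yield json.loads(target.read_text())
        elif link["rel"] == "child":
            yield from walk_items(target)


def demo_item_ids():
    """Build the demo catalog in a temporary directory and list its Item IDs."""
    with tempfile.TemporaryDirectory() as tmp:
        root_file = build_demo_catalog(Path(tmp))
        return sorted(item["id"] for item in walk_items(root_file))


if __name__ == "__main__":
    print(demo_item_ids())
```

Sat-stac wraps this kind of link-following in proper Catalog, Collection, and Item classes, so in practice you would use the library rather than hand-rolled traversal like this.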

The README covers use of the command line. The first tutorial covers creating and working with STAC static catalogs, while the second tutorial covers the use of the Python STAC classes.

sat-stac-landsat

The sat-stac-landsat repository is a util for creating a catalog of Landsat data from the original metadata. It contains functions for fetching and parsing the inventory of Landsat data on AWS, transforming its metadata to STAC Items, and adding those to a STAC static catalog. It is not meant to be used by end users, but instead is the library used to create and maintain this Landsat STAC Catalog. The catalog can also be browsed using STAC-browser.

Additionally, there is a deployed Lambda function that processes new Landsat data as it comes in and publishes the entire STAC Item to an SNS topic that can be subscribed to: `arn:aws:sns:us-west-2:552188055668:landsat-stac`
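If you subscribe your own Lambda function to that topic, each invocation receives the published payload as a JSON string inside the SNS event. The sketch below shows one way to unpack it, assuming the message body is the STAC Item itself, as described above. The `handler` function and the sample event are hypothetical and written by hand for illustration; they are not taken from the sat-stac-landsat code.

```python
import json


def items_from_sns_event(event):
    """Return the STAC Items carried by an SNS-triggered Lambda event."""
    items = []
    for record in event.get("Records", []):
        # SNS delivers the published payload as a JSON string under Sns.Message
        message = record["Sns"]["Message"]
        items.append(json.loads(message))
    return items


def handler(event, context=None):
    """Hypothetical Lambda entry point: report ID and cloud cover of each Item."""
    summary = []
    for item in items_from_sns_event(event):
        props = item.get("properties", {})
        summary.append((item["id"], props.get("eo:cloud_cover")))
    return summary


# Hand-written sample event; the Item content is illustrative only
SAMPLE_EVENT = {
    "Records": [{
        "EventSource": "aws:sns",
        "Sns": {
            "Message": json.dumps({
                "type": "Feature",
                "id": "LC80100292018058LGN00",
                "properties": {
                    "datetime": "2018-02-27T00:00:00Z",
                    "eo:cloud_cover": 4,
                },
            })
        },
    }]
}

if __name__ == "__main__":
    print(handler(SAMPLE_EVENT))
```

Because the whole Item arrives in the message, a subscriber can filter on properties such as cloud cover without making a second request to the catalog.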

sat-stac-sentinel

The sat-stac-sentinel repository does the same things as sat-stac-landsat, except that it is used for creating and maintaining this Sentinel STAC Catalog. The catalog can be browsed, and new Items are published to the SNS topic: `arn:aws:sns:eu-central-1:552188055668:sentinel-stac`.

sat-search

Perhaps the most useful of the sat-utils, sat-search is the util most likely to be used by an end user. While sat-api is the engine that allows access to the metadata, constructing requests by hand becomes unwieldy once they involve geometries and complex filters. Most users will want a convenient, programmatic way of using the API, and sat-search provides that. With it, via the command line tool or the Python library, users can easily query a STAC API endpoint, save the results, and download specific assets.

$ sat-search search --intersects aoi.geojson --datetime 2018-01-01/2018-03-30 -p "eo:cloud_cover<10" --save aoi-results.geojson --print-cal

The --print-cal keyword will output a nicely color-coded calendar to the terminal, and the --save keyword will save the Items and Collections as a GeoJSON FeatureCollection.
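Because the saved file is a GeoJSON FeatureCollection, it can be inspected with nothing more than the standard library. The following sketch narrows a set of results to an even lower cloud cover threshold. It assumes each feature carries `datetime` and `eo:cloud_cover` under `properties`; the `low_cloud_items` helper and the sample values are made up for illustration and stand in for the contents of a real aoi-results.geojson.

```python
import json


def low_cloud_items(feature_collection, max_cloud=10):
    """Return (id, datetime) pairs for features below the cloud cover limit."""
    selected = []
    for feature in feature_collection.get("features", []):
        props = feature.get("properties", {})
        cloud = props.get("eo:cloud_cover")
        if cloud is not None and cloud < max_cloud:
            selected.append((feature["id"], props.get("datetime")))
    # Sort chronologically; ISO 8601 strings sort correctly as text
    return sorted(selected, key=lambda pair: pair[1] or "")


# Stand-in for the parsed contents of aoi-results.geojson; values are made up
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "scene-a",
         "properties": {"datetime": "2018-02-27T15:30:00Z", "eo:cloud_cover": 8}},
        {"type": "Feature", "id": "scene-b",
         "properties": {"datetime": "2018-02-11T15:30:00Z", "eo:cloud_cover": 2}},
        {"type": "Feature", "id": "scene-c",
         "properties": {"datetime": "2018-01-26T15:30:00Z", "eo:cloud_cover": 4}},
    ],
}

if __name__ == "__main__":
    # With a real file: SAMPLE = json.load(open("aoi-results.geojson"))
    print(json.dumps(low_cloud_items(SAMPLE, max_cloud=5), indent=2))
```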

The aoi-results.geojson can now be used later, such as for downloading all the Red and Near-IR bands from the Items.

$ sat-search load aoi-results.geojson --download red nir --datadir '${date}/${collection}'

This will download just the two bands from all the scenes and save them in directories: first by date, then by collection. The files will be named by the ID of the Item, with a suffix indicating the asset name.

├── 2018-02-27
│   └── landsat-8-l1
│       ├── LC80100292018058LGN00_nir.TIF
│       └── LC80100292018058LGN00_red.TIF
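The layout shown above follows directly from the `--datadir` template. As a rough sketch of how such a template could be resolved (not sat-search's actual implementation), Python's `string.Template` happens to use the same `${name}` syntax. The `asset_path` helper is hypothetical, and it assumes the Item stores `datetime` and `collection` under `properties` and that the assets are `.TIF` files.

```python
from pathlib import PurePosixPath
from string import Template


def asset_path(item, asset_key, datadir="${date}/${collection}", extension=".TIF"):
    """Build the output path for one asset of a STAC Item (illustrative only)."""
    props = item["properties"]
    substitutions = {
        "date": props["datetime"][:10],      # keep only YYYY-MM-DD
        "collection": props["collection"],   # assumed location of the collection ID
    }
    directory = Template(datadir).substitute(substitutions)
    # File name is the Item ID plus a suffix naming the asset
    filename = f"{item['id']}_{asset_key}{extension}"
    return str(PurePosixPath(directory) / filename)


# Hand-written Item with only the fields this sketch needs
ITEM = {
    "id": "LC80100292018058LGN00",
    "properties": {
        "datetime": "2018-02-27T00:00:00Z",
        "collection": "landsat-8-l1",
    },
}

if __name__ == "__main__":
    for key in ("red", "nir"):
        print(asset_path(ITEM, key))
```

Swapping the template to `'${collection}/${date}'` would group the files by collection first and then by date.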

Note that Sentinel data is in a requester pays bucket, so by default downloading these files will fail. To acknowledge you are paying the egress costs when downloading, use the --requestor-pays switch. You will also need to make sure the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are defined.

$ sat-search load aoi-results.geojson --download red nir --datadir '${date}/${collection}' --requestor-pays

This is by no means comprehensive. The README covers use of the sat-search command line tool, while the Notebook tutorial includes details on using sat-search as a Python library.

Tools for the community

We will continue to update and evolve sat-utils, including taking advantage of new STAC features as they are released. We hope that sat-utils will help to energize STAC development and use by providing a set of STAC-compliant APIs and utilities that all work with one another. As a STAC 1.0.0 release gets closer, we aim to make these utils more mature and stable so that they can serve as building blocks in other applications.

Open source projects thrive when there is active community involvement. If you are interested in improving open source utilities for processing open imagery at scale, comment or contribute to any of the sat-utils repos or ping Matthew Hanson, Sean Harkins, or Vincent Sarago on Twitter.