Introducing Intake-stac
We’re excited to introduce a new Intake driver: Intake-stac. We think this tool will make it much easier to explore SpatioTemporal Asset Catalogs (STAC) and enable interactive data analysis and visualization in a Python environment.
Intake-stac provides Intake drivers that support opening STAC Catalogs, Collections, Items and ItemCollections. By combining Intake and sat-stac, Intake-stac provides a simple toolkit for working with STAC catalogs and for loading STAC assets as Xarray objects. Intake-stac can be installed via pip or conda-forge:
$ pip install intake-stac
# or
$ conda install -c conda-forge intake-stac
STAC
The STAC specification provides a common, machine-readable (JSON) format for describing a wide range of geospatial datasets. STAC’s goal is to make it easier to index and discover geospatial assets. An asset is any geospatial dataset that can be described by a spatial extent and time. The STAC project is in a state of rapid development: it is approaching its 1.0 release and quickly finding adoption across the cloud-native geospatial imagery community. You can find more on STAC, its specification, and its ecosystem of tools on the STAC website or in an introductory blogpost from Chris Holmes.
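To make the "machine-readable (JSON)" idea concrete, here is a minimal sketch of a STAC Item built as a plain Python dictionary. The top-level field names follow the STAC Item layout (a GeoJSON Feature with `assets` and `links`), but the helper `make_stac_item`, the item id, the bounding box, the date, and the `example.com` URL are all made up for illustration, and the version string is an assumption.

```python
import json


def make_stac_item(item_id, bbox, datetime_utc, asset_href):
    """Build a minimal STAC Item (a GeoJSON Feature) as a plain dict.

    Hypothetical helper for illustration; not part of any STAC library.
    """
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",  # assumed version string
        "id": item_id,
        # Spatial extent: a bounding box plus the matching footprint polygon.
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],  # ring closes on its first point
            ]],
        },
        # Temporal extent lives in the properties.
        "properties": {"datetime": datetime_utc},
        "links": [],
        # Assets point at the actual data files.
        "assets": {
            "thumbnail": {"href": asset_href, "type": "image/png"},
        },
    }


# All values below are invented for the example.
item = make_stac_item(
    "example-scene",
    [-95.5, 29.5, -95.0, 30.0],
    "2017-08-31T00:00:00Z",
    "https://example.com/example-scene/thumbnail.png",
)
print(json.dumps(item, indent=2))
```

A catalog is then essentially a tree of JSON documents like this one, linked together so that tools can walk from a root catalog down to individual assets.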
Using Intake-stac
While STAC provides a powerful and flexible standard for describing data, it doesn’t do much to load assets into memory for custom computations and analysis. That’s where Intake-stac comes in, providing a lightweight Intake driver that makes it easy to load data described in STAC catalogs. It does this by integrating functionality from a number of familiar open source tools, including Intake, rasterio, Xarray, and sat-stac. The examples below highlight some of Intake-stac’s functionality; for more details, we recommend the documentation site or the interactive Jupyter notebook available on Binder.
In the example below, we use Intake-stac to open the “planet-disaster-data” STAC catalog:
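A minimal sketch of that step, assuming `intake` and `intake-stac` are installed so that Intake exposes `open_stac_catalog`. The wrapper names `open_catalog` and `entry_names` are ours, and the catalog URL is left as a parameter because it depends on where the catalog is hosted.

```python
def open_catalog(catalog_url):
    """Open a STAC catalog as an Intake catalog.

    Requires intake and intake-stac; the import is kept inside the
    function so the module can be loaded without them.
    """
    import intake  # intake-stac registers the open_stac_* functions here

    return intake.open_stac_catalog(catalog_url)


def entry_names(catalog):
    """List the names of the entries (sub-catalogs or items) in a catalog."""
    # Intake catalogs iterate over their entry names, like a dict.
    return list(catalog)
```

With the URL of the catalog's root `catalog.json` in hand, `cat = open_catalog(url)` followed by `entry_names(cat)` shows what the catalog contains.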
Now we have an Intake Catalog where we can easily select a specific asset and load it as an Xarray Dataset.
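Selecting and loading might look like the sketch below. It assumes catalog entries can be indexed by name with `[]` and that an asset entry offers Intake's `to_dask()` method; the helper `load_asset` and the names in the usage note are hypothetical.

```python
def load_asset(catalog, path, asset_key):
    """Walk down a catalog by entry name, then lazily load one asset.

    catalog   -- an opened Intake-stac catalog (or anything indexable by name)
    path      -- sequence of entry names leading to the item of interest
    asset_key -- name of the asset on that item, e.g. "thumbnail"
    """
    entry = catalog
    for name in path:
        entry = entry[name]  # descend one level: collection, then item, ...
    # to_dask() returns a lazily loaded (dask-backed) Xarray object.
    return entry[asset_key].to_dask()
```

For example, `load_asset(cat, ["some-collection", "some-item"], "thumbnail")` would return an Xarray object for that item's thumbnail, with the placeholder names replaced by entries that actually exist in the catalog.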
Intake-stac also works well with the sat-search library, providing a powerful tool for dynamically searching, discovering, and loading data all in one place. Here we use sat-search to quickly identify all the Tier 1 Landsat scenes within a bounding box before selecting one scene and loading it as an Xarray object.
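A rough sketch of that workflow follows. The search keywords (`collection`, `bbox`, `sort`, `property`) and the `landsat-8-l1` collection name are assumptions based on the sat-search releases of the time and may differ in other versions; the helper names are ours, and the search endpoint sat-search talks to is left at its default.

```python
def tier1_landsat_search_params(bbox):
    """Build search keywords for Tier 1 Landsat scenes inside a bounding box.

    bbox is [west, south, east, north] in degrees. The keyword names and
    the collection id are assumptions and may vary between sat-search versions.
    """
    west, south, east, north = bbox
    if not (west < east and south < north):
        raise ValueError("bbox must be [west, south, east, north]")
    return {
        "collection": "landsat-8-l1",      # assumed collection id
        "bbox": [west, south, east, north],
        "sort": ["<datetime"],             # earliest scene first
        "property": ["landsat:tier=T1"],   # keep Tier 1 scenes only
    }


def search_and_open(params):
    """Run the search and wrap the matching items in an Intake catalog.

    Requires sat-search, intake, and intake-stac, plus network access.
    """
    import intake
    import satsearch

    results = satsearch.Search.search(**params)
    return intake.open_stac_item_collection(results.items())
```

Calling `catalog = search_and_open(tier1_landsat_search_params(bbox))` gives a catalog with one entry per matching scene; from there, `list(catalog)` shows the scene ids, and calling `to_dask()` on one of a scene's assets loads it as an Xarray object.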
Next steps and conclusions
We’re excited by what we can now do with STAC and Intake together. In the next phase of the project, we’re looking for feedback on a few things:
- First, we want to hear how people are using Intake-stac, what works and what doesn’t. Please get in touch on our issue tracker.
- Second, we are interested in exploring additional integrations with related projects; for example, we recently added some lightweight (experimental) support for interacting with GeoPandas and we’re considering how to enable Intake’s GUI browser for STAC data.
- Finally, we are starting to look into translating collections of assets into high-level data objects (i.e., Xarray Datasets). This already works when stacking multi-band images into a single DataArray, but there is a lot of room to explore new representations of asset collections.
Thanks for reading and for trying out Intake-stac. Be in touch!
Acknowledgements
Building Intake-stac was a team effort. I want to specifically thank Matthew Hanson (Element84), Anderson Banihirwe (NCAR), Julia Signell (Anaconda, now SaturnCloud), Jonah Joughin (UW), and Scott Henderson (UW). The development of Intake-stac was supported in part by NASA-ACCESS grant #80NSSC18M0156.