The State of STAC talk and Sprint #3 recap
Following up on the STAC 0.6.0 release, I just wanted to share a talk that I gave three months ago. And I figured I’d also include the recap on STAC Sprint #3 that I meant to write up months ago.
The talk was given as a keynote to the full Satellite Data Interoperability Workshop, so the majority of the audience was actually Analysis Ready Data people who weren’t deep in the weeds with STAC. This meant I focused on a higher level overview of the spec and the community, recounting how we’ve evolved and where we are at. Or at least where we were at 3 months ago, as there has been a ton of progress since then. The video should be embedded below:
You can also see my slides directly on Google Docs. Feel free to reuse and share them, as the talk is CC-BY 4.0 licensed.
STAC Sprint #3 Recap
With the third STAC Sprint we really hit our groove with advancing the specification. The first sprint was amazing, but was mostly laying the groundwork of what we were even doing together. The second one brought significant advances, but only one day was fully dedicated to STAC, and we still were figuring out how to work together.
Unfortunately for this sprint we have fewer notes to point to than in the previous sprints. Part of that is that I didn’t have the time to write up the recap when it was fresh in my mind. But the larger reason is actually a positive one: for the first time the majority of the work happened inside the GitHub repository, in the form of making issues, editing documents, writing code and merging pull requests. This makes it easier for anyone to follow the evolution of the specification in one place, instead of having to track down different repositories with notes.
We had about 22 participants in the sprint, and it was a great mix of those who had taken part in one or both of the previous sprints, along with a number of new faces. We had representatives from SpaceNet, Azavea/Raster Foundry, CBERS, DigitalGlobe, Harris, Planet, Development Seed, Element84, Hexagon, Radiant Earth Foundation, PCI Geomatics, UC Davis, Boundless, OpenAerialMap, Astraea, OpenEO, Descartes Labs, GeoScience Australia and Vulcan. So it was a really broad group bringing a number of diverse experiences.
As in the past, we tried to move out in parallel, breaking into a few different groups and coming together periodically to work through bigger pieces together. Overall the format worked pretty well, and I believe we accomplished more by splitting up than we could have in one big session. You can see the full agenda and groups in the community sprint repo.
The Static STAC group worked through a number of issues together to tighten up many different aspects of the specification, including how to approach STAC Items that are derived from others. Interestingly the STAC API group came up with the /stac/ catalog endpoint, which collapses the dichotomy between ‘static’ and ‘api’ versions of the specification. So we likely will not divide along those lines in future sprints. They also improved the process of editing the OpenAPI docs and introduced CircleCI for continuous integration.
The Client & Testing Tools group worked on a few different things, but the most notable was a validation engine. SparkGeo has continued to evolve that work after the sprint, and it now runs as part of the CircleCI setup. They also just went even further and released STACLint for online validation.
The Collection Level Searching group drafted the new STAC Collection Spec, a new part of our mini-suite of specs, to describe a set of related STAC Items. It is also used independently, to simply describe collections of geospatial data, even if they are not represented by STAC Items.
I was part of the Website & Outreach group, and I was quite happy with the progress we made towards a nice website to explain STAC to newcomers. We also had people sprinting on their individual implementations of STAC, giving us feedback as they hit problems.
Overall we made great progress, and it was awesome to connect in person. It was also really nice being co-located with the Analysis Ready Data workshop, as we had nice intermingling between people at the breaks and in the evening events. I’m not sure when we will have the next in-person Sprint. I am really excited that we are making progress on the spec by just working online and in calls, so we might experiment with a ‘virtual sprint’. But nothing can beat the connections that can be made in real life, so I’m sure we’ll organize another one before too long.