GeoParquet 1.0.0 Released

Radiant Earth
Radiant Earth Insights
3 min read · Sep 18, 2023

By Chris Holmes, Product Architect at Planet, Board Member at Open Geospatial Consortium, and Technical Fellow at Radiant Earth

The GeoParquet community is pleased to announce the release of GeoParquet 1.0.0. This is a huge milestone, indicating that the format has been implemented and tested by lots of different users and systems and the core team is confident that it is a stable foundation that won’t change. More than 20 different libraries and tools support the format, and hundreds of gigabytes of public data are available in GeoParquet, from a number of different data providers.

Why GeoParquet?

I gave a good bit of the backstory in the beta.1 announcement, but the main driving push has been to settle on one standard way to encode geometries in the Apache Parquet format. The immediate goal has been to enable spatial interoperability among the set of modern data science tools (BigQuery, Snowflake, Athena, DuckDB, etc.) that leverage Parquet to great effect and increasingly have geospatial support. Though most of those do not yet support GeoParquet, it is likely on many of their roadmaps, and providing the stable base of 1.0.0 should make it even easier for them to adopt.

But in the meantime, GeoParquet has emerged as just a great geospatial format, with support in many geospatial libraries and tools, that I think has the potential to be a core Cloud-Native Geospatial distribution format and a go-to for any day-to-day geospatial work.
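To make "one standard way to encode geometries" a little more concrete, here is a minimal sketch using only the Python standard library. It assumes the layout described in the GeoParquet 1.0.0 specification: geometries stored as WKB in an ordinary binary column, plus a JSON document under the "geo" key of the Parquet file metadata. The function names `point_to_wkb` and `geo_metadata` are hypothetical helpers invented for this illustration, and the sketch does not write an actual Parquet file.

```python
import json
import struct


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB.

    Layout: 1 byte for byte order (1 = little-endian), a 4-byte unsigned
    geometry type (1 = Point), then x and y as 8-byte doubles.
    """
    return struct.pack("<BIdd", 1, 1, x, y)


def geo_metadata(geometry_column: str = "geometry") -> str:
    """Build the JSON string stored under the "geo" file-metadata key.

    Only the fields assumed to be required by the 1.0.0 spec are included;
    optional fields such as "crs" or "bbox" are left out of this sketch.
    """
    meta = {
        "version": "1.0.0",
        "primary_column": geometry_column,
        "columns": {
            geometry_column: {
                "encoding": "WKB",
                "geometry_types": ["Point"],
            }
        },
    }
    return json.dumps(meta)


wkb = point_to_wkb(-122.4, 37.8)
print(len(wkb), "bytes of WKB")
print(geo_metadata())
```

In practice a library would take care of both pieces. The point of the sketch is only that a reader which understands Parquet plus this small metadata convention can interpret the geometry column.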

Faster and Smaller

The core reason it’s becoming everyone’s favorite new format is that it’s simply faster and smaller than the competition. I wrote a blog post on some testing I did exploring write performance and file size, and intend to make some testing tools for read performance as well. For those that don’t want to read the full post, a typical file size comparison is:

The main reason for this is that Parquet is compressed by default. The other formats can be zipped up, but then they aren’t actually usable until you unzip them. GeoParquet’s speed is also quite impressive compared to other formats, mostly due to the fact that it’s a columnar format instead of a row-oriented one, and has a large ecosystem of tools that have really optimized its performance.
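The difference between a row-oriented and a columnar layout can be sketched with a toy table in plain Python. This is only an illustration under stated assumptions: the table is made up, `zlib` stands in for whichever codec a real Parquet writer uses, and the helper names (`row_layout`, `column_layout`, `read_categories`) are hypothetical. Real Parquet files add row groups, per-column encodings such as dictionary and run-length encoding, and statistics, none of which appear here.

```python
import struct
import zlib

# Hypothetical toy table of (id, category, value) rows; not real GeoParquet data.
N = 1000
rows = [(i, i % 4, (i % 10) * 0.5) for i in range(N)]


def row_layout(rows):
    """Row-oriented: each record's fields are stored next to each other."""
    return b"".join(struct.pack("<iBd", *r) for r in rows)


def column_layout(rows):
    """Column-oriented: all ids, then all categories, then all values."""
    ids, cats, vals = zip(*rows)
    n = len(rows)
    return (
        struct.pack(f"<{n}i", *ids)
        + struct.pack(f"<{n}B", *cats)
        + struct.pack(f"<{n}d", *vals)
    )


def read_categories(col_bytes, n):
    """Read only the category column by slicing its byte range."""
    start = 4 * n  # skip the n 4-byte ids that precede the categories
    return list(struct.unpack(f"<{n}B", col_bytes[start:start + n]))


row_bytes = row_layout(rows)
col_bytes = column_layout(rows)

# Both layouts hold the same number of raw bytes; only the ordering differs.
print("raw bytes:", len(row_bytes), len(col_bytes))
# Compare how well a general-purpose compressor does on each ordering.
print("zlib bytes:", len(zlib.compress(row_bytes)), len(zlib.compress(col_bytes)))
```

Grouping similar values together tends to help a compressor, and a query that needs one column can read just that column's byte range, as `read_categories` does, instead of scanning every record.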

The GeoParquet Ecosystem

I think the most impressive thing about GeoParquet is how robust the ecosystem has become, before we even got to 1.0.0. I fully believe this is just the start, and that in no time at all it’ll be weird for a geospatial tool to not support GeoParquet, and many non-geo tools will have it as their only native geospatial option. I’ll do a full post on the amazing ecosystem soon, but you can get a quick sense from the list of tools and libraries on geoparquet.org:

We’re also starting to see data providers like Microsoft, Maxar, Planet, Ordnance Survey and others put new data in GeoParquet. The community is also converting a number of interesting large-scale datasets, like the Google Open Buildings and Overture Maps data, to GeoParquet on Source Cooperative.

What’s next?

The release of 1.0.0 is truly just the beginning. We’re taking it through the Open Geospatial Consortium’s full standardization process, and we’ve started forming an official GeoParquet Standards Working Group. We hope to move through the standardization process relatively quickly, to become a full official OGC Standard.

There is also a lot of activity on the GeoArrow specification, which will form the basis of a columnar geometry format for GeoParquet 1.1.0. That has a lot of potential to make the format and tools around it even more performant.

We’re excited to see this community grow, with more data, more tools, and more innovation. Please help us by converting data into GeoParquet, demanding GeoParquet from your data providers, and building tools to use it. And let us know when you do, so everyone can keep track of the growth of this exciting community.

Originally published at https://cloudnativegeo.org on September 18, 2023.

