COG Talk — Part 1: What’s new?

This post is the first in a series called COG Talk, which looks at ways to use Cloud Optimized GeoTIFFs and why we use them.

Vincent Sarago
Development Seed


remotepixel-tiler uses rio-tiler to dynamically create Web Map tiles from Landsat-8 data hosted on AWS.

For more than a year, we’ve been building out a suite of tools to make Cloud Optimized GeoTIFFs (COGs) easy to work with. Today we are excited to announce the release of version 1 of rio-tiler and rio-cogeo 🎂!

Both modules are:

  • well tested
  • actively maintained
  • compatible with Python 2 and Python 3
  • easy to install (thanks to rasterio wheels)

COGs — The Basics

Let’s start with a quick refresher on the COG specification:

COGs are powerful because of how the data is structured internally. If done properly, the data can be accessed via HTTP range requests, meaning you can read only a small portion of a file instead of downloading the whole thing. This matters because the size of an individual block of data within the image can be small and easy to download with a simple GET request. To enforce this, COGs bigger than 1024 pixels by 1024 pixels have to be internally tiled.

The metadata header has a specific structure (by construction) and holds the Image File Directories (IFDs), which describe the image and how its data blocks (internal tiles) are laid out. The IFD is critical to a COG because it records where each internal tile sits in the file (TileOffsets) and how large it is (TileByteCounts). This means that by fetching only the first few bytes of the file we can construct an internal map of the data.
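To make the role of TileOffsets and TileByteCounts concrete, here is a minimal sketch of how a reader could turn a pixel position into a single HTTP range request. It is not code from rio-tiler or GDAL: the function names and the offset values are made up for illustration, and it assumes a pixel-interleaved file where tiles are stored in row-major order.

```python
from typing import List


def tile_index(col: int, row: int, width: int, tile_size: int = 512) -> int:
    """Return the index of the internal tile containing pixel (col, row)."""
    tiles_across = -(-width // tile_size)  # ceiling division
    return (row // tile_size) * tiles_across + (col // tile_size)


def range_header(offsets: List[int], byte_counts: List[int], index: int) -> str:
    """Build the HTTP Range header value for one internal tile."""
    start = offsets[index]
    end = start + byte_counts[index] - 1  # both ends of a byte range are inclusive
    return f"bytes={start}-{end}"


# Made-up TileOffsets / TileByteCounts for a 1024x1024 image
# stored as four 512x512 internal tiles.
offsets = [4096, 150000, 310000, 480000]
byte_counts = [145000, 159000, 168000, 152000]

idx = tile_index(col=700, row=100, width=1024)
header = range_header(offsets, byte_counts, idx)
print(idx, header)
```

Sending a GET request with that header would return only the bytes of the one internal tile covering the pixel, which then still needs to be decompressed.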

The other (optional) feature is the overview. By adding internal overviews (reduced resolution versions of the raw data), we can now preview the data using fewer range requests.
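As a rough illustration of what a set of overviews looks like, the sketch below lists the reduced-resolution sizes obtained by repeatedly halving an image. The stopping rule (stop once the overview fits inside one 512 px tile) and the function name are assumptions made for this example, not necessarily what rio-cogeo or GDAL do.

```python
from typing import List, Tuple


def overview_sizes(
    width: int, height: int, tile_size: int = 512
) -> List[Tuple[int, int, int]]:
    """List (decimation factor, width, height) for each overview level.

    The image is halved until it fits inside a single internal tile.
    """
    levels = []
    factor = 1
    w, h = width, height
    while max(w, h) > tile_size:
        factor *= 2
        w = -(-width // factor)   # ceiling division
        h = -(-height // factor)
        levels.append((factor, w, h))
    return levels


levels = overview_sizes(8000, 6000)
for factor, w, h in levels:
    print(f"1/{factor}: {w} x {h} px")
```

For an 8000 x 6000 px image this yields four levels, the smallest being 500 x 375 px, so a zoomed-out preview can be served from a single small block instead of the full-resolution data.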

Refs: https://github.com/cogeotiff/cog-spec/blob/master/spec.md

Rio-cogeo

$ pip install rio-cogeo~=1.0

While Cloud Optimized GeoTIFFs are beginning to see wider use, creating them can still be a tricky process, and when we started working on rio-cogeo there wasn’t an easy standalone solution. The goal was to build a simple yet powerful CLI to create and validate COGs.

COG creation

BEFORE

# Add overviews
$ gdaladdo in.tif
# Enforce internal tiling, add compression and re-organize internal structures
$ gdal_translate in.tif cog.tif -co TILED=YES -co COPY_SRC_OVERVIEWS=YES -co COMPRESS=DEFLATE

NOW - with rio-cogeo

$ rio cogeo create in.tif cog.tif

rio-cogeo does the exact same thing as the GDAL commands (creating overviews, tiling and compressing) but it also provides seven different profiles to help the user choose the best configuration for their needs. Each profile can be extended using the --co options.

Web Optimized COG (WOG?!)

One feature we found valuable to add is the --web-optimized option, which creates a web-tiling-friendly COG. It aligns the internal tiles with the Web Mercator grid and makes the overview levels match the standard slippy map zoom levels. This is similar in concept to MBTiles, with the advantage of allowing fast partial reads of remote data.
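To give a feel for what matching zoom levels means, the following sketch computes the ground resolution of each slippy map zoom level and picks a zoom for a given pixel size. It uses the standard Web Mercator constants. The function names are hypothetical, and rounding up to the next finer zoom is an assumption for this example, not a statement of how rio-cogeo chooses.

```python
import math

EARTH_CIRCUMFERENCE_M = 2 * math.pi * 6378137  # WGS84 equatorial radius in metres


def zoom_resolution(zoom: int, tile_size: int = 256) -> float:
    """Metres per pixel at the equator for a slippy map zoom level."""
    return EARTH_CIRCUMFERENCE_M / (tile_size * 2 ** zoom)


def max_zoom_for(resolution_m: float, tile_size: int = 256) -> int:
    """Smallest zoom level whose pixels are at least as fine as the data."""
    zoom = math.log2(EARTH_CIRCUMFERENCE_M / (tile_size * resolution_m))
    return max(0, math.ceil(zoom))


# 30 m is the nominal resolution of Landsat-8 multispectral bands.
landsat_zoom = max_zoom_for(30)
print(landsat_zoom, round(zoom_resolution(landsat_zoom), 2))
```

Each overview level then halves the resolution, which is exactly the step between two consecutive zoom levels.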

Interpretation of the specification

While rio-cogeo respects the COG specifications, by default this plugin enforces features like:

  • Internal overviews (users can disable them with --overview-level 0)
  • 512x512 px internal tiles (can be overridden with --co options)

Example

rio-cogeo has a nice CLI, but it can also be used directly inside your own scripts. Check out sentinel-2-cog to see how you could convert the whole Sentinel-2 catalog for $90K.

COG validation

The other feature we wanted to add was a validation option. Until now, people have had to rely on downloading the standalone script validate_cloud_optimized_geotiff.py or using Radiant Earth’s hosted version. Now you can easily validate with a single command:

$ rio cogeo validate cog.tif 

Rio-tiler

$ pip install rio-tiler~=1.2

Before creating rio-cogeo we started working on rio-tiler, a library to improve the ability to visualize COGs available on AWS Public Datasets. rio-tiler is generally used as part of a web map server to dynamically generate a map tile from an underlying COG source file (rather than generating tiles beforehand). Initially the library was built for specific satellites (Landsat-8, then Sentinel-2 and CBERS-4), but it can now be used with any COG.

Get Mercator tile from a cloud hosted file
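Before any pixels are read, a dynamic tiler has to work out which area a requested tile covers. The sketch below shows that step using only the standard library and the usual slippy map formulas. It is an illustration under those assumptions, not rio-tiler's own code, and tile_bounds is a hypothetical name.

```python
import math
from typing import Tuple


def tile_bounds(x: int, y: int, z: int) -> Tuple[float, float, float, float]:
    """Return (west, south, east, north) in degrees for slippy map tile x/y/z."""
    n = 2 ** z  # number of tiles along each axis at this zoom

    def lon(col: int) -> float:
        return col / n * 360.0 - 180.0

    def lat(row: int) -> float:
        # Inverse of the Web Mercator projection for a tile row boundary
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    # Tile rows count from the top, so row y is the northern edge.
    return lon(x), lat(y + 1), lon(x + 1), lat(y)


west, south, east, north = tile_bounds(0, 0, 0)
print(west, south, east, north)
```

With these bounds in hand, the tiler can read just the matching window of the COG (picking an overview when the tile is zoomed out) and resample it to the output tile size.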

Features in rio-tiler~=1.0

  • Rasterio 1.0
  • Support for the Landsat-8, CBERS-4 and Sentinel-2* AWS Public Datasets
  • Better image encoding using GDAL (previously done using Pillow)
  • Colormap for output tile image (rio-tiler can apply pre-defined or custom colormap on the output tile image)
  • Expression support for band ratios (e.g. expr=(b1-b2)/(b1+b2))
  • Statistical functions (get min/max/histogram)

(see Changelog)

*Sentinel-2 data on AWS is stored as JPEG2000 in a requester-pays bucket, so users pay for each tile request.
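The band-ratio expression from the feature list, (b1-b2)/(b1+b2), is a normalized difference computed per pixel. As a minimal sketch of what such an expression evaluates to, here it is on plain Python lists. rio-tiler itself works on arrays, and both the function name and the choice to return 0.0 where the denominator is zero are assumptions made for this example.

```python
from typing import List, Sequence


def normalized_difference(b1: Sequence[float], b2: Sequence[float]) -> List[float]:
    """Evaluate (b1 - b2) / (b1 + b2) pixel by pixel.

    Pixels where b1 + b2 is zero are set to 0.0 (an arbitrary choice here).
    """
    return [
        (p1 - p2) / (p1 + p2) if (p1 + p2) != 0 else 0.0
        for p1, p2 in zip(b1, b2)
    ]


# Three made-up pixel values for two bands.
b1 = [0.5, 0.4, 0.0]
b2 = [0.1, 0.4, 0.0]

ratio = normalized_difference(b1, b2)
print(ratio, min(ratio), max(ratio))
```

The minimum and maximum of the result are the kind of statistics needed to rescale the ratio to an 8-bit range before applying a colormap.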

rio-tiler plays a major role in most of our dynamic tiler related projects. With the release of 1.0, we have a solid baseline to move forward with other features, so stay tuned and subscribe to the rio-tiler repo to follow our progress.

Community

Thanks to our friends at Mapbox, we agreed to move rio-tiler and rio-cogeo to a new organization: cogeotiff. We’re partnering with Chris Holmes to build out an ecosystem of open source tools around COGs, and we invite anyone interested in contributing to reach out on Twitter or comment on our repos.
