COG Talk — Part 2: Mosaics

This blog is the second in a series called COG Talk, which looks at ways to use Cloud Optimized GeoTIFF, and why we use them.

Vincent Sarago
May 23 · 7 min read
Multiple high resolution Cloud Optimized GeoTIFF hosted on OpenAerialMap (link)

COG vs Map Tiles

Cloud Optimized GeoTIFF files, as the name implies, are specifically designed for easily accessing remote raster data. Because of the internal tiling and internal overviews, people often ask: can COGs replace map tiles? The usual response is: yes, but

rio-tiler-mosaic

Pixel selection

Creating map tiles using COGs means we are dynamically generating the image array in response to a tile request. When working with mosaics, it also means we need to choose which pixel we display when we have an overlapping dataset. For a given tile, we iterate over all intersecting input images to decide which pixel to choose from. By default, rio-tiler-mosaic provides four different pixel selections rules:

  • Brightest: loop through all images and return the highest pixel value
  • Darkest: loop through all images and return the lowest pixel value
  • Last: take the pixel in the last matching image and return when the tile is full
Pixel selection methods applied on Landsat-8 NDVI values for all 2018 observations over Montreal area.

Smart multi-threading

For each pixel selection method, we need to either process the whole stack of images or just the first few until the array is full. We are using multi-threading to fetch and read data in parallel to speed up the process. Because the First and Last pixel-selection methods should return as soon as the tile is totally filled, we implemented a partial multi-threading approach by processing chunks of assets in parallel instead of the full list (see code). This is particularly handy when the list of assets is long.

Usage

The mosaic tile handler is designed to roughly match rio-tiler’s tile handlers and returns tile and mask arrays. Here is an example of generating a mosaic tile.

rio-tiler-mosaic takes a list of image files (assets), a tiler handler(customor from rio-tiler) and Web Mercator XYZ indices as input.

mosaicJSON

In addition to rio-tiler-mosaic, today we are releasing a specification for representing Web Mercator mosaics constructed from multiple observations. The mosaicJSON specification can be seen as the GDAL VirtualRaster ( VRT), but for indexing files by Web Mercator quadkeys. Quadkeys (a contraction of quadtree and key) are "one-dimensional strings" representing unique Z-X-Y (level-row-column) Web Mercator map tiles.

COG footprint and Quadkey index
  • simple JSON format (enabling high ratio compression)

Specification

https://github.com/developmentseed/mosaicjson-spec/blob/master/0.0.1/Readme.md

Example of implementation

Here is a simple implementation of mosaic tiling using the specification. On each tiler's call, our handler ( my_mosaic_handler) function fetches the mosaic file and looks for the assets indexed by the Web Mercator parent tile (at minzoom level) for the input XYZ tile. We then use the rio-tiler-mosaic to combine the different image assets into one tile.

mosaic handler: get asset list for a specific ZXY tile request.

Creating a mosaicJSON definition

Now that we know how to use it, let’s create a new mosaicJSON definition. Let’s say we have 28 files (e.g. Facebook population density) covering almost the whole continent of Africa.

High-resolution population density maps COGs from Facebook AI (link)

Quadkey base zoom (or mosaic min-zoom):

To construct the spatial index we need to define a minimal set of quadkeys. By definition, all COGs intersect with the tile 0-0-0 (zoom 0) but we can do a little math to minimize the set further. Inspecting the input TIFs, we see that the native resolution is around 0.00027 degrees with 8 levels of overviews. The native resolution corresponds to Web Mercator zoom level 12 so we can guess that we should start tiling at minzoom = 4 (12 - 8 = 4). We'll use this as the base zoom for quadkey keys.

# Create Footprint$ parallel -j4 rio bounds ::: $(aws s3 ls opendata.remotepixel.ca/facebook/ --recursive | grep ".tif" | awk '{print "s3://opendata.remotepixel.ca/"$NF}') | jq -c '.features[0]' | fio collect > facebook.geojson# Find ourZoom 4 quadkeys(there are 13) $ cat facebook.geojson | supermercado burn 4 | mercantile quadkey | paste -s -d"," -0330,0331,1220,1221,0333,1222,1223,3000,3001,3010,3002,3003,3012
Zoom 4 Mercator tiles intersecting with the facebook population dataset.
Python code sample showing how to create a mosaic definition file.

cogeo-mosaic: a CLI and a Serverless stack to create and use mosaicJSON

https://github.com/developmentseed/cogeo-mosaic
$ pip install https://github.com/developmentseed/cogeo-mosaic$ cogeo-mosaic create my_list_of_files.txt -o mosaic.json

Is mosaicJSON appropriate for all mosaic map tile problems?

Obviously not, but we have been testing this solution on different projects and we find it simple and fast in most cases. That said, here are the pro/cons:

# Get number of tiles from zoom 4 to zoom 12 using https://github.com/mapbox/supermercado/pull/26$ cat facebook.geojson | supermercado burn 4..12 | sort | uniq | wc -l  613 456
  • mosaicJSON files can be relatively small (especiallyif using compression) and can be cached to increase performance.
  • Tile creation can take several seconds when using darkest/brightest methods (becauseit has to read all the COGs).
  • Only supports theWeb Mercator projection (as for rio-tiler).
  • Generating mosaicJSON files can be slowfor large areas with many images.

Community

Both the mosaicJSON specification and rio-tiler-mosaic are published on Github and we welcome any feedback and/or contributors. Please feel free to ping me on Twitter @_VincentS_ if you have questions or want to hear more about the work we are doing to make open data more open and easier to use.

Demo

We built a simple demo page where you can explore some mosaicJSON examples and even share your own: https://bl.ocks.org/vincentsarago/raw/815884188c243b636ab8d927d8942a4d/

Hundreds of mercator tiles created dynamically based on mosaicJSON definition.

Development Seed

To understand a changing planet we create, analyze and distribute massive amounts of data

Vincent Sarago

Written by

Creator of @RemotePixel

Development Seed

To understand a changing planet we create, analyze and distribute massive amounts of data