Building a fast on-the-fly tile server with GeoTrellis
The monitoring platform
Geoalert has recently launched a new urban monitoring platform that uses satellite imagery to collect data about any region on the Earth. It aims to keep things simple for the user and requires a minimum of input — all you need to do is draw an area of interest and press the button. But under the hood the data has a long way to travel through stages of the pipeline. One of the challenges that we faced was reducing processing time for improved user experience.
In this article we will focus on our raster tile server, which is used to display satellite imagery on the map. For this purpose we use GeoTrellis from Azavea, a fast geographic data processing engine written in Scala. Initially we took the traditional approach, which includes a preparatory ingest stage where the source image is cut into pyramided tiles and indexed with a Z-curve or another spatial indexing method. Serving tiles is then merely a matter of selecting the target tile and sending its content over the network. However, ingest is a resource-consuming operation and adds to the overall pipeline processing time. So eventually we came up with the idea of skipping the ingest stage and serving tiles on the fly from a one-piece image.
For this to work well, the source raster has to be a Cloud Optimized GeoTIFF (COG), which implies that it has an internal tile layout and an appropriate set of overviews. To render a tile from a COG we only need to read a small portion of the image. Ideally, we could extract the requested tile from a COG just as if it were a pre-ingested layer. In reality, though, if the COG's internal layout doesn't match the map's tile layout, we will still need to read some excess data and perform resampling on the fly.
Below is the code we used to build our first prototype. This article contains only the most relevant snippets of code, but a complete sample project is available in this GitHub repository.
This worked, but didn't live up to our expectations performance-wise. It ran smoothly for tiffs up to 300 MB, but as we started to feed it larger files, the throughput decreased significantly. Strange, though: in theory we wouldn't expect any correlation with file size, right? After all, for a tile of a certain size we need to read and process a certain amount of data, regardless of the file size. That's true, but only if we know where that specific portion of data resides in the file.
A little research revealed the following. When asked to read data from a COG, GeoTrellis starts off by collecting info about the internal structure of the tiff. This metadata, stored in the GeoTiffInfo class, allows GeoTrellis to efficiently navigate the COG and retrieve data via lazy chunks of bytes.
Typically, to display a raster layer on a map we need to process a whole bunch of http requests for all the tiles within the area. The problem is that GeoTiffInfo is collected and allocated in memory for every such request, even though we are querying the same file over and over again. Besides, I did some profiling and found that it takes GeoTrellis more effort to collect this metadata than to actually read the pixel values and render a tile. The larger the tiff, the bigger the difference.
We decided to cache GeoTiffInfo and find some not-so-ugly way to make GeoTrellis reuse it. Or even better, to cache the entire GeoTiff class, which is built on top of GeoTiffInfo. Please don't imagine that we are up to loading the entire tiff into memory: GeoTiff will rely on lazy byte segments to fetch data from the image. For this purpose we created a custom implementation of
For the sake of brevity we won't focus on how the Cache class is implemented. You may use Guava or any other third-party library. In my sample GitHub project I implemented a minimalistic cache on top of LinkedHashMap, but for the real thing you will likely want something else.
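To make this concrete, here is a minimal sketch of what such a LinkedHashMap-based LRU cache can look like. The class and method names are illustrative rather than copied from the sample project; a real implementation would also need to think about releasing resources on eviction.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

// A minimalistic LRU cache in the spirit of the one in the sample project.
// Names are illustrative; assumes values are never null.
final class Cache[K, V](maxEntries: Int) {
  // accessOrder = true orders entries by last access;
  // removeEldestEntry then evicts the least recently used one.
  private val underlying = new JLinkedHashMap[K, V](16, 0.75f, true) {
    override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
      size() > maxEntries
  }

  def getOrElseUpdate(key: K, load: => V): V =
    Option(underlying.get(key)).getOrElse {
      val value = load // e.g. parse GeoTiffInfo / build a GeoTiff
      underlying.put(key, value)
      value
    }

  def size: Int = underlying.size()
}
```

The key trick is the three-argument LinkedHashMap constructor: with accessOrder set to true the eldest entry is the least recently *used* one, so overriding removeEldestEntry gives LRU eviction almost for free.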
GeoTiff is not thread-safe, because it references a stateful ByteReader that is used to load lazy data. A ByteReader must not be operated on concurrently. This means that if your app employs parallel threads to process http requests (and chances are that it does), you will have to maintain a pool of caches rather than a single cache. Again, to keep things simple, I just wrapped the cache in a ThreadLocal and let every thread use its own instance. A proper production-ready design requires a more sophisticated solution, e.g. a pool.
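The per-thread arrangement can be sketched as follows. The TiffCache alias is a hypothetical stand-in for the cache of GeoTiffs; the point is only that each request-handling thread gets its own cache instance, so the non-thread-safe cached objects are never shared across threads.

```scala
import scala.collection.mutable

object TiffCaches {
  // Stand-in for a cache of GeoTiffs keyed by file path (illustrative).
  type TiffCache = mutable.Map[String, AnyRef]

  // Each thread lazily gets its own, initially empty cache.
  private val perThread: ThreadLocal[TiffCache] =
    ThreadLocal.withInitial[TiffCache](() => mutable.Map.empty[String, AnyRef])

  // Always resolves to the calling thread's own instance.
  def current: TiffCache = perThread.get()
}
```

The obvious cost of this simplicity is duplication: N worker threads hold up to N copies of the same cached tiff metadata, which is exactly why a pool would be a better production design.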
Now we are satisfied with our tile server. In fact, once the cache is hot, it is fast enough to compete with the more traditional "static" version, and performance stays constant regardless of the COG size. The main benefit is that we reduced data preparation time to zero while retaining a good tile serving speed.
When to use this approach
The suggested approach is a good fit if you don't want to spend time and resources on layer ingest. The prerequisite is that your source imagery is a tiled tiff with a set of overviews. Also, keep in mind that when a tile is requested from a new tiff for the first time, the cache is cold and the request will take longer to complete. This could be a problem for large tiffs (say, 10 GB or more).
What could be improved
You must have noticed that for multithreading purposes we are creating a number of identical cached GeoTiffs just to guarantee that no ByteReader is accessed by threads concurrently. It would be more efficient to reuse a single instance but supply a separate ByteReader for each thread. Currently the design of the LazySegmentBytes class leaves no way to achieve this.
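To illustrate the design we would prefer, here is a hypothetical sketch; none of these names are GeoTrellis API. The expensive, immutable metadata is parsed once and shared by all threads, while each thread gets its own cheap, stateful reader over it.

```scala
// Parsed once, immutable, safe to share across threads (illustrative).
final case class TiffMetadata(segmentOffsets: Vector[Long])

// Stateful and cheap to construct: one instance per thread.
final class SegmentReader(meta: TiffMetadata) {
  private var position: Long = 0L
  def seek(segment: Int): Unit = position = meta.segmentOffsets(segment)
  def currentPosition: Long = position
}

object SharedTiff {
  // One shared metadata instance (offsets here are made up)...
  private val meta = TiffMetadata(Vector(0L, 4096L, 8192L))
  // ...but a separate reader per thread, so no reader is ever
  // operated on concurrently.
  private val readers: ThreadLocal[SegmentReader] =
    ThreadLocal.withInitial[SegmentReader](() => new SegmentReader(meta))
  def reader: SegmentReader = readers.get()
}
```

This keeps the memory cost of N threads at one copy of the metadata plus N tiny reader objects, instead of N full copies of the cached GeoTiff.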
There's another thing. GeoTrellis provides the GeoTiffLayerReader trait, which relieves us of the necessity to write most of the code in CogService.scala. It also represents a higher level of abstraction and provides an easy way to query a directory (or bucket) of unstructured tiffs rather than a single tiff. But, due to its particular design, GeoTiffLayerReader doesn't seem to be willing to make friends with our cache.
Both of these concerns could be resolved if the GeoTrellis team provided out-of-the-box support for caching COG metadata. We are going to propose these changes; I will post an update in case of any progress.