Tapalcatl 2 Updates

Nov 14, 2018

As I’ve gotten deeper into prototyping, I’ve learned and realized a few things since my previous post proposing Tapalcatl 2.

First, it totally works. An AWS Lambda server with a block size of 256KB does what you’d expect and is fast enough for me to be happy (it can run faster with more memory allocated, or outside Lambda, or with a shared cache of central directories).

Second, archive creation is a total pain. Once I can transcode existing sources (soon), I’ll be able to do a better job of demonstrating how it works. There’s a proof-of-concept renderer that’s part of a Marblecutter-powered project I’ve been working on recently.

I’ve refined a few concepts since last time in response to conditions on the ground.

Metatiles and Materialized Zooms

Previously, metatile corresponded to the number of tiles in an archive at the base (highest zoom) of a sub-pyramid and dictated the number of overview levels that would be present in an archive. This had the side-effect of dictating which zoom levels archives would be generated for: the top (lowest zoom) of each sub-pyramid.

As I worked on the Tapalcatl 2 Storage Calculator, I realized that this complicated a couple things: the top of the archive’s pyramid (0/0/0) wouldn’t match that sub-pyramid’s expected zoom (which might be negative) and it forced all variants / formats contained to have the same max zoom, lest things become inconsistent.

In an effort to be more explicit, I replaced metatile as an indirect structural constraint with the notion of materialized zooms, which can be used to look up archives directly.

Simply put, materialized zooms are the zoom levels at which one can expect to find archives. With a materialized zoom list of [0, 4], one can expect to find sub-pyramids at each tile coordinate at zoom 0 and zoom 4: 0/0/0, 4/0/0, 4/1/0, 4/1/1, 4/0/1, and so on for every 4/{x}/{y} (1 file for zoom 0, 256 for zoom 4). 0/0/0 will contain tiles for zooms 0–3. Each of the zoom 4 archives will contain tiles for zooms up to the maximum zoom available for each format / variant.
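To make the lookup concrete, here’s a minimal Python sketch (not taken from the spec or any of the implementations). It assumes the archive holding a tile sits at the highest materialized zoom that doesn’t exceed the tile’s zoom, and that its coordinates are the tile’s ancestor at that zoom. The function name archive_for_tile and its arguments are made up for illustration, and metatiles are ignored.

```python
from typing import Sequence, Tuple


def archive_for_tile(
    z: int, x: int, y: int, materialized_zooms: Sequence[int]
) -> Tuple[int, int, int]:
    """Return the (zoom, x, y) of the archive expected to hold tile z/x/y.

    Assumption: the archive lives at the highest materialized zoom <= z.
    """
    candidates = [mz for mz in materialized_zooms if mz <= z]
    if not candidates:
        raise ValueError(f"zoom {z} is below the lowest materialized zoom")

    mz = max(candidates)
    # Each zoom level halves the coordinate, so shifting right by the zoom
    # difference yields the ancestor tile at the materialized zoom.
    shift = z - mz
    return mz, x >> shift, y >> shift
```

With a materialized zoom list of [0, 4], archive_for_tile(3, 5, 2, [0, 4]) lands in 0/0/0, while archive_for_tile(6, 5, 7, [0, 4]) lands in 4/1/1.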

Since they’re explicitly specified, this means that materialized zooms don’t need to have a uniform depth between them; the number of intermediate zooms in archives can vary according to data density.

Freeing metatiles from influencing the zooms at which individual archives are generated allows us to re-use the metatile concept in a slightly different way, as a knob that can further control the object count / size balance.

Instead, let us define metatile as the number of tiles on a side at the top of the sub-pyramid. Thus, rather than producing 256 files at zoom 4 (see above), this number can drop to 16 with a metatile size of 4. Each archive will contain 16 (4 × 4) sub-pyramids, increasing the size of individual zips while decreasing the overall object count.
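The arithmetic behind that trade-off can be sketched in a few lines of Python. The function names are hypothetical, and clamping the metatile to the number of tiles available at low zooms is my assumption for the sketch rather than something defined here.

```python
def archive_count(zoom: int, metatile: int = 1) -> int:
    """Number of archives at a materialized zoom for a given metatile size."""
    if metatile < 1:
        raise ValueError("metatile must be at least 1")

    tiles_per_side = 2 ** zoom
    # Assumption: a metatile can't span more tiles than exist at this zoom.
    effective = min(metatile, tiles_per_side)
    if tiles_per_side % effective != 0:
        raise ValueError("metatile must evenly divide the tiles on a side")

    archives_per_side = tiles_per_side // effective
    return archives_per_side ** 2


def sub_pyramids_per_archive(metatile: int) -> int:
    """Number of sub-pyramids bundled into one archive (metatile x metatile)."""
    return metatile ** 2
```

At zoom 4, archive_count(4) gives 256 and archive_count(4, 4) gives 16, with sub_pyramids_per_archive(4) being 16 in the latter case.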

(Tapalcatl 1’s concept of metatiles now matches and can be assumed to materialize all zooms. The 2 remaining differences are how scales are handled and tiles named within each ZIP.)

The beginnings of a spec, on GitHub!

In the interest of creating a living document to refer to (vs. my rambling prose), I’ve created https://github.com/mojodna/tapalcatl-2-spec.

If you’d like to provide specific feedback, it’s probably easiest to comment directly on the commit responsible for the bulk of the first draft.

Implementations

I’ve started to flesh out the list of implementations:

Proof-of-concept Python renderer: https://github.com/mojodna/marblecutter-land-cover/blob/master/landcover/tools/render.py
JavaScript library (currently read-only, with block-level caching made possible by yauzl): https://github.com/mojodna/tapalcatl-js (also available on npm as tapalcatl)
tilelive module: https://github.com/mojodna/tilelive-tapalcatl (also available on npm as tilelive-tapalcatl and suitable for use with tl and tessera)
A stripped down tilelive-based server for use with AWS Lambda: https://github.com/mojodna/lambda-tileserver

I’m actively working on write support for tapalcatl-js so that I can use it with tl to transcode MBTiles archives (and other tilelive-compatible sources) and put them online for experimentation.

Future work

Multiple variants

The project I’ve been working on that’s driven the Python implementation only includes a single variant (but multiple formats and scales), so I’ve only done preliminary sketches on how that affects metadata. You can see those in Proposed Tapalcatl 2 Metadata.

Vector tiles

Another open question I’ve talked about periodically with people (including folks behind the original Tapalcatl implementation) is how vector tiles relate to the notion of scale.

<vector>@2x differs from <vector> in the level of curated detail included (particularly accounting for feature detail vs. label sizing), but does that mean it should be used to render <zoom> + 1 (covering 256×256 points at whatever scale) or <zoom> at twice the resolution?

My inclination is the latter, but conversations suggest that I’m almost certainly over-simplifying.

Hashes and header overrides

These are things I haven’t encountered a need for yet, so they remain in the category of open questions. Hashes continue to feel straightforward: the first 5 characters of MD5("{z}/{x}/{y}") (see any issues with that?). Header overrides could be a JSON object (or a list, to handle repeated headers, cf. Server-Timing?), written as file comments in the ZIP’s central directory.
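Since both ideas are still open questions, the following is only a sketch of how they might look in Python using the standard library. It assumes the hash is taken from the hex digest of the substituted "{z}/{x}/{y}" string, and that header overrides are a JSON object stored as the per-file comment. The function names and the entry name 4/1/1.png are hypothetical.

```python
import hashlib
import io
import json
import zipfile


def hash_prefix(z: int, x: int, y: int, length: int = 5) -> str:
    """First `length` hex characters of MD5("{z}/{x}/{y}")."""
    key = f"{z}/{x}/{y}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()[:length]


def write_tile(archive: zipfile.ZipFile, name: str, data: bytes, headers: dict) -> None:
    """Write a tile whose header overrides are stored as its file comment."""
    info = zipfile.ZipInfo(name)
    # Per-file comments are stored in the ZIP's central directory.
    info.comment = json.dumps(headers).encode("utf-8")
    archive.writestr(info, data)


def read_headers(archive: zipfile.ZipFile, name: str) -> dict:
    """Read header overrides back from a tile's file comment, if any."""
    comment = archive.getinfo(name).comment
    return json.loads(comment) if comment else {}


# Round trip through an in-memory archive (placeholder tile data).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    write_tile(zf, "4/1/1.png", b"not a real PNG", {"Content-Type": "image/png"})
with zipfile.ZipFile(buf) as zf:
    overrides = read_headers(zf, "4/1/1.png")
```

A list of name/value pairs would serialize the same way with json.dumps if repeated headers turn out to matter.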

Service Worker support

Reading Tapalcatl 2 Archives with Fetch demonstrates (or demonstrated; bundle.run seems to be having problems with some of its dependencies) that partial reads of remote Tapalcatl 2 archives are feasible. The next step is to ensure that tapalcatl-js can run in a browser (presumably with webpack) and wire it up as a Service Worker.
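For a sense of what a partial read involves, here’s a rough Python sketch (not the tapalcatl-js or browser code) of the first step: given only the final block of a ZIP, find the end-of-central-directory record and work out which byte range holds the central directory. It ignores ZIP64 and assumes the archive comment doesn’t happen to contain the record’s signature. The function names are made up for illustration.

```python
import struct
from typing import Tuple

EOCD_SIGNATURE = b"PK\x05\x06"
# Fixed part of the end-of-central-directory record: signature, disk numbers,
# entry counts, central directory size and offset, comment length (22 bytes).
EOCD_FORMAT = "<4sHHHHIIH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)


def locate_central_directory(tail: bytes) -> Tuple[int, int, int]:
    """Return (offset, size, entry_count) of the central directory.

    `tail` is the last block of the archive, e.g. the final 256KB.
    """
    pos = tail.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + EOCD_SIZE > len(tail):
        raise ValueError("end-of-central-directory record not found")

    fields = struct.unpack_from(EOCD_FORMAT, tail, pos)
    total_entries, cd_size, cd_offset = fields[4], fields[5], fields[6]
    return cd_offset, cd_size, total_entries


def range_header(offset: int, size: int) -> str:
    """HTTP Range header value covering `size` bytes starting at `offset`."""
    return f"bytes={offset}-{offset + size - 1}"
```

A reader would fetch the tail of the remote file, call locate_central_directory on it, and then request range_header(offset, size) if the central directory wasn’t already inside that block.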

OSMesa support

One of OSMesa’s (err, FireHOSM) capabilities is to render vector tiles (and derivatives) from OSM data at a global scale. I find myself shying away from doing this whenever it comes up, knowing that writing to S3 (or worse, deleting tiles when I discover that the data is wrong) will take a disproportionate amount of time.

Being able to produce the same coverage with fewer objects (especially when tiles remain individually addressable using lambda-tileserver) will increase the likelihood (along with time and budget) that these artifacts become more generally available, enabling something like a global version of Explorable Detroit.

fin.

If any of this is particularly interesting or useful to you, please let me know, either in a comment or by contacting me directly.