Glaciers of the Everest region. NASA Earth Observatory image by Robert Simmon, using data from ASTER.

A Gentle Introduction to GDAL, Part 3: Geodesy & Local Map Projections

(If you’re new to GDAL, you might want to start with Part 1, which covers installation, gdalinfo, &gdal_translate, or Part 2 Map projections & gdalwarp.)

In the spring of 1802, the British East India Company embarked on a detailed, 5-year-long survey of the Indian Subcontinent. The plan was to measure the arc of the meridian (length along a line of longitude—in this case 78˚ east) from Cape Comorin (the southern tip of India) to the Himalaya.

45 years and 3 directors later the Great Trigonometric Survey (as the endeavor became known) reached the glacier-capped mountains. About a decade after that, in 1956, Andrew Scott Waugh declared Peak XV (known by locals as Chomolungma and the rest of the world as Mount Everest) the world’s highest. And in 1870 the first Index Chart to the Great Trigonometrical Survey of India was published.

1882 edition of the Index Chart to the Great Trigonometrical Survey of India, printed in 1885. Map courtesy Mountains of Central Asia Digital Dataset.

The Survey’s primary objective wasn’t to measure the altitude of the world’s highest peaks (although they more-or-less succeeded at that) or make a map of India (ditto)—but to measure the size and shape of the Earth itself. It turns out that not only is the Earth not flat, but it’s not exactly round, either. It’s what’s technically (and charmingly) known as an oblate spheroida ball with a bulge in the middle.

A map of the triangulations used to determine the height of Mount Everest and other Himalayan peaks. From “On Mounts Everest and Deodanga, Proceedings of the Royal Geographical Society of London, Vol. 2, №2 (1857–1858), pp. 102–115”.

By taking precise measurements along one very long north-south line, and comparing that length to equally precise east-west measurements, it’s possible to determine the shape of that squashed ball, a mathematical surface called a reference ellipsoid. And, in fact, the ellipsoid obtained by the Survey’s second director—George Everest—remains in use in South Asia and is even incorporated in a handful of EPSG codes.

The main cause of the imperfect shape of the Earth is simply rotation—it stretches a bit along the equator. This asymmetry is handled perfectly well by an ellipsoid—but there are also local variations, caused by differences in density and thickness of the Earth’s crust, and the water and ice piled on top of it. These variations change the overall shape of the Earth’s gravity field. A geoid accounts for these local variations in shape. (The geoid is sometimes described as mean sea level, but there are subtle differences which I don’t quite understand.) The gravitational discrepancies are generally quite subtle, but still large enough to measurably deflect a weight hung on the end of a string. This is why the Great Trigonometric Survey took so many decades—to be useful, the measurements needed to be incredibly precise.

It’s worth pointing out that shape in this context is the theoretical shape of the Earth if it was entirely covered in water—differences in elevation between mountains and valleys, plateaus and canyons, are defined relative to this surface.

Chimborazo Volcano, Ecuador. If surface elevation was defined relative to the center of the Earth rather than sea level, Chimborazo, which lies near the equator, would win by (more than) a mile. Drawing courtesy David Rumsey Historical Map Collection.

A geodetic datum is used to link a reference ellipsoid with precise coordinates for latitude and longitude, and there is a bewildering array of them ranging from continental (NAD27) to local (Old Hawaiian Datum). Each is optimized to minimize errors—which can be several hundred meters—for particular locations.

Prior to the Space Age datums were defined in isolation, and could only be joined by carefully matching the measurements from adjacent surveys—which ranged from extremely difficult to downright impossible. For example, traditional surveying techniques rely on line of site—if you can’t see it, you can’t plot it. This means that the coordinate systems for far-flung islands, or separate continents, could never be precisely aligned. During the Cold War the U.S. Air Force bounced radio waves off planes to link far-flung datums, but these efforts were soon supplanted by satellites. Many modern geospatial datasets are defined in terms of the World Geodetic System 84 (WGS84), which you’ve seen any time you’ve used a GPS.

By now you may be wondering why I’ve gone into a long and somewhat technical digression — I certainly was frustrated by the extended discussion of such things during my (extremely limited) formal training in GIS. It turns out that the differences between ellipsoids, geoids, and datums don’t matter so much at a global scale—but they do matter if you’re trying to drill an oil well, survey a property line, or drop a missile on someone’s head from the other side of the world. (A discouraging number of advances in cartography were developed for the military.) A working understanding of these concepts will help you understand some of the more esoteric aspects of manipulating maps and satellite data.

Crafting Local Maps

There are two fundamental ways to make a map of a specific region: start with a large dataset and cut out the bit you’re interested in, or build the map from smaller pieces. I’ll start with a large map, the high-res version of the Natural Earth raster dataset I used in part 2. If you’re not up for the 300MB download, just use the smaller one and change the file names to match.)

Transverse Mercator map of India. Made with Natural Earth.

I could just use gdal_translate with -projwin option to crop out a latitude and longitude, but, as I’ve discussed, cylindrical equirectangular isn’t a great projection for global maps, and it’s worse for most local and regional maps. Which begs the question—how to choose a better one?

Hacks borrow, artists steal.

If you’re making a map of a place that’s already been mapped by experts, just look up what they did and copy it! In this case, I just found a National Geographic wall map of India and read the projection off the description: Transverse Mercator. Professional cartographers will almost universally include projection and scale info on their maps. They also overlay maps with graticules (lines of latitude and longitude) so I estimated the proper extents from those. Here’s the code:

gdalwarp -t_srs '+proj=tmerc +lat_0=0 +lon_0=84 +k=0.9996 +datum=WGS84 +units=m +no_defs ' -te 66 6 100 41 -te_srs EPSG:4326 -ts 1400 0 -r bilinear NE1_HR_LC_SR_W.tif india_tmerc.tif

Most of this should look familiar. I’ve written out the target spatial reference system, -t_srs, since I couldn’t find a ready-made EPSG. Transverse Mercator is specified by +proj=tmerc followed by a few more variables:

+lat_0=0 +lon_0=84 +k=0.9996 +datum=WGS84 +units=m +no_defs

+lat_0=0 specifies the latitude of the origin (the Equator), +lon_0=84 specifies the longitude of origin—the only meridian that will be vertical (84˚). +k=0.9996 is a scaling factor that helps spread the distortion across the map. Set to 1, There’s no distortion along the central meridian, but distortion increases out towards the edges. With a value less than 1, there are two meridians on either side of the center with no distortion, and a little bit of distortion at the center and a little bit at the edges, but limits total distortion. It’s a compromise. I’ve defined the datum as WGS84 with +datum=WGS84, which matches the source dataset and prevents having to worry about datum switching, which can be painful. Finally +units=m defines the units (Transverse Mercator is a projected coordinate system so it needs linear units) and +no_defs prevents proj.4 from using defaults settings, which could theoretically cause problems.

The other interesting part of the command is this:

-te 66 6 100 41 -te_srs EPSG:4326

This sets the target extent in units defined by the spatial reference system specified with -te_srs, our good friend EPSG:4326, or simple latitude and longitude. (This is one of the new features in GDAL 2.) If you set -te without -te_srs you need to figure out what the boundaries of your map should be in the target SRS, which is often meters that are specific to an origin point that can be mysterious. This is often hard (for me, at least), so I find it easier to define my boundaries in latitude and longitude. The final few commands I’ve shown before:

ts 1400 0 -r bilinear NE1_HR_LC_SR_W.tif india_tmerc.tif

These set the target size in pixels (by setting height to 0 GDAL will automatically figure out how tall the map should be), resampling method to bilinear in the names of the input file and output file.

But what happens if you’re making a unique map, not of something that’s been done a million times before like a country or a province? How do you pick a map projection then? With the fantastic Projection Wizard, by Bojan Šavrič.

The Projection Wizard interface.

All you have to do is drag a box around the area of interest, select the type of map you want; equal-area, conformal, or equidistant; then hit the “PROJ.4” link for the projection you want (there are options), which will spit out a text string to drop in -t_srs. So good.

+proj=eqdc +lat_1=38.02777777777778 +lat_2=38.47222222222222 +lon_0=-109.875

My only quibble is that the interface uses degrees, minutes, seconds, and GDAL uses decimal degrees. You may want to trim those to round numbers, and likewise for the extents. Like so:

gdalwarp -t_srs '+proj=eqdc +lat_1=38.025 +lat_2=38.470 +lon_0=-109.875' -te -110.5 37.75 -109.25 38.75 -te_srs EPSG:4326 -ts 1400 0 -r bilinear NE1_HR_LC_SR_W.tif NE1_HR_LC_SR_W_canyonlands_eqdc_1400.tif

Oops. Not very sharp, is it?

A map of the Canyonlands that doesn’t have remotely enough resolution. Made with Natural Earth.

It’s not that useful to make a large-scale (local) map with a small-scale (global) dataset. Looks like I need another data source. Conveniently, there’s a high-res (1:100,000-scale) version of Natural Earth covering the United States. Not so conveniently, it’s really, really big—4.72 GB big, to be precise. To save everyone a long download I’ve cropped it to the Canyonlands and chopped it into pieces (download here), which, not coincidentally, also allows me to demonstrate gdal_merge.py.

gdal_merge is a helper utility written in Python—it’s not a core part of GDAL. In fact, Frank says he wrote it as a demo (which is why the feature set may seem limited and syntax inconsistent) but people found it useful so it stuck around. And yes, it is useful. Unzip the download, navigate to the natual_earth_100k_canyonlands directory in your command line, and run the following command:

gdal_merge.py -o canyonlands_merged.tif *.tif

That’s it. gdal_merge.py invokes the script, -o canyonlands_merged.tif specifies the name of the output file, and *.tif is a wildcard that opens up every file in the directory ending with .tif. What gdal_merge doesn’t do is any type of reprojection (but it can crop with -ul_lr and resize with -ps ). So the output file is Web Mercator, just like the input files. Equidistant conic is more appropriate for this region, so use the same gdalwarp command as before (with one change) to reproject the data:

gdalwarp -t_srs '+proj=eqdc +lat_1=38.025 +lat_2=38.470 +lon_0=-109.875' -ts 1400 0 -r bilinear -dstalpha canyonlands_merged.tif NE1_HR_LC_SR_W_canyonlands_ne_eqdc_1400.tif
Map of the southeastern Utah, from the 100-meter resolution Natural Earth of the United States, equidistant conic projection.

I’ve omitted the te and te_srs options—gdalwarp is smart and matches the extents of the output file to the input file, which can be convenient, and results in the subtle cuve along the edges of the map above. To match the boundary of the original (blurry) map, use:

gdalwarp -t_srs '+proj=eqdc +lat_1=38.025 +lat_2=38.470 +lon_0=-109.875' -te -110.5 37.75 -109.25 38.75 -te_srs EPSG:4326 -ts 1400 0 -r bilinear canyonlands_merged.tif canyonlands_ne_eqdc_te_1400.tif
Map of the Canyonlands, from the 100-meter resolution Natural Earth of the United States, equidistant conic projection.

Looking carefully, it’s evident that even this map isn’t quite detailed enough to display at this size—it needs a slightly higher-resoluton data source. In the not-so-distant past, you’d probably be limited to 7.5 minute USGS topographic maps (or their international equivalents), or custom made maps for specific locations (like U.S. national park maps). But in the past decade or so there’s been an explosion of mapping on the web, both commercial (Google Maps, MapBox) and open-source (Open Street Map). Typically they’re limited to display in a browser, or on a mobile device (that’s what these maps are made for, after all).

Georeferencing Web Tiles

The maps you see on the web are composed of many different tiles, each 256- by 256-pixels, scaled to fit the display size. They’re not individually georeferenced, but each fits into a system that allows their relative scale and placement to be derived.

A typical web-tiling scheme, from zoom level 0 to 4. Terrain tiles ©Stamen Design, cc by s.a. 3.0.

The most common web mapping tiling scheme consists of a a zoom level, an x coordinate, and a y coordinate. Zoom levels typically range from 0 (156,412 meters per pixel, global) to about 19 (about 0.3 meters per pixel—as detailed as the highest-resolution unclassified satellite data). At higher zoom levels, the map is subdivided into two equal vertical and horizontal slices. Every slice is given an x and y coordinate, starting with 0 in the upper-left-hand corner. Mapbox has an excellent and much more detailed description of how web maps work.

This consistent scheme provides a mechanism for GDAL to decode, and the ability to convert web tiles into a georeferenced file. The code to generate a map of the Canyonlands is deceptively simple:

gdal_translate -projwin -110.75 39 -109 37.5 -projwin_srs EPSG:4326 -outsize 4096 0 frmt_wms_stamen_terrain_tms.xml canyonlands_terrain_4096.tif

It looks just like a normal use of gdal_translate, but instead of pointing to an image, it’s pointing to an XML file:

<GDAL_WMS>
<Service name="TMS">
<ServerUrl>http://b.tile.stamen.com/terrain-background/${z}/${x}/${y}.jpg</ServerUrl>
</Service>
<DataWindow>
<UpperLeftX>-20037508.34</UpperLeftX>
<UpperLeftY>20037508.34</UpperLeftY>
<LowerRightX>20037508.34</LowerRightX>
<LowerRightY>-20037508.34</LowerRightY>
<TileLevel>18</TileLevel>
<TileCountX>1</TileCountX>
<TileCountY>1</TileCountY>
<YOrigin>top</YOrigin>
</DataWindow>
<Projection>EPSG:3857</Projection>
<BlockSizeX>256</BlockSizeX>
<BlockSizeY>256</BlockSizeY>
<BandsCount>3</BandsCount>
<ZeroBlockHttpCodes>302</ZeroBlockHttpCodes>
<Cache />
</GDAL_WMS>

A full description of the process is in the GDAL Web Map Services documentation, but the basic idea is to open up the XML file in a text editor and point the <ServerUrl> line at the map server hosting the tiles you want to download, stitch, and georeference. (I also added a little buffer around the edges with -projwin -110.75 39 -109 37.5 so my subsequent reprojection step wouldn’t be cut off at the edges.)

There are several examples using different types of map servers in the documentation, but if you are viewing a slippy map you can often right-click (command-click on a mac) on the map and it will give you the option to open the image in a new tab or window — that will give you the URL to paste into <ServerUrl>. Replace /terrain/ with /watercolor/ to generate stylized maps. You can access Open Street Map tiles by changing the URL to http://tile.openstreetmap.org/ and file format from .jpg to .png:

<ServerUrl>http://tile.openstreetmap.org/${z}/${x}/${y}.png</ServerUrl>

BTW—Stamen’s map tiles are built on Open Street Map and licensed under creative commons—make sure you follow any licensing requirements for the data you use, and please don’t abuse your access by downloading tens of thousands of tiles.

The final step is to convert the map from Web Mercator to equidistant conic with what should be familiar gdalwarp command (although I admit to need to have the documentation open more often then not to make sure I get the syntax right):

gdalwarp -t_srs '+proj=eqdc +lat_1=38.025 +lat_2=38.470 +lon_0=-109.875' -te -110.5 37.75 -109.25 38.75 -te_srs EPSG:4326 -ts 1400 0 -r bilinear canyonlands_terrain_4096.tif canyonlands_terrain_eqdc_1400.tif
Map of the Canyonlands, using terrain tiles ©Stamen Design, cc by s.a. 3.0. Notice the presence of fine details missing in the Natural Earth map above. In fact, one could argue there is an overwhelming amount of detail. One important part of making maps is finding the balance between precision and readability.

With these tools the wide variety of government and open-source data available, I hope you’ll be able to get started making your own maps. But what if you want to go beyond idealized base maps, and explore a photo-realistic view of the world, or show change over time? That will be the topic of my next post—processing satellite data, including Planet, Landsat, and Sentinel, with GDAL.

Further Reading

  1. A Gentle Introduction to GDAL
  2. Map Projections & gdalwarp
  3. Geodesy & Local Map Projections (you are here)

Special thanks to Frank Warmerdam and Tim Schaub for proofreading and technical support.