Intensely-colored wildflowers covered the Carrizo Plain in the spring of 2017. Photograph ©2017 Steve Corey. cc by-nc-nd 2.0.

A Gentle Introduction to GDAL Part 4: Working with Satellite Data

Robert Simmon
Apr 23, 2017 · 11 min read

Maps are great for plotting a route, finding yourself when you’re lost, exploring a distant land, or discovering relationships between areas, but they’re more-or-less by definition a static, idealized depiction of place. What happens if you want to see the surface of the Earth in near-real time, watch the seasons change, or study a long-term trend?

Launch a satellite! Or, failing that, download some data from someone who has.

A very wet winter brought an incredible blanket of flowers to California in the spring of 2017. These Planet RapidEye images show how the flowers transformed Carrizo Plain National Monument. (Lindsay Hoshaw of KQED found these data in the Planet archive & wrote about the super bloom — thanks!) ©2017 Planet Labs Inc., cc-by-sa 4.0.

The images above, from Planet’s RapidEye constellation, demonstrate one of satellite data’s greatest strengths: observing change. Orbiting robots can monitor the Earth 24/7 (although this depends on the orbit, there are many options), 365.25 days a year. They also have super-powers, like being able to eat sunlight, infra-vision, perfect memory, and virtual time-travel (some weather satellite datasets go back to the early ’60s.) This allows unparalleled study of seasonal and long-term change.

On the flip side, they can also be inscrutable, like an Alan Moore superhero. The data are rarely packaged neatly, and even when they are it can take some work to make a usable image (or set things up for processing by a computer). Which is where GDAL comes in. Again.

A typical workflow to create an image from raw satellite data would be:

  1. Download data.
  2. Re-order or assemble bands into the desired order (red, green, blue; or near-infrared, red, green; etc.)
  3. Increase the resolution with pan-sharpening, if desired.
  4. Contrast-stretch and color-correct the imagery, either algorithmically or by hand.
  5. Restore georeferencing information, if necessary.
  6. Crop, re-project, and re-size image to merge with other data.

Here’s how to do it.

Using gdal_translate to Re-order Bands

Let’s start with the data that went into the March 31 image of the Carrizo Plain. Download a single scene here, or sign up for a Planet Explorer account. (It’s free, and you’ll be able to access PlanetScope and RapidEye data from California. Navigate to Carrizo Plain National Monument (about -119.725, 35.120) and download the RapidEye ortho tile with this ID: and its neighbors.) Now open the file with your TIFF viewer of choice.

Blue, green, red RapidEye image of the Carrizo Plain. ©2017 Planet Labs Inc., cc-by-sa 4.0.

Oh. That doesn’t look right.

Fire up gdalinfo to figure out why.
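Assuming a downloaded tile named 20170401_RapidEye.tif (a hypothetical filename; yours will differ), the check looks like:

```shell
# Print the file's metadata: size, projection, and the list of bands
gdalinfo 20170401_RapidEye.tif
```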

Which gives a block ’o text describing the file, ending with a list of its bands.

There are 5 bands! RapidEye detects light in 5 separate wavelengths: blue, green, red, red-edge, and infrared (full image specs are in the Planet Imagery Product Specification (PDF)). Each is assigned as a band in the image. We can see the first 3 (the additive color primaries!), kinda see red-edge, and decidedly can’t see near infrared. Read Why is that Forest Red and that Cloud Blue? (which I helped create in my previous gig at NASA) to learn more about what these extra wavelengths (and more) are for.

You may have noticed I listed the bands as blue, green, red… That’s because they’re ordered from short (blue) to long (infrared) wavelengths. It’s just a convention in the remote sensing community, although there are exceptions (*cough* Landsat 4, 5, & 7 *cough*). They really shouldn’t be displayed as an RGB image at all, but as 5 separate grayscale channels. To reorder the bands and view the image correctly, run gdal_translate with a few new options:
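As a sketch, with a hypothetical input filename, the command might be:

```shell
# Pull out bands 3, 2, 1 (red, green, blue) into a new GeoTIFF
gdal_translate -b 3 -b 2 -b 1 \
  -co COMPRESS=DEFLATE -co PHOTOMETRIC=RGB \
  20170401_RapidEye.tif rapideye_rgb.tif
```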

Adding the -b option (once for each output band) assigns band 3 to red, band 2 to green, and band 1 to blue. -co COMPRESS=DEFLATE enables an efficient lossless compression scheme to keep the file size from getting outrageous, and -co PHOTOMETRIC=RGB ensures any image viewer will display the bands as red, green, blue (instead of an alpha channel or something).

Unstretched red, green, blue RapidEye image. ©2017 Planet Labs Inc., cc-by-sa 4.0.

Better! (No really, it’s better.) But still dark. Time for gdalinfo again, but this time with the -stats flag to compute image statistics and see why it’s so dark.
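For example (filename hypothetical):

```shell
# -stats computes and prints min/max/mean/standard deviation for each band
gdalinfo -stats rapideye_rgb.tif
```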

Which should spit out a longer report, this time including statistics for each band.

There are a few important things in here: the Type and the min/max values for each band. Type=UInt16 means the data are in unsigned 16-bit integer format: twice as many bits as a normal 8-bit image channel, but 256 times the information. There are 65,536 possible values in each band. This gives a lot of flexibility in image processing at the expense of a little bit of storage. The other values of interest are the min/max for each channel, which are 1,422–41,645 for blue, 2,108–49,122 for green, and 3,482–49,572 for red. That’s not bad, considering the available range in a 16-bit file is 0–65,535. So what’s wrong, and how can the image be made better?

Algorithmic Image Enhancement

Since our eyes sense light proportionally, the data still need to be stretched to compensate, despite filling most of the available range. Like so:
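A sketch of that stretch (filenames hypothetical; the range comes from the statistics above):

```shell
# Stretch 1,422–49,572 to the full 16-bit range,
# then take the square root (exponent 0.5) to brighten shadows
gdal_translate -scale 1422 49572 0 65535 -exponent 0.5 \
  -co COMPRESS=DEFLATE -co PHOTOMETRIC=RGB \
  rapideye_rgb.tif rapideye_rgb_scaled.tif
```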

This does two important things. -scale 1422 49572 0 65535 stretches each band equally from a range of 1,422–49,572 (first pair of numbers) to a range of 0–65,535 (second pair of numbers). I could have scaled each band separately to its own extents (which is an extremely common image processing technique) but that would likely engender a hue shift. Equal scaling leaves things looking more natural. -exponent 0.5 raises each band to the power of ½, the square root. It’s a quick and dirty way to get a preview image.

Histogram stretched and square-root scaled red, green, blue RapidEye image. ©2017 Planet Labs Inc., cc-by-sa 4.0.

As I said, this is quick and dirty, and is going to make every scene look different. Color correction of satellite imagery is a vast topic. I’ve written two guides, one with Landsat 8 and one with PlanetScope data; the process for both works fairly well for RapidEye. Tom Patterson of the National Park Service and Mapbox have both written guides that I recommend, too.

That’s an RGB image, so how is the infrared band used? By convention the longest wavelength (furthest into the infrared) replaces the red band, while the other (shorter wavelength) bands are pushed towards blue. A “standard” false-color infrared image is near-infrared=red, red=green, and green=blue. There are only 3 bands to play with, so blue goes away entirely. Make a false-color image like this:
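One possible version of that one-liner against the original 5-band file (filenames hypothetical, and I’m assuming the same stretch range works for the infrared band):

```shell
# Band 5 (near-infrared) to red, band 3 (red) to green, band 2 (green) to blue,
# reordered, stretched, and square-root scaled in a single step
gdal_translate -b 5 -b 3 -b 2 \
  -scale 1422 49572 0 65535 -exponent 0.5 \
  -co COMPRESS=DEFLATE -co PHOTOMETRIC=RGB \
  20170401_RapidEye.tif rapideye_nrg.tif
```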

Histogram stretched and square-root scaled near-infrared, red, green RapidEye image. ©2017 Planet Labs Inc., cc-by-sa 4.0.

I’ve done the band re-ordering & scaling with one line of code (having run gdalinfo -stats on the original 5-band file to get the range).

Combining Bands with gdal_merge.py

So far so good. These images show the Carrizo Plain in the relatively recent past (at least at the time I’m writing this)—what did the region look like last year, after a merely rainy (not biblically wet) winter?

Digging through the Open California RapidEye archive you’ll find—nothing. Well, not exactly nothing. Clouds. Not everywhere is as cloudy as Borneo, but clouds are still surprisingly common, with about 2/3 of the Earth’s surface covered at any given time. Fortunately, the RapidEye satellites aren’t the only ones in the sky, and Landsat 8 collected a nice, cloud-free scene of the area just over a year before, on March 25, 2016.

There are roughly a dozen different Landsat viewers out there (and you can add Planet Explorer to the list), but the most straightforward for retrieving a single scene might be the AWS Landsat archive (if you want to use Earth Explorer, the official USGS access point, I wrote a guide back when it was the only option). Navigate to LC80420362016085LGN00 to download the data from last spring (learn how to decode the scene IDs here).

Wow. That file list is a mess. Instead of having multiple bands in a single file, each band is stored in its own TIFF, and there’s a bunch of other files with cryptic extensions like “IMD” and “OVR”. Just ignore them, and download the TIFFs for band 2 (blue), band 3 (green), band 4 (red) and band 8 (panchromatic).

Combining the separate bands is just as straightforward as re-ordering bands, but uses gdal_merge.py instead of gdal_translate.
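With the band filenames from the AWS download (the output name is my own choice):

```shell
# -separate stacks each input file as its own band, in the order listed
gdal_merge.py -separate -o landsat_rgb.tif \
  -co PHOTOMETRIC=RGB -co COMPRESS=DEFLATE \
  LC80420362016085LGN00_B4.TIF \
  LC80420362016085LGN00_B3.TIF \
  LC80420362016085LGN00_B2.TIF
```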

The -separate flag followed by the filenames for the bands we want stacks the channels in the order they’re written. The rest of the code I’ve described before: -o names the output file (be careful: it will overwrite existing files!), and the creation options -co PHOTOMETRIC=RGB and -co COMPRESS=DEFLATE make sure TIFF readers interpret the colors correctly and that the file is as small as possible without throwing away any data.

By the way, GDAL is very flexible about the order of command options; it usually doesn’t matter. I try to be consistent just to keep myself sane, but don’t always manage it (either consistency or sanity). The order of the filenames after -separate does matter, however. For most conventional band combinations, make sure longer wavelengths are listed first.

Here’s the output file after color correction:

True-color Landsat 8 image collected on March 25, 2016. Data courtesy NASA/USGS Landsat.

Landsat scenes cover a much wider area than a RapidEye tile. You can see all the way from Vandenberg Air Force Base (where Landsat 8 was launched!) to Oxnard, California. But Landsat has a lower resolution—30 meters per pixel instead of 5 meters per pixel. This is a common tradeoff in satellite imaging: area covered vs. resolution. Landsat 8 and several other satellites (the SPOT series, Digital Globe’s constellation) carry a special band, the panchromatic band, to get extra resolution from their sensors.

Pan sharpening essentially swaps the luminance information in a low resolution full-color image with a high resolution grayscale image. Since our eyes are more sensitive to changes in luminance than changes in color, a pan-sharpened image is almost as good as a full resolution multi-spectral (the fancy term for color) image. Here’s how you apply pan sharpening with gdal_pansharpen.py, available in GDAL 2.1 and later (the writing of this tutorial was in no way affected by my UNIX environment’s insistence that python try to import GDAL 1.11):
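A sketch, using the band 8 file from the AWS download and the merged RGB file (the output name is my own choice):

```shell
# Arguments in order: panchromatic band, multispectral image, output file
gdal_pansharpen.py LC80420362016085LGN00_B8.TIF \
  landsat_rgb.tif landsat_pan.tif
```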

Like most GDAL commands, it may be intimidating at first but is fairly straightforward once you get the hang of it. The script requires the input and output files to be in a specific order: panchromatic band, followed by the multispectral image, ending with the output filename.

Pan-sharpening a Landsat image bumps the resolution from 30 meters per pixel (left) to 15 meters per pixel (right). Images based on data courtesy NASA/USGS Landsat.

It’s also possible to generate a pan-sharpened image from multiple separate bands, instead of a pre-combined file, in which case the command would look like:
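Something like this (band filenames follow the AWS naming; the output name is hypothetical):

```shell
# Pan band first, then each multispectral band in display order, then the output
gdal_pansharpen.py LC80420362016085LGN00_B8.TIF \
  LC80420362016085LGN00_B4.TIF \
  LC80420362016085LGN00_B3.TIF \
  LC80420362016085LGN00_B2.TIF \
  landsat_pan.tif
```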

Longer to type, but ultimately fewer steps.

Unlike many other GDAL scripts, gdal_pansharpen.py uses a smoothing resampling algorithm by default (cubic), but I often pick a different algorithm with the -r option instead, to create images suitable for additional custom sharpening.

Pan sharpening works best when the panchromatic band spans similar wavelengths to the multispectral bands. Some older satellites like Landsat 7 and Ikonos had a lot of infrared in the panchromatic band, so they always looked a little funny. If you’re working with these data try out the somewhat esoteric -b (bands) and -w (weights) options. Landsat 8 and most high-resolution satellites have panchromatic data purely in the visible spectrum, and look great out of the box.

Now that I have images from similar dates in two different years, how do I compare them? More GDAL magic, of course. That’s what you’re here for, right?

The Landsat scene and RapidEye tile differ in both extent (the area covered by each image) and resolution, and both need to be matched. I’ll start with the extents, using gdaltindex to create a shapefile, a vector file format that stores, in this case, the corner points of the RapidEye image.
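For example (filenames hypothetical):

```shell
# Write the RapidEye tile's footprint into a new shapefile
gdaltindex carrizo_boundary.shp rapideye_rgb_scaled.tif
```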

Once there’s a shapefile that defines the desired footprint, use gdalwarp with the -cutline and -crop_to_cutline options to crop the pan-sharpened Landsat file, and -tr (target resolution, in the units defined by the coordinate system of the file; meters in the Universal Transverse Mercator projection used by both Landsat and RapidEye) to resize it to match RapidEye’s resolution:
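A sketch with hypothetical filenames:

```shell
# Crop to the shapefile's footprint and resample to 5-meter pixels
gdalwarp -cutline carrizo_boundary.shp -crop_to_cutline \
  -tr 5 5 landsat_pan.tif landsat_crop.tif
```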

This creates a perfectly matched pair of images.

Although there were some flowers in the spring of 2016, they can’t hold a candle to 2017’s bumper crop. Images based on data courtesy NASA/USGS Landsat (left) and ©2017 Planet Labs Inc., cc-by-sa 4.0.

Last but not least, here’s a trick to add georeferencing information to a file that doesn’t have any. This is especially useful if you process images in Photoshop (without Avenza’s Geographic Imager, which I generally like) or other photo editors that don’t natively support GeoTIFF. Just remember to keep the original image size. To strip the headers, open one of the GeoTIFFs in Photoshop or another image viewer and re-save it. (Or download this color-corrected copy of the March 25, 2016, Landsat scene.)

First, you’ll need to install a helper python script that’s not included in many (most?) default GDAL installations: gdalcopyproj.py. The script is hosted on GitHub, and should be downloaded into the directory where your command line environment stores programs. To make it run, navigate to that directory and mark the script executable.

To restore georeferencing run gdalcopyproj.py (make sure to use the right file names if you created your own TIFF):
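Something like this, with hypothetical filenames (georeferenced source first, stripped target second):

```shell
# Copy projection and corner coordinates from the original GeoTIFF
# into the file re-saved by Photoshop
python gdalcopyproj.py landsat_original.tif landsat_corrected-nogeo.tif
```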

This copies the georeferencing information from the original GeoTIFF into the stripped file. Confirm with gdalinfo, which should print a bunch of text listing the coordinate system, etc., similar to what I showed way back in part 1. Maybe rename the file from -nogeo.tif to -geo.tif, just to keep track.

Hopefully this is enough info for you to get started working with satellite imagery. There’s a large (and growing!) collection of data from free and commercial sources, of ever-increasing accessibility and quality.

My next post will show some examples of using GDAL to translate and crop vector data with ogr2ogr, along with some data processing techniques.

Planet Stories
