Making Sense of Satellite Data, An Open Source Workflow: Stitching Data with QGIS

Not all satellite data is formatted like Landsat and Sentinel-2—many commercial sources provide data in a single file with multiple bands, instead of the one band per file. One example is Planet’s high-resolution SkySat data. At 80 cm per pixel SkySats show details far sharper than existing non-commercial data, so it’s fun to work with.

Boston in the summer. SkySat data collected on July 30, 2017. Image ©2018 Planet Labs, Inc. CC-by SA 4.0.

Planet’s provided some sample data if you fill out a contact form (don’t worry, we don’t bite) so go ahead and grab some — I chose Boston — the site of OpenVisConf from 2013–17, but take your pick. If you don’t want to sign up you can grab an handful of tiles of NAIP data from Earth Explorer.

Got your data? Good. Let’s get started.

Open QGIS (I provided some installation advice in Part 2) and add a new raster layer: Layer → New Layer → Add Raster Layer… Click the teensy-tiny  button to open a system open file dialog, navigate to your data (It’ll be in skysat-high-resolution-sample-cities/pansharpened if you downloaded the Boston imagery. There should be 13 separate TIFFs.) and select all the files that end in pansharp.tif. Finally, click Add.

This will load the data into a QGIS project, with everything positioned properly. [QGIS is a geographic information system (just like it says on the tin) so the files aren’t just placed relative to one another, but also relative to the world—so you could combine different types of data together if you wanted.] Like so:

SkySat

Huh. Probably not what you were expecting. There’s two things going on here.

  1. Each scene (SkySat data is served as individual scenes that together make up a collect) is contrast-stretched separately by QGIS. That means the darkest and lightest pixels in each separate chunk of data are displayed as black and white, instead of the darkest and lightest pixels out of the entire collect. This gives a patchwork appearance.
  2. The bands are in the wrong order. By convention satellite data is ordered from shortest wavelength to longest; but image files are ordered red, green, blue.

I suggest combining the separate scenes first, which makes consistently stretching the data easier. Select Raster → Miscellaneous → Merge which brings up yet another dialog box:

With the default settings the only thing you’ll have to change is the Output data type which you should set to from Float 32 to UInt16. Like Sentinel-2 and Landsat 8, SkySat data is 16 bits per pixel, and no negative values (what’s a negative brightness?), so is stored as unsigned 16-bit data. (An example of a signed dataset would be a digital elevation model, which can have negative values for regions below sea level.) Make sure Place each input file into a separate band remains unchecked. Again, there’s a teensy tiny  button next to Input Layers click it then press Select all and OK to (tautologically) select all the scenes. Finish with Run in Background.

When QGIS finishes processing the Close the dialog box and you should see the merged data—now seamless—in your project window. You may have to hide (uncheck) the individual scenes to see the combined collect.

Getting there. Right click on Merged, select Properties and then Style to bring up the Layer Properties Style dialog, which should look familiar if you read Part 2.

This flavor of SkySat data isn’t calibrated, so you can’t simply scale between known quantities like the last time. Instead use QGIS’s auto-stretching function. Expand Min / max value settings to reveal the options for custom stretches. Cumulative count cut will allow you to choose how the darkest and brightest data points relate to black and white in the rendered image. The default values 2% and 98% will make the darkest two percent of the pixels black and the brightest two percent of pixels white, which throws out a lot of data. I usually pick something like 0.1% and 99.9%. Or even just select the Min / max option. For good measure set the Accuracy to Actual (slower) since modern computers are pretty fast.

Since you’re going to continue to process the data outside of QGIS it’s best if the black and white values for the three channels are matched, so set the black to the minimum value of the three, and white to the maximum value. Finally, we need to flip the bands—set the Red band from Band 1 to Band 3, leave the Green band as Band 2, and set the Blue band from Band 3 to Band 1. Ignore band 4—which is near-infrared data—for now. Depending on the exact settings you choose, you’ll see something like this:

Like with Sentinel data, right-click on Merged (or go to Layer → Save as) and then set Output mode to Rendered image (this exports the scaled image displayed in QGIS, not the original 16-bit data), Format to GeoTIFF (which should be the default), and provide a file name.

The exported file is now ready to import into GIMP, Photoshop, or the photo-editor of your choice.

Before moving on to image processing, I’ll demonstrate how to make a false-color image—substituting near-infrared data for the red channel, bumping red to green, and green to blue. (By convention and following color infrared photography the longest wavelength in a false-color image is red, middle wavelenght is green and shortest is blue.) In the Style dialog set Red to Band 4, Green to Band 3, and Blue to Band 2 which looks like this:

Weird (trees are bright red) but useful if you’re studying vegetation or water.

Alrighty then—now you should be ready to put the finishing touches on your data, which I’ll describe in part 4, Color Correction with GIMP.

Making Sense of Satellite Data: An Open Source Workflow