Published in The DownLinQ

The SpaceNet 5 Baseline — Part 1: Imagery and Label Preparation

Preface: SpaceNet LLC is a nonprofit organization dedicated to accelerating open source, artificial intelligence applied research for geospatial applications, specifically foundational mapping (i.e. building footprint & road network detection). SpaceNet is run in collaboration with CosmiQ Works, Maxar Technologies, Intel AI, Amazon Web Services (AWS), Capella Space, and Topcoder.

There is still plenty of time to get involved with the SpaceNet 5 Challenge, which seeks to determine route travel times along roadways directly from satellite imagery. In support of this rather complex challenge, this post walks readers through the first step in our baseline: preparing the imagery and creating training masks for a deep learning segmentation model. Code to reproduce the processes detailed below is available in our CRESI GitHub repository.

Accessing SpaceNet data is free and requires only the creation of an AWS account. To begin, we download data for both SpaceNet 3 and SpaceNet 5 (see the SpaceNet documentation for detailed download instructions).

The City-scale Road Extraction from Satellite Imagery (CRESI) framework was designed to extract roads and speed estimates at large scale, but it works equally well on the smaller image chips of the SpaceNet 5 Challenge. To run CRESI, you will need Docker (ideally with NVIDIA GPU support via nvidia-docker) installed on a GPU-enabled machine. All CRESI commands should be run within this Docker container.

A. Download:

B. Build docker image:

C. Create docker container:

D. Attach docker container:

SpaceNet 3 and SpaceNet 5 data formats differ slightly, due to extra post-processing performed on the SpaceNet 5 imagery. The RGB 3-band pan-sharpened imagery (PS-RGB) for SpaceNet 3 was distributed in the native 16-bit data format, whereas the SpaceNet 5 PS-RGB imagery uses the Maxar Dynamic Range Adjusted (DRA) product, which seeks to equalize color scales and yields an 8-bit image. The pan-sharpened 8-band multispectral images (PS-MS) are prepared identically for both challenges, so we use this data for training and testing.

While we lose a significant amount of information by using only a subset of the multispectral bands, for ease of exploration we extract 8-bit RGB imagery from the 16-bit multispectral imagery, where R, G, and B correspond to bands 5, 3, and 2, respectively. This is accomplished via a conversion script in the CRESI repository, which rescales the image to the 2nd and 98th percentiles of pixel values when converting to 8-bit. The script should be run for all six training areas of interest (AOIs): AOI_2_Vegas, AOI_3_Paris, AOI_4_Shanghai, AOI_5_Khartoum, AOI_7_Moscow, and AOI_8_Mumbai.
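The band extraction and percentile rescaling can be sketched in NumPy (the actual CRESI script reads the GeoTIFFs with GDAL; the function name and the random test array here are illustrative):

```python
import numpy as np

def to_8bit(ms_image, bands=(4, 2, 1), lower_pct=2, upper_pct=98):
    """Extract the given bands (0-indexed here: bands 5, 3, 2 in the
    1-indexed GDAL convention) from a 16-bit multispectral array of
    shape (B, H, W), rescaling each band to 8-bit via percentile clipping."""
    out = np.zeros((len(bands),) + ms_image.shape[1:], dtype=np.uint8)
    for i, b in enumerate(bands):
        band = ms_image[b].astype(np.float64)
        lo, hi = np.percentile(band, [lower_pct, upper_pct])
        scaled = np.clip((band - lo) / max(hi - lo, 1e-9), 0.0, 1.0)
        out[i] = (scaled * 255).astype(np.uint8)
    return out

# Example: a random 16-bit, 8-band image chip
rgb = to_8bit(np.random.randint(0, 2**16, size=(8, 64, 64), dtype=np.uint16))
print(rgb.shape, rgb.dtype)  # (3, 64, 64) uint8
```

Rescaling to the 2nd/98th percentiles, rather than the raw min/max, prevents a handful of extremely bright or dark pixels from compressing the dynamic range of the rest of the image.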

Figure 1. Training images for our baseline. Left: sample image over Las Vegas (SpaceNet 3). Right: sample image over Moscow (SpaceNet 5).

The geojson_roads_speed folder within each area of interest (AOI) directory contains road centerline labels along with estimates of safe travel speeds for each roadway (see here for further details). We’ll use these centerline labels and speed estimates to create training masks. We assume a mask buffer of 2 meters, meaning that each roadway is assigned a total width of 4 meters. Remember that the goal of our segmentation step is to detect road centerlines, so while this is not the precise width of the road, a buffer of 2 meters is an appropriate width for our segmentation model. We explore two options, continuous masks and multi-channel masks.
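The buffering step can be sketched in pure NumPy by marking every pixel whose distance to a centerline segment falls within the 2 meter half-width (CRESI buffers the GeoJSON geometries directly; the segment endpoints, image size, and 0.5 m ground sample distance below are illustrative assumptions):

```python
import numpy as np

def centerline_mask(shape, p0, p1, half_width_m=2.0, gsd_m=0.5):
    """Binary mask of pixels within half_width_m of segment p0-p1.
    gsd_m is the ground sample distance (meters per pixel)."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Pixel coordinates in meters
    px = np.stack([xs * gsd_m, ys * gsd_m], axis=-1)
    a, b = np.asarray(p0, float), np.asarray(p1, float)
    ab = b - a
    # Project each pixel onto the segment, clamping to the endpoints
    t = np.clip(((px - a) @ ab) / (ab @ ab), 0.0, 1.0)
    nearest = a + t[..., None] * ab
    dist = np.linalg.norm(px - nearest, axis=-1)
    return (dist <= half_width_m).astype(np.uint8)

# A horizontal road segment at y = 25 m, from x = 5 m to x = 45 m
mask = centerline_mask((100, 100), p0=(5.0, 25.0), p1=(45.0, 25.0))
print(mask.sum() > 0)  # True
```

At a 0.5 m ground sample distance, the 2 meter half-width yields a painted road roughly 9 pixels across, comfortably wide enough for a segmentation model to learn.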

Figure 2. Sample ground truth GeoJSON label, with a mask half-width of 2 meters.

One option for training a segmentation model is to create training masks where the value of the mask is proportional to the speed of the roadway. This can be accomplished by running the mask-creation script in the CRESI repository. In the following example, we assume the data has been downloaded to the /data directory. Outputs are shown in Figure 3.
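A continuous mask can be sketched by scaling a binary road mask by its speed estimate (the CRESI script reads speeds from the GeoJSON properties; the normalization ceiling of 65 mph used here is an assumption):

```python
import numpy as np

def continuous_speed_mask(binary_mask, speed_mph, max_speed_mph=65.0):
    """Scale a binary road mask so pixel values are proportional
    to the road's speed estimate (0-255 range)."""
    value = int(round(255 * min(speed_mph, max_speed_mph) / max_speed_mph))
    return (binary_mask.astype(np.uint8) * value).astype(np.uint8)

binary = np.zeros((8, 8), dtype=np.uint8)
binary[3:5, :] = 1  # a horizontal road
mask = continuous_speed_mask(binary, speed_mph=30)
print(mask.max())  # 118
```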

Figure 3. Left: sample training image in Moscow. Middle: typical binary training mask. Right: continuous mask where the mask value is proportional to the value of interest (in this case, speed limit).

A second option is to create multi-channel training masks where each channel corresponds to a speed range. The CRESI mask-creation script bins speeds in 10 mph increments, for a total of 7 bins. We also append the total binary mask (as we'll see later, this aids initial road extraction), for a total of 8 channels.
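The binning scheme can be sketched as follows (7 channels of 10 mph each plus an appended binary channel, per the text; the exact bin edges are an assumption):

```python
import numpy as np

def multichannel_speed_mask(binary_mask, speed_mph, n_bins=7, bin_size=10):
    """Place the road's binary mask into the channel for its 10 mph
    speed bin, and append the overall binary mask as a final channel."""
    channels = np.zeros((n_bins + 1,) + binary_mask.shape, dtype=np.uint8)
    # e.g. 1-10 mph -> channel 0, 11-20 mph -> channel 1, ...
    idx = min(max(int(np.ceil(speed_mph / bin_size)) - 1, 0), n_bins - 1)
    channels[idx] = binary_mask
    channels[-1] = binary_mask  # total binary mask
    return channels

binary = np.zeros((8, 8), dtype=np.uint8)
binary[3:5, :] = 1  # a horizontal road
mask = multichannel_speed_mask(binary, speed_mph=25)  # 21-30 mph -> channel 2
print(mask.shape)  # (8, 8, 8)
```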

Figure 4. Left: sample training image in Mumbai. Middle: typical binary training mask. Right: multi-channel training mask, where red corresponds to 21–30 mph, green corresponds to 31–40 mph, and blue corresponds to 41–50 mph.

In this post we demonstrated how to prepare training masks for the SpaceNet 5 Challenge. The output of the scripts referenced in this post (available in this notebook) can be fed directly into a deep learning segmentation model. Stay tuned for a forthcoming post exploring the segmentation step, and feel free to take a crack at the $50,000 prize pool for the ongoing challenge.



As of March 2021, CosmiQ Works has been folded into IQT Labs. An archive will remain here to showcase historical work from CosmiQ Works that took place July 2016 — March 2021.
