Creating Training Datasets for the SpaceNet Road Detection and Routing Challenge
By Adam Van Etten and Jake Shermeyer
The SpaceNet Road Detection and Routing Challenge aims to automatically extract road networks directly from high-resolution satellite imagery. Such automated processes may help address a wide range of problems, from the mundane (traffic) to the extreme (mass evacuation). In previous posts [1, 2] we detailed the evaluation metric for this challenge, and more recently we described the SpaceNet Roads Dataset. In this post we detail the steps to get started working with the roads dataset, with attendant code hosted on the APLS GitHub page.
1. Dataset Challenges
One of the primary challenges of working with, and eventually classifying, remote sensing datasets is creating training data for ingestion into machine learning workflows. Remote sensing data is often formatted in a bespoke manner, which greatly complicates this process. The SpaceNet imagery is distributed in 16-bit format, and the road network is distributed as line-string vectors in GeoJSON format, which poses a few challenges:
- Few programs are designed to display 16-bit imagery, necessitating conversion. (We recommend QGIS to view raw SpaceNet data.)
- Vector labels are rarely compatible with machine learning architectures, particularly in the computer vision realm. Conversion is therefore necessary.
- Not all roads are created equal: lane widths, number of lanes, and road surface all vary widely between geographic locales. The SpaceNet road network was created in such a fashion that only centerlines are labeled, regardless of roadway size. Since the goal of the challenge is to reproduce these centerline labels, the variance in road structure presents a challenge.
In this post, we focus on the first two issues. We demonstrate methods for converting 16-bit imagery to standard RGB formats, and for extracting raster masks from the SpaceNet GeoJSON labels. We will discuss the variance in road structure in subsequent posts. Python code is included in the APLS GitHub repository for the interested reader.
2. 16-bit Imagery Conversion
The 30 cm resolution DigitalGlobe WorldView-3 imagery in the SpaceNet Dataset is delivered in 16-bit format. While 8-band multispectral (MUL-PanSharpen) and grayscale (PAN) images are included, in this post we will focus on 3-band imagery (RGB-PanSharpen). As illustrated in Figure 1, most programs simply display 16-bit images as blank.
In everyday use, images are typically stored in 3-band 8-bit formats; for example, a red pixel in a .png or .jpg file is denoted as (255, 0, 0). Unsigned 16-bit imagery has a greater range of pixel values (from 0 to 65,535) and stores more information, at the expense of greater storage space and reduced compatibility with analysis programs. While we encourage SpaceNet Dataset users to utilize the native 16-bit imagery, we find that visualizing results is often far easier with 8-bit imagery. Conversion is possible within packages such as NumPy, but in order to retain geographic data we utilize the gdal library. The code snippet below permits an end user to specify the percentile range used for rescaling. This removes some outliers from the 16-bit imagery and preserves more of the relevant data when rescaling to the 8-bit range. The code can be found in apls_tools.py on the CosmiQ GitHub repository.
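import subprocess
import numpy as np
from osgeo import gdal

def convert_to_8Bit(inputRaster, outputRaster,
                    outputPixType='Byte', outputFormat='GTiff',
                    rescale_type='rescale', percentiles=(2, 98)):
    '''
    Convert 16bit image to 8bit
    rescale_type = [clip, rescale]
        if clip, scaling is done strictly between 0 65535
        if rescale, each band is rescaled to a min and max
        set by percentiles
    '''
    srcRaster = gdal.Open(inputRaster)
    cmd = ['gdal_translate', '-ot', outputPixType, '-of',
           outputFormat]
    # iterate through bands
    for bandId in range(srcRaster.RasterCount):
        bandId = bandId+1
        band = srcRaster.GetRasterBand(bandId)
        if rescale_type == 'rescale':
            bmin = band.GetMinimum()
            bmax = band.GetMaximum()
            # if not exist minimum and maximum values
            if bmin is None or bmax is None:
                (bmin, bmax) = band.ComputeRasterMinMax(1)
            # override with percentile-based limits to suppress outliers
            band_arr_tmp = band.ReadAsArray()
            bmin = np.percentile(band_arr_tmp.flatten(),
                                 percentiles[0])
            bmax = np.percentile(band_arr_tmp.flatten(),
                                 percentiles[1])
        else:
            bmin, bmax = 0, 65535
        # map [bmin, bmax] of this band onto the 8-bit range [0, 255]
        cmd.extend(['-scale_{}'.format(bandId), str(bmin), str(bmax),
                    '0', '255'])
    cmd.extend([inputRaster, outputRaster])
    print("Conversion command:", cmd)
    subprocess.call(cmd)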
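To see why a percentile range matters, the rescaling idea can be illustrated on a synthetic band in pure NumPy. This is a minimal sketch, not the gdal-based routine from apls_tools.py: the function name rescale_to_8bit, the 2nd to 98th percentile default, and the random test band are all assumptions made for illustration, and georeferencing is ignored entirely.

```python
import numpy as np


def rescale_to_8bit(band, percentiles=(2, 98)):
    """Linearly map the chosen percentile range of a 16-bit band to 0-255."""
    lo, hi = np.percentile(band, percentiles)
    if hi <= lo:
        # Constant band: nothing to stretch, return zeros.
        return np.zeros(band.shape, dtype=np.uint8)
    scaled = (band.astype(np.float64) - lo) / (hi - lo) * 255.0
    # Values outside the percentile range saturate at 0 or 255.
    return np.clip(np.round(scaled), 0, 255).astype(np.uint8)


# Synthetic 16-bit band (hypothetical data): modest values plus one outlier.
rng = np.random.default_rng(0)
band16 = rng.integers(200, 2000, size=(64, 64)).astype(np.uint16)
band16[0, 0] = 65535
band8 = rescale_to_8bit(band16)
```

Because the single bright outlier lies above the upper percentile, it simply saturates at 255 instead of compressing every other pixel into the bottom few gray levels, which is what a strict 0 to 65535 scaling would do.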
3. Road Masks
The ultimate goal of the SpaceNet Roads Challenge is to extract a graph structure of road networks. Since the desired output is not the segmentation mask typically utilized for scoring (Post 1 provides a rationale for why we refrain from pixel-based metrics), any number of algorithmic approaches are possible. Nevertheless, one of the more obvious approaches to the challenge is to infer road masks from the 400 meter (1300 x 1300 pixel) imagery cutouts, and subsequently refine those inferred masks into a graph structure. In support of the segmentation approach, we create ground truth road masks for algorithm training.
Since the goal is to identify road centerlines, we refrain from attempting to mask the entire road width. Rather, we simply create a buffer (we use a buffer of 2 meters, yielding a total mask width of 4 m) about the road centerline for use as a ground truth mask. Creating training masks is a two-step process.
1. Using GeoPandas we ingest the SpaceNet GeoJSON labels into a GeoDataFrame. Utilizing the geometry values of the data, we then create a buffer about the road centerline. The code snippet below is from apls_tools.py and allows the user to specify the desired buffer width (bufferDistanceMeters).
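import geopandas as gpd
import osmnx as ox

def create_buffer_geopandas(geoJsonFileName, bufferDistanceMeters=2,
                            bufferRoundness=1, projectToUTM=True):
    '''
    Create a buffer around the lines of the geojson.
    Return a geodataframe.
    '''
    inGDF = gpd.read_file(geoJsonFileName)
    # set a few columns that we will need later
    inGDF['type'] = inGDF['road_type'].values
    inGDF['class'] = 'highway'
    inGDF['highway'] = 'highway'
    if len(inGDF) == 0:
        return [], []
    # Transform gdf Roadlines into UTM so that Buffer makes sense
    if projectToUTM:
        tmpGDF = ox.project_gdf(inGDF)
    else:
        tmpGDF = inGDF
    gdf_utm_buffer = tmpGDF
    # perform Buffer to produce polygons from Line Segments
    gdf_utm_buffer['geometry'] = tmpGDF.buffer(bufferDistanceMeters,
                                               bufferRoundness)
    gdf_utm_dissolve = gdf_utm_buffer.dissolve(by='class')
    gdf_utm_dissolve.crs = gdf_utm_buffer.crs
    # return to the coordinate system of the input labels
    if projectToUTM:
        gdf_buffer = gdf_utm_dissolve.to_crs(inGDF.crs)
    else:
        gdf_buffer = gdf_utm_dissolve
    return gdf_buffer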
2. We leverage the gdal library to convert the GeoDataFrame to a NumPy array, saving the array as an image. This process also ensures that the newly created mask image is snapped to the corresponding RGB image, and that the analogous pixel locations, rows, columns, and resolutions are identical. See apls_tools.py for code.
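from osgeo import gdal, ogr, osr

def gdf_to_array(gdf, im_file, output_raster, burnValue=150):
    '''
    Turn geodataframe to array, save as image file with non-null
    pixels set to burnValue
    '''
    NoData_value = 0
    gdata = gdal.Open(im_file)
    # set target info
    target_ds = gdal.GetDriverByName('GTiff').Create(
        output_raster, gdata.RasterXSize, gdata.RasterYSize, 1,
        gdal.GDT_Byte)
    target_ds.SetGeoTransform(gdata.GetGeoTransform())
    # set raster info
    raster_srs = osr.SpatialReference()
    raster_srs.ImportFromWkt(gdata.GetProjectionRef())
    target_ds.SetProjection(raster_srs.ExportToWkt())
    band = target_ds.GetRasterBand(1)
    band.SetNoDataValue(NoData_value)
    outDataSource = ogr.GetDriverByName('Memory').CreateDataSource('memData')
    outLayer = outDataSource.CreateLayer("states_extent", raster_srs,
                                         geom_type=ogr.wkbMultiPolygon)
    burnField = "burn"
    idField = ogr.FieldDefn(burnField, ogr.OFTInteger)
    outLayer.CreateField(idField)
    featureDefn = outLayer.GetLayerDefn()
    for geomShape in gdf['geometry'].values:
        outFeature = ogr.Feature(featureDefn)
        outFeature.SetGeometry(ogr.CreateGeometryFromWkt(geomShape.wkt))
        outFeature.SetField(burnField, burnValue)
        outLayer.CreateFeature(outFeature)
        outFeature = 0
    gdal.RasterizeLayer(target_ds, [1], outLayer, burn_values=[burnValue])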
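The two steps above (buffer the centerline, then burn the buffer into a raster) can also be pictured directly in pixel space. The sketch below is an illustration only and does not use GeoPandas or gdal: the helper burn_centerline, the 0.3 m pixel size, and the made-up horizontal road are assumptions, and the mask is built by thresholding each pixel's distance to the centerline rather than by rasterizing a polygon.

```python
import numpy as np


def burn_centerline(shape, line_px, buffer_px, burn_value=150):
    """Set every pixel within buffer_px of a polyline to burn_value."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    px = np.stack([cols, rows], axis=-1).astype(np.float64)  # (x, y) per pixel
    dist = np.full(shape, np.inf)
    for (x0, y0), (x1, y1) in zip(line_px[:-1], line_px[1:]):
        a = np.array([x0, y0], dtype=np.float64)
        d = np.array([x1 - x0, y1 - y0], dtype=np.float64)
        # Project each pixel onto the segment, clamped to its end points.
        t = np.clip(((px - a) @ d) / max(d @ d, 1e-12), 0.0, 1.0)
        nearest = a + t[..., None] * d
        dist = np.minimum(dist, np.linalg.norm(px - nearest, axis=-1))
    mask = np.zeros(shape, dtype=np.uint8)
    mask[dist <= buffer_px] = burn_value
    return mask


pixel_size_m = 0.3                  # assumed ground sample distance
buffer_px = 2.0 / pixel_size_m      # 2 m buffer expressed in pixels
centerline = [(5, 20), (60, 20)]    # hypothetical road, (col, row) vertices
mask = burn_centerline((40, 70), centerline, buffer_px)
```

With these assumed numbers the burned strip is 13 pixels wide, or roughly 3.9 m, which is consistent with the 4 m total width produced by a 2 m buffer.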
The create_spacenet_masks.py script executes all the code included above, yielding training masks and illustrations of the workflow.
# iterate through images, convert to 8-bit, and create masks
im_files = os.listdir(path_images_raw)
for im_file in im_files:
    # image identifier: text after the last underscore, minus extension
    name_root = im_file.split('_')[-1].split('.')[0]
    # create 8-bit image
    im_file_raw = os.path.join(path_images_raw, im_file)
    im_file_out = os.path.join(path_images_8bit, im_file)
    # convert to 8bit
    apls_tools.convert_to_8Bit(im_file_raw, im_file_out)
    # determine output files
    label_file = os.path.join(path_labels,
        + name_root + '.geojson')
    label_file_tot = os.path.join(path_labels, label_file)
    output_raster = os.path.join(path_masks, 'mask_' \
        + name_root + '.png')
    plot_file = os.path.join(path_masks_plot, 'mask_' \
        + name_root + '.png')
    # create masks
    mask, gdf_buffer = apls_tools.get_road_buffer(
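The file bookkeeping in the loop above hinges on name_root, the image identifier shared by the image, label, and mask files. Here is a small sketch of that naming logic in isolation. The helper mask_paths, the label_prefix value, and the example file name are hypothetical and are not taken from create_spacenet_masks.py; the real label file prefix should be read from the script itself.

```python
import os


def mask_paths(im_file, path_labels, path_masks, label_prefix="labels_"):
    """Derive label and mask paths from an image file name."""
    # Text after the last underscore, minus the extension, e.g. 'img42'.
    name_root = im_file.split('_')[-1].split('.')[0]
    # label_prefix is a placeholder, not the actual SpaceNet label prefix.
    label_file = os.path.join(path_labels,
                              label_prefix + name_root + '.geojson')
    output_raster = os.path.join(path_masks, 'mask_' + name_root + '.png')
    return name_root, label_file, output_raster


# Hypothetical file name following an '<anything>_<id>.tif' pattern.
root, label, mask_file = mask_paths("RGB-PanSharpen_AOI_0_Demo_img42.tif",
                                    "labels", "masks")
```

Keying every output on the same identifier is what lets each mask be matched back to its source image and label file later in training.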
The images below are examples of the output of the create_spacenet_masks.py script. We utilize the converted 8-bit images from Section 2 for visualization purposes.
The SpaceNet dataset contains over 8,000 km of hand-labeled and validated road centerlines, with attendant high-resolution 30 cm satellite imagery. This dataset provides the basis for the SpaceNet Road Network Extraction Challenge, with the goal of automatically extracting the road network graph structure directly from satellite imagery. Algorithms to perform such extraction will likely begin by training segmentation models to identify road masks. To aid this process, in this post we demonstrated methods to transform native 16-bit DigitalGlobe imagery into a more manageable 8-bit format. We also provided a method for creating ground truth road masks from the GeoJSON labels. Code and demos can be found on the APLS GitHub page. In subsequent posts, we will explore image segmentation approaches using these road centerline masks.