SpaceNet Update — Announcing Rio de Janeiro Points of Interest (POI) Dataset Release

The goal of SpaceNet is to catalyze the development of new techniques to automate the analysis of imagery from remote sensors, including satellites. To achieve success in this effort, we have adopted a four-pillar “open” strategy. The first pillar is to release openly licensed satellite imagery with associated geotagged labels that can be used to develop machine-learning algorithms. SpaceNet’s inaugural open data release was imagery of Rio de Janeiro taken from the WorldView-2 satellite at 50cm GSD using eight spectral bands. The machine-learning labels for the original Rio imagery were building footprint polygons (see example below).

Rio’s Maracanã Zone

The second pillar of our strategy is to release open source software to help reduce the amount of GIS expertise required of machine-learning developers interested in aerial and satellite imagery. Our source code can be found in the SpaceNetChallenge community GitHub account. We are currently hosting two repositories: utilities and BuildingDetectorVisualizer. The utilities repository contains code that tiles large satellite images into smaller chips that are consumable by computer vision or machine learning frameworks. The utilities package also contains an evaluation metric for building polygon detection. The BuildingDetectorVisualizer repository contains two tools bundled together: a visualizer GUI application and a band extractor command line tool. The purpose of the visualizer application is to view 3-band or 8-band images with ground truth building footprints and a solution’s proposed building footprints as overlays. The end result is to compare ground truth to an algorithm implementation and calculate a final validation score.
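To make the tiling step concrete, here is a minimal sketch of how chip windows can be laid out over a large image. It is not the code from the utilities repository: the function name `chip_windows`, the window tuple layout, and the 400-pixel chip size are all assumptions chosen for illustration.

```python
from typing import Iterator, Tuple

# (col_off, row_off, width, height), all in pixels. This layout is an
# assumption for this sketch, not the format used by the utilities repository.
Window = Tuple[int, int, int, int]


def chip_windows(img_width: int, img_height: int, chip_size: int) -> Iterator[Window]:
    """Yield non-overlapping pixel windows that cover the whole image.

    Chips on the right and bottom edges are clipped to the image bounds,
    so they may be smaller than chip_size.
    """
    if chip_size <= 0:
        raise ValueError("chip_size must be positive")
    for row_off in range(0, img_height, chip_size):
        for col_off in range(0, img_width, chip_size):
            width = min(chip_size, img_width - col_off)
            height = min(chip_size, img_height - row_off)
            yield (col_off, row_off, width, height)


# Hypothetical 1000 x 500 pixel image cut into 400-pixel chips.
windows = list(chip_windows(1000, 500, 400))
```

Each window could then be passed to whatever raster library reads the pixels. At 50cm GSD, a full 400-pixel chip would span roughly 200 m on a side.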

The third pillar is to foster research on emerging analytical frameworks in open forums. Here are a few examples of open research using the SpaceNet data:

The fourth and final pillar of our strategy is to sponsor open analytics competitions. As you may have guessed from the data and source code released, our first competition focused on extracting building footprints. Congratulations to all of the winners. We plan to provide a detailed blog post on the DownLinQ dissecting the winning implementations. The implementations will also be open sourced and released as a SpaceNet repository.

As of this week, we have implemented all four pillars of our open strategy. So what does the future hold for SpaceNet? More data, more code, and more competitions! The SpaceNet team plans to launch another competition in two to three months that uses even higher resolution satellite imagery (i.e. 30cm GSD) over a wider variety of cities. Our plan is to study the impact of both the increase in location diversity and the increase in resolution on algorithm performance. In the meantime, there is currently a satellite imagery feature detection competition underway — offered by our colleagues at Dstl. Additionally, IARPA has been hosting interesting 3D detection challenges.

With respect to new data, we just released a POI dataset over Rio de Janeiro to SpaceNet. We would like to acknowledge the participation of the National Geospatial-Intelligence Agency (NGA) in the preparation of this research data, which is licensed from DigitalGlobe to SpaceNet. The SpaceNet POI dataset is released under a Creative Commons license (CC BY-NC-SA 4.0). We are also releasing an additional 864 km² of 50cm raster imagery over Rio to increase the availability of imagery relevant to the POI dataset.

Rio Points of Interest

The POI geodatabase contains 12 datasets with 35 unique layers. The total release contains 120,155 individual points representing 460 features. A subset of 11,114 points across 139 features has been confirmed against the provided satellite imagery. The confirmed points are likely a good starting point for machine-learning research, as a human has validated each of them as discernible in satellite imagery. More details on the individual layers can be seen in the table below.
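For readers who want to start from the confirmed subset, the following is a minimal sketch of selecting those points from one layer loaded as a GeoJSON FeatureCollection. It assumes each feature carries a `CONF_IMAGE` property (one of the veracity fields documented in the layer metadata) and that the confirmed value is the string "confirmed" in some casing. The sample features and the helper name `confirmed_features` are hypothetical.

```python
from typing import Any, Dict, List

Feature = Dict[str, Any]


def confirmed_features(collection: Dict[str, Any]) -> List[Feature]:
    """Return the features whose CONF_IMAGE property reads 'confirmed'.

    The comparison ignores case and surrounding whitespace because the exact
    stored spelling is an assumption; check the layer metadata before use.
    """
    kept = []
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}
        status = str(props.get("CONF_IMAGE", "")).strip().lower()
        if status == "confirmed":
            kept.append(feature)
    return kept


# Hypothetical in-memory stand-in for one POI layer. With a real file, use
# json.load() on the opened GeoJSON file instead.
sample_layer = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [-43.23, -22.91]},
         "properties": {"CONF_IMAGE": "Confirmed"}},
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [-43.20, -22.90]},
         "properties": {"CONF_IMAGE": "Unconfirmed"}},
    ],
}

confirmed = confirmed_features(sample_layer)
```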

The POI data download includes detailed metadata documentation for each layer. In addition, each layer has consistent fields to define the veracity of the points in that layer. For example:

  • SPA_ACC: Spatial accuracy of site location (1 = high, 2 = medium, 3 = low). See the image below for a visual depiction of the three levels of accuracy.
SPA_ACC examples
  • CONF_IMAGE: Satellite imagery confirmation (confirmed, assessed, reported, unconfirmed). Confirmed means that satellite imagery has been used to confirm the existence of the point (e.g. the point lies on a runway and is reported to be an airfield). Assessed means that an imagery analyst has viewed the point but cannot conclusively determine its existence. For instance, an assessed point may be within a radius of a building, but the analyst cannot determine if this is the actual place (e.g. an embassy located within a large office building). Reported means a non-imagery source has corroborated the point, but an analyst is unable to confirm the location because satellite imagery is lacking, the imagery is of poor quality, or the feature is concealed (e.g. a dam in a forest). Unconfirmed means that no analyst has looked at this point of interest in satellite imagery to confirm it on the ground.
CONF_IMAGE examples
  • ImgDate: Date of image. If the point is confirmed, this is the date of the image that was used to confirm it.
  • spaceNetFeatureName (GeoJSON only): This is a custom field created for SpaceNet to assist with creating a unique ID. The spaceNetFeatureName is the LayerName plus the values of any columns with “TYPE” in their name, combined with no separator. As an example, “Commercial_POIAutomotiveCar Rental” comes from LayerName=“Commercial_POI”, TYPE1=“Automotive”, TYPE2=“Car Rental”.
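The spaceNetFeatureName rule above can be sketched in a few lines. This is an illustration of the described concatenation, not the code used to build the dataset: the helper name `space_net_feature_name`, the choice to order TYPE columns by column name, and the choice to skip empty values are assumptions.

```python
from typing import Any, Mapping


def space_net_feature_name(layer_name: str, properties: Mapping[str, Any]) -> str:
    """Join the layer name and every TYPE column value with no separator.

    Assumptions in this sketch: columns are taken in sorted name order
    (TYPE1 before TYPE2), and missing or empty values are skipped.
    """
    type_columns = sorted(key for key in properties if "TYPE" in key)
    parts = [layer_name]
    for column in type_columns:
        value = properties[column]
        if value is not None and value != "":
            parts.append(str(value))
    return "".join(parts)


# Reproduces the example from the field description above.
name = space_net_feature_name(
    "Commercial_POI",
    {"TYPE1": "Automotive", "TYPE2": "Car Rental", "SPA_ACC": 1},
)
```

Note that plain string sorting would place a column named TYPE10 before TYPE2, so a layer with ten or more TYPE columns would need a numeric sort instead.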

We look forward to seeing applications and research on the newly released POI dataset.

Like what you read? Give Todd Stavish a round of applause.
