
SpaceNet 7 Results: Overachieving Pixels

Adam Van Etten
Jan 29

Preface: SpaceNet LLC is a nonprofit organization dedicated to accelerating open source, artificial intelligence applied research for geospatial applications, specifically foundational mapping (i.e., building footprint & road network detection). SpaceNet is run in collaboration by co-founder and managing partner CosmiQ Works, co-founder and co-chair Maxar Technologies, and our partners including Amazon Web Services (AWS), Capella Space, Topcoder, IEEE GRSS, the National Geospatial-Intelligence Agency and Planet.

In this post we dive into some of the building-level metrics for the SpaceNet 7 Multi-Temporal Urban Development Challenge. We compare results to past SpaceNet challenges and note that despite the difficulty of identifying small buildings in moderate resolution (4m) imagery, the pixels of SpaceNet 7 seem to overachieve when compared to SpaceNets past. A follow-up post will dive deeper into the temporal change and tracking lessons from this challenge.

1. Performance vs IoU

For all five of the SpaceNet challenges focused on buildings (SpaceNets 3 and 5 explored road networks), we used an intersection over union (IoU) metric as the basis for SpaceNet scoring. This metric was illustrated in one of our SpaceNet 4 analysis blogs; see Figure 1.

Figure 1. A schematic representation of the Intersection over Union metric (originally from this blog). The overlapping area between the manually labeled ground truth (blue) and the predicted building (red) is divided by the combined area covered by both together.

An IoU of 0.5 or greater is typically used to determine success, and this threshold was adopted in previous SpaceNet challenges. Figure 2A illustrates performance for the winning SpaceNet 7 algorithm, scored over all 41 areas of interest (AOIs) in the public and private test sets, which together contain over 4.4 million building footprints. For small objects, however, even minor localization errors sharply reduce IoU. Given the small size (in pixels) of buildings in the moderate resolution (4.0m) SpaceNet 7 dataset, for the SpaceNet 7 challenge we adopted a lower threshold of IoU ≥ 0.25, mimicking Equation 5 of the ImageNet paper (see Figure 2B).
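To make the scoring concrete, below is a minimal sketch of the IoU computation and the two true positive thresholds, using shapely rather than the official SpaceNet evaluation code; the footprint coordinates are hypothetical.

```python
# Minimal IoU sketch with shapely (not the official SpaceNet scorer).
from shapely.geometry import Polygon

def iou(ground_truth: Polygon, prediction: Polygon) -> float:
    """Intersection over Union of two building footprint polygons."""
    union = ground_truth.union(prediction).area
    if union == 0:
        return 0.0
    return ground_truth.intersection(prediction).area / union

# Hypothetical footprints (coordinates in meters).
gt = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
pred = Polygon([(4, 3), (14, 3), (14, 13), (4, 13)])

score = iou(gt, pred)
print(f"IoU = {score:.2f}")                 # ~0.27 for this pair
print("TP at IoU >= 0.5: ", score >= 0.5)   # False (SpaceNets 1, 2, 4, 6)
print("TP at IoU >= 0.25:", score >= 0.25)  # True (SpaceNet 7)
```

A prediction like this one, offset by a few meters, fails the traditional 0.5 threshold but counts as a true positive under the SpaceNet 7 threshold.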

Figure 2. A: winning model performance with a true positive threshold of IoU ≥ 0.5 (the threshold used by SpaceNets 1, 2, 4, and 6). B: winning model performance with a threshold of IoU ≥ 0.25 (the threshold adopted for SpaceNet 7).

Figure 2 illustrates that recall more than doubles if one drops the IoU threshold from 0.5 to 0.25. Analysis of the higher resolution (0.5m) imagery in SpaceNet 4 showed that recall increased by only 15% when the IoU threshold was dropped from 0.5 to 0.25 (see Figure 2 here). Unsurprisingly, the moderate resolution imagery of SpaceNet 7 makes precise footprint prediction difficult, with a large fraction (~30%) of buildings falling in the range 0.25 ≤ IoU < 0.5.

2. Performance vs Building Area

Object size is an important predictor of detection performance, as noted in a number of previous investigations (e.g. [1, 2]). We follow the lead of analyses first performed for SpaceNet 4 (and later SpaceNet 6) in exploring detection performance as a function of building area. Figure 3 shows performance for all 4.4 million building footprints in the SpaceNet 7 public and private test sets for the winning submission of team lxastro0.

Figure 3. Building recall as a function of area for the winning SpaceNet 7 submission (IoU ≥ 0.25).

Figure 3 illustrates that building recall improves with building area up to ~300 square meters, whereas the building area histogram peaks at ~160 square meters and then declines precipitously.
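The curve in Figure 3 comes from binning footprints by area and computing recall within each bin. A minimal sketch of that procedure is below; the areas and match flags are randomly generated stand-ins, not SpaceNet data.

```python
# Sketch of the binning behind Figure 3: recall as a function of area.
# `areas_m2` and `matched` (True if the footprint was recovered at
# IoU >= 0.25) are synthetic stand-ins, not SpaceNet results.
import numpy as np

rng = np.random.default_rng(0)
areas_m2 = rng.lognormal(mean=5.0, sigma=0.8, size=10_000)          # stand-in areas
matched = rng.random(10_000) < np.clip(areas_m2 / 400.0, 0.0, 0.9)  # stand-in outcomes

bins = np.arange(0, 1001, 100)  # 100 m² area bins
idx = np.digitize(areas_m2, bins)
for b in range(1, len(bins)):
    in_bin = idx == b
    if in_bin.sum() == 0:
        continue
    recall = matched[in_bin].mean()  # TP / (TP + FN) within the bin
    print(f"{bins[b-1]:4d}-{bins[b]:4d} m²: n={in_bin.sum():5d}, recall={recall:.2f}")
```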

In Figure 4 below, we compare SpaceNet 7 performance to SpaceNets 4 and 6, both of which boasted a 0.5m resolution as opposed to the 4.0m resolution of SpaceNet 7. Here we use IoU ≥ 0.5 to directly compare the three challenges.

Figure 4. Comparison of building prediction recall (blue) for SpaceNets 4, 6, and 7, overlaid on building area histograms (red), with IoU ≥ 0.5. Left: winning SpaceNet 4 predictions (originally published here) from 0.5m optical data; here we focus on the blue (nadir) line. Middle: winning SpaceNet 6 predictions (originally published here) from 0.5m synthetic aperture radar data. Right: winning SpaceNet 7 predictions from 4m optical data.

The building area histograms in Figure 4 look similar for SpaceNets 4 and 7, yet the performance curves are very different: SpaceNet 4 performance asymptotes at ~120 m², whereas SpaceNet 7 asymptotes at ~1000 m² with much lower recall. Of course the pixel areas differ by a factor of (4m / 0.5m)² = 64, so a 120 m² SpaceNet 4 building is a ~20 ⨉ 20 pixel square, whereas a 1000 m² SpaceNet 7 building occupies only a ~8 ⨉ 8 pixel square. So SpaceNet 7 predictions are actually superior to those of SpaceNets 4 and 6 when buildings of comparable pixel area are compared: an ~8 ⨉ 8 pixel building in SpaceNet 4 yields a recall of ~0.1, whereas in SpaceNet 7 the recall is ~0.55.
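The arithmetic behind those pixel counts is a simple conversion from footprint area in square meters to pixel area at a given ground sample distance (GSD); a quick sketch:

```python
# Back-of-the-envelope conversion from building area (m²) to pixel
# footprint at a given ground sample distance (GSD).
def area_in_pixels(area_m2: float, gsd_m: float) -> float:
    """Each pixel covers gsd_m ** 2 square meters on the ground."""
    return area_m2 / gsd_m ** 2

for name, area, gsd in [("SpaceNet 4", 120.0, 0.5), ("SpaceNet 7", 1000.0, 4.0)]:
    pix = area_in_pixels(area, gsd)
    side = pix ** 0.5
    print(f"{name}: {area:.0f} m² at {gsd}m GSD -> {pix:.0f} px (~{side:.0f} x {side:.0f})")
    # SpaceNet 4: 120 m² at 0.5m GSD -> 480 px (~22 x 22)
    # SpaceNet 7: 1000 m² at 4.0m GSD -> 62 px (~8 x 8)
```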

Figure 5 plots pixel sizes directly, demonstrating the far superior pixel-wise performance of SpaceNet 7 predictions in the small-area regime (~5⨉ greater for 100 pix² objects), though SpaceNet 4 predictions have a far higher score ceiling.

Figure 5. Prediction performance as a function of building pixel area (IoU ≥ 0.5).

3. “Omniscient” Labels

The labeling process for SpaceNet 7 was an exhaustive 7-month effort that leveraged both the native 4m resolution Planet imagery and higher-resolution imagery in particularly difficult scenes. By leveraging complementary data sources, the labelers were able to create what we have dubbed “omniscient” labels that appear to be of far higher quality than the imagery merits. Figure 6 provides an illustration of a dense SpaceNet 7 region.

Figure 6. Zoom-in of one particularly dense region illustrating the very high fidelity of labels. (a) Raw image. (b) Footprint polygon labels. (c) Footprints overlaid on imagery.

Figure 6 shows that label precision exceeds what the human eye could easily distinguish in 4m resolution imagery. This high label fidelity helps explain the overachievement of the winning model prediction on small buildings. The labels encode extra information not present in the imagery, which models are able to leverage. Of course there is a limit (hence the score ceiling of SpaceNet 7 predictions), but this extra information does appear to help models achieve surprisingly good performance on difficult, crowded scenes, as in Figure 7.

Figure 7. Predictions for the winning model on a crowded region. Left: raw imagery. Right: predictions overlaid, showing true positives (white) and false positives (yellow).

4. Conclusions

In this post we showed that predictions for the 4m resolution SpaceNet 7 dataset perform admirably for very small buildings. In fact, Figure 5 showed that prediction performance for 100 pix² objects is ~5⨉ higher for SpaceNet 7 than for SpaceNet 4. The high fidelity “omniscient” labels of SpaceNet 7 seem to aid models for very small objects, though the lower resolution of SpaceNet 7 results in a lower performance ceiling for larger objects. Insights such as these have the potential to help optimize collection and labeling strategies for various tasks and performance requirements.

Stay tuned for an upcoming academic paper, as well as the SpaceNet 7 model release blog and a blog detailing the dynamic components of SpaceNet 7.


Thanks to Daniel Hogan
