Accelerating Disaster Response at the Intersection of Space and ML

Automating Object Detection

Gopal Erinjippurath
Planet Stories
8 min read · Nov 28, 2018


Flooding of Port Arthur, Texas in the aftermath of Hurricane Harvey. SkySat imagery captured on September 1, 2017

The Need for Speed

The past decade has seen satellite imagery become a critical information source for supporting emergency response and disaster relief. From hurricanes to floods to wildfires, satellite imagery often provides the first macro view of the destructive impacts of a disaster. At Planet, we have distributed our imagery to first responders in the aftermath of several disaster events, from Hurricane Harvey to the recent, fatal Camp Fire in California.

Despite the wealth of insights contained in satellite imagery, visual inspection and analysis of imagery can take a lot of time, and in the case of disasters, every minute counts in a response. The ability to automate (and thus accelerate) analysis of satellite imagery to support first responders and relief efforts would be hugely impactful. This motivated Planet's Analytics team to build analytic capabilities specifically for detecting relevant objects before and after a disaster.

In a previous post, we explored salient features for spatiotemporal geospatial analytics. In this post, I describe our process of setting up a data curation funnel, engineered to create a high-quality dataset for object detection and localization, followed by benchmarking the performance of state-of-the-art deep learning models on this dataset. Inspired by the temporal stack of imagery over disaster regions, we also highlight a method for extending these models to spatiotemporal datasets like the one we are constructing, for improved object detection performance.

Building a Disasters Dataset for Object Detection

To begin, we identified locations of natural disasters that occurred around the world over the course of 2017 through 2018 and parsed through our imagery archives for corresponding imagery. This is what our global sampling footprint looked like:

Global Sampling of Disaster Regions over 2017-2018 (left) Example sampling from spatial extent of different sensors over a disaster region (right)

Zooming in on a sample footprint over the region in Miami, Florida, affected by Hurricane Irma in September 2017, we identified a number of SkySat and PlanetScope scenes before and after the impact of the hurricane. Below is an illustration of the spatial extent of scenes from sensors in different satellites. From these collects, we identified sample regions that overlap with the areas on the ground that were impacted by the disaster.
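To make this sampling step concrete, here is a minimal sketch that splits archived scenes into pre- and post-event sets based on date and spatial overlap with an area of interest. The scene records, field names, and structure here are hypothetical illustrations, not Planet's actual catalog schema:

```python
from datetime import datetime
from shapely.geometry import shape

# Hypothetical scene records: each has a GeoJSON footprint and an acquisition time.
def sample_scenes(scenes, disaster_aoi_geojson, disaster_date):
    """Split archived scenes into pre- and post-event sets that overlap the AOI."""
    aoi = shape(disaster_aoi_geojson)
    pre, post = [], []
    for scene in scenes:
        footprint = shape(scene["footprint"])                  # GeoJSON geometry of the scene
        acquired = datetime.fromisoformat(scene["acquired"])   # e.g. "2017-09-01T16:00:00"
        if not footprint.intersects(aoi):
            continue
        (pre if acquired < disaster_date else post).append(scene)
    return pre, post
```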

Data collection funnel using OSM filters and multi-stage annotation

From larger areas of interest around disaster regions, we sampled scenes (large images) using OpenStreetMap (OSM) filters for the objects we are interested in detecting and localizing. For example, to localize railway vehicles, we searched over the region with the OSM tag railway=station or the public transportation tag public_transport=station. In certain cases, we further classified objects based on their tags in OSM. This is particularly useful for creating the ontology of buildings.
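As a rough illustration (the bounding box below is a placeholder, and this is not the exact filter set in our pipeline), such an OSM search could be issued against the public Overpass API:

```python
import requests

# Placeholder bounding box: (south, west, north, east) around an area of interest.
BBOX = (25.70, -80.30, 25.85, -80.10)

# Overpass QL: nodes and ways tagged railway=station or public_transport=station.
query = f"""
[out:json][timeout:60];
(
  node["railway"="station"]({BBOX[0]},{BBOX[1]},{BBOX[2]},{BBOX[3]});
  way["railway"="station"]({BBOX[0]},{BBOX[1]},{BBOX[2]},{BBOX[3]});
  node["public_transport"="station"]({BBOX[0]},{BBOX[1]},{BBOX[2]},{BBOX[3]});
  way["public_transport"="station"]({BBOX[0]},{BBOX[1]},{BBOX[2]},{BBOX[3]});
);
out center;
"""

response = requests.post("https://overpass-api.de/api/interpreter", data={"data": query})
stations = response.json()["elements"]
print(f"Found {len(stations)} candidate station locations")
```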

For this project, we leveraged crowd annotations to create our training dataset. We have previously observed that serving crowd annotators with large images leads to less accurate annotations of dense, small objects. Therefore, we needed to split the scenes into smaller images, or chips. These crowd-annotated chips were then reviewed and optionally corrected by experts to create a high-quality training dataset, which we internally refer to as a “gold standard”.
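A minimal sketch of the chipping step, assuming a georeferenced GeoTIFF scene, might look like the following; the 512-pixel chip size and output naming are illustrative rather than our production settings:

```python
import rasterio
from rasterio.windows import Window

CHIP_SIZE = 512  # pixels; illustrative value

def chip_scene(scene_path, out_dir):
    """Cut a large scene into fixed-size chips, preserving georeferencing."""
    with rasterio.open(scene_path) as src:
        for row in range(0, src.height, CHIP_SIZE):
            for col in range(0, src.width, CHIP_SIZE):
                window = Window(col, row,
                                min(CHIP_SIZE, src.width - col),
                                min(CHIP_SIZE, src.height - row))
                chip = src.read(window=window)
                profile = src.profile.copy()
                profile.update(width=window.width, height=window.height,
                               transform=src.window_transform(window))
                with rasterio.open(f"{out_dir}/chip_{row}_{col}.tif", "w", **profile) as dst:
                    dst.write(chip)
```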

Disaster object ontology derived from SkySat imagery

By following this approach, we developed an ontology of objects in disaster regions, for both our PlanetScope and SkySat satellite constellations, that can be discerned and localized in our imagery. This ontology contains objects and classes of objects that are visible in our imagery. Because SkySat imagery has higher resolution, the SkySat ontology is a superset of the one derived from PlanetScope imagery. These objects can be localized with close to human-level precision using automated analytics.

Our aim was to capture sufficient variability in the context around these objects in disaster regions. Models trained on such varied conditions generalize better, which is necessary for detecting the displacement of dynamic objects and changes in the counts of static objects in disaster regions, a central capability for automated disaster analytics.

Dataset and category size of Planet dataset relative to computer vision challenge datasets

Planet’s temporal cadence allows for fast automated annotation of static objects as well as fast localization of moving objects. In just a couple of weeks of collection, the resulting dataset is comparable in size to challenge datasets like Pascal Visual Object Classes (VOC). This dataset is also representative of objects localized over the varying atmospheric conditions captured in our imagery at different times of the year.

Characterizing Model Performance

Performance over objects before expert curation of the dataset (left); performance over objects after expert curation of the dataset (right). SS indicates categories in the SkySat ontology and PS indicates categories in the PlanetScope ontology

To use the dataset for automated disaster analytics, we built single object detectors using the open-source Single Shot Detector (SSD) architecture with a ResNet-101 backbone and evaluated their performance on a test set. Because this was a benchmarking exercise, we did not tune the models: these are performance numbers from stock, open-source models with a candidate set of hyperparameters trained on this dataset. The expert curation of the crowd-annotated dataset into a “gold standard” gave the models a significant performance boost, illustrating the importance of high-quality annotations to the success of object detection pipelines.
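For readers who want to try this kind of stock-model benchmarking, here is a hedged sketch using torchvision. Note that torchvision ships SSD300 with a VGG16 backbone rather than the ResNet-101 backbone used in this post, so treat this purely as a stand-in; the input chip and score threshold are placeholders:

```python
import torch
import torchvision

# Stand-in for a stock detector: torchvision's SSD300 uses a VGG16 backbone,
# not the ResNet-101 backbone described in this post.
model = torchvision.models.detection.ssd300_vgg16(pretrained=True)
model.eval()

# A single 3-band chip, scaled to [0, 1]; shape (channels, height, width).
chip = torch.rand(3, 300, 300)  # placeholder for a real image chip

with torch.no_grad():
    predictions = model([chip])[0]  # dict with 'boxes', 'labels', 'scores'

# Keep only confident detections; the 0.5 threshold is illustrative.
keep = predictions["scores"] > 0.5
boxes = predictions["boxes"][keep]
print(f"{len(boxes)} detections above threshold")
```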

Performance characterization F1 vs IOU for different size objects for baseline Faster RCNN and SSD models (left); Precision-Recall over different IOU thresholds for baseline Faster RCNN and SSD models (right)

We can also assess object detection performance based on size. By looking at maritime vessels of different sizes over different intersection over union (IoU) thresholds, we can determine the ideal thresholds for numerous objects of interest when serving detection predictions. This allows for fully automated analytics with an acceptable level of precision (limiting false positive detection) or recall (limiting false negative misses).
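A simplified version of that sweep, matching predicted and ground-truth boxes greedily and reporting precision, recall, and F1 at each IoU threshold, could look like the sketch below; the boxes are made up and the matching is simpler than full challenge-style evaluation:

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def precision_recall_f1(pred_boxes, gt_boxes, iou_threshold):
    """Greedily match predictions to ground truth at a given IoU threshold."""
    matched, tp = set(), 0
    for pred in pred_boxes:
        for i, gt in enumerate(gt_boxes):
            if i not in matched and iou(pred, gt) >= iou_threshold:
                matched.add(i)
                tp += 1
                break
    fp = len(pred_boxes) - tp
    fn = len(gt_boxes) - tp
    precision = tp / (tp + fp + 1e-9)
    recall = tp / (tp + fn + 1e-9)
    f1 = 2 * precision * recall / (precision + recall + 1e-9)
    return precision, recall, f1

# Made-up boxes for illustration.
preds = [[10, 10, 50, 50], [60, 60, 90, 90]]
truths = [[12, 12, 48, 52]]

# Sweep IoU thresholds to pick an operating point per object class.
for t in np.arange(0.3, 0.8, 0.1):
    print(round(t, 1), precision_recall_f1(preds, truths, t))
```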

Building on the benchmarks

With a robust collection of annotated imagery and a pipeline for curation, this benchmarking activity demonstrates the effectiveness of Planet’s imagery for disaster response. However, the process described here is only a first step towards a fully automated object detection capability.

Planet images the Earth at a near-daily cadence, providing a unique spatiotemporal dataset. With this temporal resolution, we can leverage changes between images to identify and model moving objects over time. The figure below shows this temporal imagery stack, with the colors of the middle image indicating motion: a cargo ship leaving with a new load, or smaller boats docking in a marina. It’s this combination of high-resolution and high-cadence imagery that allows Planet to detect not only large-scale events, but also the small, high-cadence changes that can impact a community in the wake of a disaster.

Standard Faster RCNN architecture (top left) Modified Faster RCNN architecture operating on spatiotemporal data (bottom left); Example of temporal region proposals on a temporal image composed of multiple time stamps (right).

As we explore the temporal aspects of this dataset, we see a unique opportunity to better detect moving, or non-stationary, objects. If you can also detect moving objects before a disaster, you are likely to have a more precise understanding of object counts, which helps damage assessments after the disaster strikes.

To detect moving objects, we looked for small modifications to existing networks that improve performance for this specific use case. One can think of the Faster Region-based Convolutional Neural Network (Faster R-CNN) as a split deep convolutional network: region proposals are created by the Region Proposal Network (RPN) on an intermediate feature representation, then pooled to produce bounding boxes and class predictions. By feeding in a temporal sequence of imagery ground-locked to a particular spatial extent, we can identify the presence or absence of objects over time. This requires modifying the deep stack of feature maps with a 3D convolution before feeding it to the RPN.
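Here is a small PyTorch sketch of that idea: a ground-locked stack of backbone feature maps is fused with a 3D convolution so the result can feed a standard 2D RPN. The channel counts, number of timestamps, and spatial sizes are illustrative, not the exact architecture we used:

```python
import torch
import torch.nn as nn

class TemporalFusion(nn.Module):
    """Fuse a temporal stack of backbone feature maps with a 3D convolution
    so the output can be fed to a standard 2D RPN. A sketch, not the exact
    architecture described in this post."""

    def __init__(self, channels=256, num_timestamps=3):
        super().__init__()
        # Convolve jointly over time, height, and width; collapse the time axis.
        self.conv3d = nn.Conv3d(channels, channels,
                                kernel_size=(num_timestamps, 3, 3),
                                padding=(0, 1, 1))

    def forward(self, features):
        # features: (batch, channels, time, height, width)
        fused = self.conv3d(features)  # -> (batch, channels, 1, H, W)
        return fused.squeeze(2)        # -> (batch, channels, H, W) for the RPN

# Three ground-locked timestamps of 256-channel feature maps over a 64x64 grid.
stack = torch.rand(1, 256, 3, 64, 64)
rpn_input = TemporalFusion()(stack)
print(rpn_input.shape)  # torch.Size([1, 256, 64, 64])
```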

Let’s look at this proposal network in action. The color images composite three time stamps of grayscale imagery from the same location; colors indicate motion and, at times, registration artifacts. The bounding box proposals are the outputs of the 3D convolution + RPN projected onto the latest source image; in the example above, they indicate objects that appear in the latest image but are absent in the earlier images.

Object Detection towards rich geospatial analytics

Going from Earth observation imagery in disaster regions to object detection followed by data fusion towards multi-source geospatial analytics

With the information from automated object detection, we can potentially “query” trends and make predictions. What we at Planet call “Spatial Information feeds” are real-time models that provide a stream of a specific type of object or feature derived from our imagery, which users can access through a simple, user-friendly API. With automated daily counts, users potentially won’t have to look at the imagery every day but can instead get an alert when there’s an anomaly.
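As a toy illustration of such an alert (the counts, window, and threshold below are made up, and a production feed would use more robust statistics), flagging anomalies in a daily count series could be as simple as a trailing z-score:

```python
import numpy as np

def count_anomalies(daily_counts, window=14, z_threshold=3.0):
    """Flag days where the object count deviates sharply from a trailing baseline."""
    counts = np.asarray(daily_counts, dtype=float)
    alerts = []
    for day in range(window, len(counts)):
        baseline = counts[day - window:day]
        mean, std = baseline.mean(), baseline.std() + 1e-9
        z = (counts[day] - mean) / std
        if abs(z) > z_threshold:
            alerts.append((day, counts[day], round(z, 1)))
    return alerts

# Example: a stable vessel count with a sudden drop after a storm (made-up numbers).
counts = [40, 42, 41, 39, 40, 43, 41, 40, 42, 41, 40, 39, 41, 42, 40, 12]
print(count_anomalies(counts))  # flags the sharp drop on the final day
```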

To build rich geospatial applications, however, we need to go beyond information from imagery alone. In the context of a natural or manmade disaster, there is a wealth of data sources that could enhance imagery: Automatic Identification System (AIS) data can identify vessels and their routes; real-time traffic data gives a picture of road conditions during a flood; cell phone telemetry can complement road data and show where people are. Fusing information from cell phones, connected cars, Internet of Things sensors, and beyond will allow users of such applications to understand what is happening in the world as it happens. These are the applications that we are enabling our partners and customers to develop.

To learn more, visit our Planet Analytics and Emergency and Disaster Management webpages. We are actively hiring, so if the scale and scope of this work interests you, check out Planet’s Careers page. Like this story? Give it a clap and subscribe to the Planet Stories to get the latest at the intersection of Space and ML.

Originally published at medium.com.
