Object Detection in Satellite Imagery, a Low Overhead Approach, Part II

Adam Van Etten

Follow

Published in

The DownLinQ

5 min readSep 1, 2016

--

In this post we show improved results for object detection in congested regions via a novel method for collapsing overlapping detections using heading information. We also demonstrate that for satellite imagery object detection, rotated bounding boxes have advantages over rectangles oriented in the cardinal directions.

The promise of detecting and enumerating objects of interest over large areas is one of the primary drivers of interest in satellite imagery analytics. Detecting small objects in cluttered environments over broad swaths is a time consuming task, particularly given the ongoing growth of imagery data. In the previous post, we showed how to perform object detection on satellite imagery with just a laptop and open source software by combining Canny edge detector pre-filters with HOG feature descriptors, random forest classifiers, and sliding windows. We found that performance in open-water localization is quite good when square non-max suppression is used to combine overlapping detections, though congested regions remain a challenge.

1. Object Detection Pipeline

Recall from Part I that our object detection pipeline combines Canny edge detector pre-filters with HOG feature descriptors, random forest classifiers, and sliding windows. This process takes < 30 seconds for the image shown in Figure 1.

Figure 1. Identical to Figure 8 from Part I. Sliding window outputs with the HOG + 72 binary random forest classifiers for area of interest 1 (AOI1). Detections with at least 90% confidence are plotted. There are a total of 951 detections, many of which are overlapping due to the high overlap of the sliding window. There are a high number of false positives along docks, piers, and shoreline, though very few false negatives (Imagery Courtesy of DigitalGlobe).

Square non-max suppression collapses some overlapping detections in open water, but works poorly in congested regions (see Figure 2).

2. Performance Evaluation

We can combine the output of non-max suppression shown in Figure 2 with the ground truth labels of Figure 1 in Part I to evaluate the performance of our object detection methodology. We compute and plot three categories: false positives, false negatives, and true positives. We define a true positive as having a Jaccard index (also known as intersection over union) of greater than 0.25. A Jaccard index of 0.5 is often used as the threshold for a correct detection, though as in Equation 5 of ImageNet we select a lower threshold since we are dealing with very small objects.

3. Congested Region Overlap Suppression

Figure 3 illustrates the poor performance in congested regions of standard rectangular bounding boxes oriented in cardinal directions. In order to improve results, we take advantage of the heading information provided by our classifier and knowledge of object aspect ratio to create an improved method for overlap suppression using rotated rectangular bounding boxes. This method greatly improves localization in crowded regions. Computing rotated rectangular overlap suppression for the 951 detections takes 3.3 seconds.

Figure 4. Zoomed examples of rotated rectangular overlap suppression applied to Figure 1. Rectangles are oriented along boat heading. Localizations in harbor are much improved from Figures 2 and 3 (Imagery Courtesy of DigitalGlobe).

As a check of our results in Figure 5, we apply the object detection pipeline to three more areas of interest with hand-labelled ground truth. Results are varied, and described in figure captions.

Figure 6. AOI2 localization results, applied to boats in open water. Detections with at least 90% confidence are plotted. The only false negatives (yellow) are boats smaller than 10m. Ground truth is in blue, green denotes true positive predictions, and red shows false positives. For 38 ground truth boxes and 36 proposed boxes we achieve precision of 0.97 and recall of 0.92 (Imagery Courtesy of DigitalGlobe).

Figure 9. Object detection performance for each of the four areas of interest.

9. Conclusions

In this post we showed results of combining Canny edge detector pre-filters with HOG feature descriptors, random forest classifiers, and sliding windows to perform object detection on satellite imagery using only a laptop and open source software packages. We utilize the heading information provided by our classifier to introduce a new technique for rotated rectangular overlap suppression that greatly improves results in crowded regions.

Searching for boats of length [140, 100, 83, 66, 38, 22, 14, 10] meters, the entire classification pipeline takes < 30 seconds for our test image, translating to ~15 minutes for a full 8x8 kilometer DigitalGlobe image on a single CPU. This run-time could be greatly reduced by looking only for larger boats. For example, searching for boats of length greater than 20m takes only ~3 minutes for a full DigitalGlobe image, a marked decrease from 15 minutes.

The detection pipeline performs very well in open water on boats of similar size to the training set median size (AOI2: precision=0.97, recall > 0.92). Open water detection is less impressive for larger boats in choppy seas due to the lower contrast between object and background and fewer training images at this scale (AOI3: precision=0.70, recall=0.54); one could improve performance by relaxing detection thresholds and only looking for boats greater than 50m in length. Performance in harbor is quite high (AOI4: precision=0.84, recall=0.87), with the number of proposed items within 5% of ground truth.

For low background images the HOG + sliding window approach appears a very compelling option, particularly given its speed and ability to elucidate dense regions with closely spaced objects. Forthcoming posts will explore neural network based approaches to object detection in satellite imagery. Though neural networks and deep learning have enjoyed great success as of late, few neural network models are optimized to localize small, densely packed objects in large images. It will be interesting to see how advanced deep learning techniques compare to the low computational overhead approach proposed here.

Object Detection in Satellite Imagery, a Low Overhead Approach, Part II

1. Object Detection Pipeline

2. Performance Evaluation

3. Congested Region Overlap Suppression

9. Conclusions

Written by Adam Van Etten