Car Detection Over Large Areas With YOLT and Zanzibar Open Aerial Imagery

Alongside the SpaceNet dataset and challenge series, which aim to democratize satellite imagery data and encourage the development of targeted algorithms, one of the goals of the CosmiQ team is to support the open source mapping community. To that end, CosmiQ is participating in the FOSS4G conference in Dar es Salaam, Tanzania. FOSS4G is the largest annual global gathering of developers and users of open source geospatial software.

In this post we explore how well YOLT, one of the open source tools developed by CosmiQ, performs on areas of interest to FOSS4G. Specifically, we analyze 7.5 cm resolution OpenAerialMap data collected via the Zanzibar Mapping Initiative over the island of Zanzibar, just off the coast from Dar es Salaam.

Model Training

As a test of the robustness of our algorithm, we apply a YOLT model trained on a dataset very different from the OpenAerialMap data over Zanzibar: 15 cm COWC aerial imagery collected over cities in Canada (Toronto), New Zealand (Selwyn) and Germany (Potsdam). Recall that YOLT is an object detection algorithm designed to rapidly localize objects over large areas in overhead images of arbitrary size. See our previous post for further training details.
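To make the "images of arbitrary size" point concrete, here is a minimal sketch of the kind of sliding-window tiling such a detector relies on: a large image is cut into overlapping windows, each small enough for the network. The function name `tile_windows` is hypothetical, and the 416-pixel tile and 15% overlap are illustrative assumptions rather than settings confirmed for this experiment.

```python
from typing import Iterator, List, Tuple


def tile_windows(
    width: int,
    height: int,
    tile: int = 416,        # assumed window size in pixels (illustrative)
    overlap: float = 0.15,  # assumed fractional overlap between windows
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x0, y0, x1, y1) pixel windows that cover a width x height image."""
    stride = max(1, int(tile * (1.0 - overlap)))

    def starts(extent: int) -> List[int]:
        # A single window suffices when the image is smaller than one tile.
        if extent <= tile:
            return [0]
        positions = list(range(0, extent - tile, stride))
        # Add a final window flush with the far edge so no pixels are missed.
        positions.append(extent - tile)
        return positions

    for y0 in starts(height):
        for x0 in starts(width):
            yield x0, y0, min(x0 + tile, width), min(y0 + tile, height)


if __name__ == "__main__":
    for window in tile_windows(1000, 500):
        print(window)
```

Each window would be run through the detector independently, with the resulting boxes shifted back into full-image coordinates and overlapping duplicates merged (for example with non-maximum suppression).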


The OpenAerialMap test dataset was collected with a different sensor, at a different resolution (7.5 cm vs 15 cm for COWC), and over a very different geographic region than our training set. We select a detection threshold designed to minimize false negatives (cars that go undetected), and as a result there are a few false positives in the images. Unfortunately, we lack ground truth labels for cars in Zanzibar and so cannot compute rigorous performance metrics.
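The two knobs mentioned above, resolution mismatch and detection threshold, can be sketched in a few lines. This is an illustration only: the `Detection` class, `resample_factor`, and `filter_detections` are hypothetical names, the confidence values are made up, and whether the test imagery was actually resampled to the training resolution is an assumption rather than something stated here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    x0: float
    y0: float
    x1: float
    y1: float
    confidence: float  # detector score in [0, 1]


def resample_factor(source_gsd_cm: float, target_gsd_cm: float) -> float:
    """Scale factor to apply to image dimensions so that imagery at
    source_gsd_cm per pixel matches target_gsd_cm per pixel."""
    return source_gsd_cm / target_gsd_cm


def filter_detections(dets: List[Detection], threshold: float) -> List[Detection]:
    """Keep detections scoring at or above the confidence threshold."""
    return [d for d in dets if d.confidence >= threshold]


if __name__ == "__main__":
    # 7.5 cm test imagery vs 15 cm training imagery: shrink by half to match.
    print(resample_factor(7.5, 15.0))

    # Made-up detections, purely to show the effect of the threshold.
    dets = [
        Detection(10, 10, 40, 30, 0.90),
        Detection(50, 60, 80, 85, 0.40),
        Detection(5, 90, 30, 110, 0.15),
    ]
    for threshold in (0.5, 0.2, 0.1):
        print(threshold, len(filter_detections(dets, threshold)))
```

Lowering the threshold keeps more candidate boxes, which reduces missed cars at the cost of admitting more false positives, the trade-off described above.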

Performance and Conclusions

Inference runs at 9 square kilometers per minute on a single GPU. At this inference rate, running YOLT on an Amazon EC2 P3 instance over the entire ~1000 square miles (~2,600 square kilometers) of Zanzibar takes less than half an hour. Alternatively, running on a lightweight CPU-only machine is possible, albeit far slower, at a rate of ~1 square kilometer per hour per CPU.
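As a back-of-the-envelope check on these figures, the sketch below converts the area to square kilometers and divides by the quoted rates. The helper `minutes_to_cover` and the idea of splitting work evenly across `n_workers` devices are assumptions for illustration. Note that the single-GPU rate alone works out to several hours for the full area, so the sub-half-hour estimate presumably relies on more throughput than one GPU at 9 square kilometers per minute (for instance, several GPUs working in parallel); the device count is not stated here.

```python
import math

# 1 mile = 1.609344 km exactly, so 1 square mile is about 2.589988 square km.
SQ_KM_PER_SQ_MILE = 1.609344 ** 2


def minutes_to_cover(area_sq_km: float, rate_sq_km_per_min: float, n_workers: int = 1) -> float:
    """Minutes to process an area, assuming work splits evenly across workers."""
    return area_sq_km / (rate_sq_km_per_min * n_workers)


area_sq_km = 1000 * SQ_KM_PER_SQ_MILE                 # roughly 2,590 sq km
single_gpu_minutes = minutes_to_cover(area_sq_km, 9.0)  # roughly 288 minutes
gpus_for_half_hour = math.ceil(single_gpu_minutes / 30.0)

# CPU-only: ~1 sq km per hour per CPU, so CPU-hours roughly equal the area.
cpu_hours = area_sq_km / 1.0

if __name__ == "__main__":
    print(f"area: {area_sq_km:.0f} sq km")
    print(f"single GPU: {single_gpu_minutes / 60:.1f} hours")
    print(f"GPUs needed for a 30 minute run: {gpus_for_half_hour}")
    print(f"CPU-hours at 1 sq km/hour: {cpu_hours:.0f}")
```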

We look forward to further work to quantify performance, but for now the images below appear encouraging with very few missed cars (false negatives) and relatively few false positives. As we continue our algorithmic mapping efforts, future posts will also explore the application of building and road network detection algorithms trained on SpaceNet satellite data to aerial imagery datasets.

Figure 1. Zoom of detected cars (cyan boxes) in northwest Zanzibar.
Figure 2. 500 x 400 meter region of Zanzibar City.
Figure 3. Large image covering 1.3 square kilometers of Zanzibar City. There are ~2000 detected cars in this image.