Today, in collaboration with AI.Reverie, we are proud to announce the release of the RarePlanes dataset, research paper, and codebase. RarePlanes is a unique open-source machine learning dataset that incorporates both real and synthetically generated satellite imagery. The dataset focuses on the value of synthetic data in helping computer vision algorithms automatically detect aircraft and their attributes in satellite imagery. Although other synthetic/real combination datasets exist, RarePlanes is the largest openly available very-high-resolution dataset built to test the value of synthetic data from an overhead perspective.
The real portion of the dataset consists of 253 Maxar WorldView-3 satellite scenes spanning 112 locations and 2,142 km² with ~14,700 hand-annotated aircraft. The accompanying synthetic dataset is generated via the novel AI.Reverie simulation platform and features 50,000 synthetic satellite images with ~600,000 aircraft annotations. Both the real and synthetically generated aircraft feature 10 fine-grained attributes: aircraft length, wingspan, FAA wingspan class, wing shape, wing position, propulsion, number of engines, number of vertical stabilizers, presence of canards, and aircraft role. The paper also presents a range of experiments that evaluate the real and synthetic datasets and compare their performance. In doing so, we show the value of synthetic data for the task of detecting and classifying aircraft from an overhead perspective.
The dataset is made available via the AWS Open Data Program, permissively licensed (CC BY-SA 4.0), and can now be downloaded for free. All you need is an AWS account and the AWS CLI installed and configured. Once you’ve done that, simply run the command(s) below to download the datasets to your working directory!
Real (~107 GB):
aws s3 cp --recursive s3://rareplanes-public/real/tarballs/ .
Synthetic (~211 GB):
aws s3 cp --recursive s3://rareplanes-public/synthetic/ .
Model Weights (~4 GB):
aws s3 cp --recursive s3://rareplanes-public/weights/ .
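Since these downloads are large, it can help to preview a transfer before committing disk space and bandwidth. The sketch below wraps the three S3 prefixes listed above in a small helper and uses the AWS CLI's `--dryrun` flag, which prints the operations `aws s3 cp` would perform without copying anything. The function names `rareplanes_prefix` and `rareplanes_preview` are hypothetical and not part of the RarePlanes codebase; treat this as one possible convenience wrapper, not an official tool.

```shell
# Hypothetical helper (not part of RarePlanes): map a subset name to the
# S3 prefix given in this post.
rareplanes_prefix() {
  case "$1" in
    real)      echo "s3://rareplanes-public/real/tarballs/" ;;
    synthetic) echo "s3://rareplanes-public/synthetic/" ;;
    weights)   echo "s3://rareplanes-public/weights/" ;;
    *)         echo "unknown subset: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: preview a download without transferring any data.
# $1 is the subset name; $2 is the destination directory (defaults to ".").
rareplanes_preview() {
  _rp_prefix="$(rareplanes_prefix "$1")" || return 1
  aws s3 cp --recursive --dryrun "$_rp_prefix" "${2:-.}"
}
```

For example, `rareplanes_preview weights ./weights` would list the planned copies for the model weights; dropping `--dryrun` from the final command performs the actual download.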
The paper details the dataset and baseline experiments we conducted and can be read here:
We also provide pre-processing code to work with the dataset, create labels, and define up to 110 custom classes using combinations of the attributes:
The User Guide
Finally, we provide a user guide, along with a full listing of all of the content featured in this blog post, on the CosmiQ Works website:
Although this post represents the end of the runway on the initial RarePlanes research study, we plan to have more great RarePlanes content coming up. Watch the DownLinQ and the skies and you will see some more planes in the future.