Deploying the SpaceNet 7 Baseline on AWS

Daniel Hogan
Published in The DownLinQ
6 min read · Sep 25, 2020

Adapted by Daniel Hogan from a post by Adam Van Etten and Nick Weir

Preface: SpaceNet LLC is a nonprofit organization dedicated to accelerating open source, artificial intelligence applied research for geospatial applications, specifically foundational mapping (i.e., building footprint & road network detection). SpaceNet is run in collaboration by co-founder and managing partner CosmiQ Works, co-founder and co-chair Maxar Technologies, and our partners including Amazon Web Services (AWS), Capella Space, Topcoder, IEEE GRSS, the National Geospatial-Intelligence Agency and Planet.

The SpaceNet 7 Challenge centers on a unique time-series dataset, with monthly collects over an approximately-two-year period for more than 100 locations spanning the globe. Challenge participants are asked to identify building footprints in the imagery and, furthermore, to track buildings from month to month revealing the temporal structure of the data. The challenge is currently underway. With a Challenge end date of October 28, now is a great time to sign up!

Deep learning techniques, which have increasingly become the norm for complex tasks like this, require powerful hardware, usually in the form of graphics processing units (GPUs). To help democratize access to the tools needed for the SpaceNet 7 Challenge, this blog post shows step-by-step how to run our baseline (i.e., example) solution online with Amazon Web Services (AWS). No special hardware on the user’s part is required, and the total cost to train the model is around $60. To further democratize the process, the SpaceNet team is providing AWS compute credits to SpaceNet 7 competitors. Specifically, the first 100 competitors to reproduce (or exceed) the score of the open source baseline model will receive a $250 credit, thus allowing a large number of competitors to freely experiment and improve upon their submissions.

I. Introduction

To lower the barrier to entry for SpaceNet 7 participation, the CosmiQ Works team created an Amazon Machine Image (AMI) pre-loaded with the Solaris software suite, SpaceNet 7 baseline algorithm, and the SpaceNet 7 dataset. This post walks through how to use that AMI to train and test the baseline model in the cloud. No hardware beyond an ordinary laptop is required. Challenge participants have the option of using the baseline as a starting point as they set off to develop their own innovative solutions to the SpaceNet 7 Challenge.

II. Loading the AMI

In order to load and prepare the AMI for model training and testing, simply execute the following steps:

1. In a web browser, navigate to:
https://console.aws.amazon.com/ec2

If you don’t have an AWS account, create one as a “Root user” and log in.

2. In the top-right of the page, ensure you are in the “N. Virginia” region.

3. Select “Launch Instance” (orange button)

4. Search for the pre-built AMI

Type “CosmiQ_SpaceNet7_Baseline_v1” in the search bar, click the “Community AMIs” tab, and hit “Select.”

5. Select the appropriate instance

We recommend the p3.2xlarge instance, which includes one NVIDIA V100 GPU and costs $3.06 per hour. Hit “Review and Launch.”
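As a rough sanity check on the "around $60" figure quoted at the top of this post, the arithmetic is simply training hours times the hourly price. The sketch below assumes the ~20 hour default training run described in the training section and the $3.06 hourly rate above; the function name estimate_cost is purely illustrative, and actual AWS pricing may differ from these numbers.

```shell
# Rough cost estimate: hours of use multiplied by the hourly instance price.
# Both inputs are figures quoted in this post, not live AWS pricing.
estimate_cost() {
  # $1 = hours the instance runs, $2 = price in USD per hour
  awk -v h="$1" -v p="$2" 'BEGIN { printf "%.2f\n", h * p }'
}

estimate_cost 20 3.06   # default training run on a p3.2xlarge; prints 61.20
```

Time spent on setup, testing, and copying results also counts toward the bill for as long as the instance is running.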

6. Initiate launch

7. Create a new Key Pair

Create a new key pair (e.g. cosmiq_sn7_baseline), and save to your local machine. Hit “Launch Instances.”

Note: It is possible that AWS may throw an error stating that you have requested more vCPU capacity than allowed (this is an intentional feature put in place to keep users from unwittingly accumulating large bills). In this case, open a ticket for a service limit increase at: https://console.aws.amazon.com/support/home#/case/create.

8. Record the address

The instance is now running! Back at the Instance Dashboard, write down the Public DNS (IPv4) for this instance (of form: ec2-XX-XXX-X-XX.compute-1.amazonaws.com). You will need this later to access the instance.

9. Access the instance via SSH

On your local machine, ssh into the AMI from a terminal. You may need to change permissions of the key (.pem) file first.

chmod 400 /path_to_keys/cosmiq_sn7_baseline.pem
ssh -i "/path_to_keys/cosmiq_sn7_baseline.pem" ubuntu@ec2-XX-XX-XX-XX.compute-1.amazonaws.com
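The key path and host name recur in every ssh and scp command in this post. One optional convenience, which is not part of the original walkthrough, is to store them once as an alias in the OpenSSH client config file. The sketch below is a hypothetical helper: the function name add_ssh_alias and the alias sn7 are illustrative choices, and the default config location assumes a standard OpenSSH client on your local machine.

```shell
# Hypothetical helper (not part of AWS or the AMI): append a host alias
# to an OpenSSH client config file so the key and address are stored once.
add_ssh_alias() {
  alias_name="$1"                    # short name to type, e.g. sn7 (illustrative)
  host="$2"                          # Public DNS recorded in step 8
  key="$3"                           # path to the .pem key pair file
  config="${4:-$HOME/.ssh/config}"   # optional: config file to append to
  mkdir -p "$(dirname "$config")"
  cat >> "$config" <<EOF
Host $alias_name
    HostName $host
    User ubuntu
    IdentityFile $key
EOF
}

# Example call (run on your local machine, with your own DNS name):
# add_ssh_alias sn7 ec2-XX-XX-XX-XX.compute-1.amazonaws.com /path_to_keys/cosmiq_sn7_baseline.pem
```

With such an entry in place, "ssh sn7" should be equivalent to the full ssh command above, and the same alias can stand in for the user@host part of the scp commands.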

III. Train and Test the Baseline Model

The GPU instance is now up and running, so we can train and test the SpaceNet 7 baseline algorithm, which is pre-installed in a Docker container (entitled cqw_sn7_bl) in the AMI. The commands below should be run in the ssh terminal of your instance.

1. Attach the docker container

nvidia-docker start cqw_sn7_bl
nvidia-docker attach cqw_sn7_bl

This puts us in the /root directory within the docker container, where we can execute the train.sh and test.sh scripts.

2. Train the model

By default the script runs for 300 epochs (~20 hours). To modify the training time, simply edit the “epochs” line in the yml/sn7_baseline_train.yml file. The training script first pre-processes the data, then applies a VGG-16 + U-Net deep learning model. Training is launched by simply invoking the train.sh script:

./train.sh /data/train

Training proceeds until all epochs are complete or until the user terminates it (ctrl+c). The best model is saved as training progresses, so early termination still leaves you with the best model trained to date.
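For a shorter run, the "epochs" line mentioned above can be changed in any text editor before launching train.sh. As a scripted alternative, here is a minimal sketch that assumes the line has the form "epochs: 300" (with any indentation) and that sed is available inside the container; neither is confirmed here, so check your copy of the file first. The function name set_epochs is hypothetical.

```shell
# Hypothetical helper: rewrite the integer on the "epochs:" line of a YAML file.
# Assumes a line such as "epochs: 300"; leading indentation is preserved.
set_epochs() {
  n="$1"      # new number of epochs
  file="$2"   # YAML config file to edit
  sed -E "s/^([[:space:]]*epochs:[[:space:]]*)[0-9]+/\1${n}/" "$file" > "$file.tmp" \
    && mv "$file.tmp" "$file"
}

# Example call from /root inside the container:
# set_epochs 100 yml/sn7_baseline_train.yml
```

If training time scales roughly linearly with the number of epochs (an assumption, not a measurement), 100 epochs would take about a third of the ~20 hours quoted for the default 300.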

3. Test the Model

Testing can be executed either when training is complete (Step 2 above), or using the existing weights file models/sn7_baseline/xdxd_final.pth in the docker container. Testing is invoked with the following command (sn7_baseline_predictions.csv is the name of the output file):

./test.sh /data/test_public sn7_baseline_predictions.csv

4. Copy outputs back locally

Inference is now complete, so we now transfer the outputs out of the AMI back to our local machine. First, switch out of the Docker container (ctrl+p ctrl+q) then copy the results from the docker container back to the AMI:

mkdir /home/ubuntu/transfer
docker cp cqw_sn7_bl:/root/sn7_baseline_predictions.csv /home/ubuntu/transfer/
docker cp cqw_sn7_bl:/root/output /home/ubuntu/transfer/
docker cp cqw_sn7_bl:/root/inference_out /home/ubuntu/transfer/

Now copy results back to your local machine (for example, to location: /path_to_results/sn7_baseline_aws/). To just get the CSV file of predictions, the command is:

scp -i "/path_to_keys/cosmiq_sn7_baseline.pem" -r ubuntu@ec2-XX-XX-XX-XX.compute-1.amazonaws.com:/home/ubuntu/transfer/sn7_baseline_predictions.csv /path_to_results/sn7_baseline_aws/

Or, to get all the intermediate and output files (masks, etc.), use:

scp -i "/path_to_keys/cosmiq_sn7_baseline.pem" -r ubuntu@ec2-XX-XX-XX-XX.compute-1.amazonaws.com:/home/ubuntu/transfer /path_to_results/sn7_baseline_aws/

5. Inspect Results

Among the files generated by the SpaceNet 7 baseline are images identifying which pixels are most likely to be parts of buildings.

An image from the public_test data (left) and the model’s assessment of which pixels are most likely to be parts of buildings (right). Applying a threshold to the image at right generates a prediction mask.

The baseline also creates a CSV file with a geometry and a tracking ID number for each building prediction (see below) — this file yields a score of ~0.158 when entered onto the challenge website.

SpaceNet baseline building footprint coordinates in challenge entry format.
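Before uploading, it can be worth a quick look at the CSV that was copied back. The sketch below prints the header row and counts the prediction rows; it assumes one record per line and makes no assumption about the column names. The function name summarize_csv is illustrative.

```shell
# Print the header row and the number of data rows in a CSV file.
# Assumes one record per line (no line breaks inside quoted fields).
summarize_csv() {
  file="$1"
  awk 'NR == 1 { print "header: " $0 }
       NR > 1  { n++ }
       END     { print "rows: " (n + 0) }' "$file"
}

# Example call on your local machine, using the path from the scp step:
# summarize_csv /path_to_results/sn7_baseline_aws/sn7_baseline_predictions.csv
```

A row count of zero, or a header that does not match the challenge entry format, would suggest that testing or the copy step did not complete as expected.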

Side Note

As an aside, the AMI was built using a fork of the original baseline repo. The original repo runs many of the computations within Jupyter Python notebooks, which the fork repackages as Python scripts. But under the hood, both versions do the same computations in the same way.

IV. Conclusion

The SpaceNet 7 Challenge focuses on the difficulties and opportunities of working with time series data. To make this task as approachable as possible, we have previously released a working example solution in the form of the algorithmic baseline. In this post, we’ve released an Amazon Machine Image with everything necessary to use the baseline, along with step-by-step instructions to undertake the deep learning process in the cloud. The baseline and this platform can be an effective way to get started on SpaceNet 7. We encourage the interested reader to register for the challenge, implement this code, and apply for the free AWS compute credits to experiment with and improve upon the baseline model.

Daniel Hogan, PhD, is a senior data scientist at IQT Labs and was a member of CosmiQ Works.