Using Generative Adversarial Networks to Address Scarcity of Geospatial Training Data

Results show that models based on Generative Adversarial Networks outperform Convolutional Neural Networks when classifying land cover in imagery outside of the training dataset.

Radiant Earth Insights · Oct 5, 2020

By Hamed Alemohammad, Executive Director and Chief Data Scientist, and Aditya Kulkarni, former Machine Learning Intern at Radiant Earth Foundation

In many supervised machine learning (ML) applications that use Earth observations (EO), we rely on ground reference data to generate training and validation data. These reference data are the building blocks of those applications and require geographical diversity if the models are to be deployed across various geographies.

Ground reference data collection, however, is a resource-intensive process, and such data are extremely scarce in the remote areas that would benefit most from the use of EO. Therefore, beyond our ongoing efforts to generate and publish high-quality training datasets for these applications, we need to develop innovative techniques to make better use of the limited existing ground reference data.

This year, we were awarded a grant by the Bill & Melinda Gates Foundation as part of the 2019 Grand Challenges Annual Meeting Call to Action to explore the use of Generative Adversarial Networks (GANs) to tackle the scarcity of training data for agricultural monitoring applications. This joint project with our collaborators Ernest Mwebaze from Google AI Accra and Daniel Northrup from Benson Hill aims to use GANs as 1) a replacement for common classification models such as Convolutional Neural Networks (CNNs), and 2) a tool to generate synthetic training datasets.

This two-part blog series introduces our project and initial findings on using GANs with medium-resolution multispectral Sentinel-2 satellite imagery.

Generative Adversarial Networks (GANs)

Introduced by Ian Goodfellow in 2014, GANs are a class of generative models consisting of two networks: a generator and a discriminator. The generator takes noise as input and tries to generate images similar to real ones to “fool” the discriminator. At the same time, the discriminator learns from the real data to detect images produced by the generator as “fake” (Figure 1).

Figure 1 — Architecture of a GAN showing the generator and discriminator models.

During training, the generator and discriminator progressively improve together: the generator learns to produce realistic images similar to the real ones, while the discriminator learns to tell the two apart. Several examples of this technique have shown promising results in learning the patterns in real data (e.g., the headshots on the thispersondoesnotexist.com website are generated by a GAN model).
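
To make the adversarial setup concrete, here is a minimal PyTorch training-loop sketch. It is an illustration rather than our project code: the two toy networks and the `data_loader` (assumed to yield batches of flattened real images) are placeholders.

```python
import torch
import torch.nn as nn

# Toy placeholder networks; real GANs use deep convolutional architectures.
generator = nn.Sequential(
    nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
discriminator = nn.Sequential(
    nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

bce = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for real in data_loader:  # `data_loader` yields (batch, 784) real images (assumed)
    batch = real.size(0)
    noise = torch.randn(batch, 100)

    # Discriminator step: push real images toward label 1, generated toward 0.
    d_opt.zero_grad()
    d_loss = bce(discriminator(real), torch.ones(batch, 1)) + \
             bce(discriminator(generator(noise).detach()), torch.zeros(batch, 1))
    d_loss.backward()
    d_opt.step()

    # Generator step: try to make the discriminator call its outputs "real".
    g_opt.zero_grad()
    g_loss = bce(discriminator(generator(noise)), torch.ones(batch, 1))
    g_loss.backward()
    g_opt.step()
```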

One application of GANs is image-to-image translation, in which we aim to transform an input image into another image with a different distribution. For example, one may use this to change an input image’s color composition or to turn a daylight scene into a nighttime one. A particularly useful application of image-to-image translation is image segmentation. In this case, the target image is a one-band segmentation layer of the input image.
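
In this conditional setting, the generator no longer maps noise to an image; it maps the input image to the target layer, and the discriminator judges pairs of input image and label map. A toy encoder-decoder generator for our setting might look like the following sketch (layer sizes are illustrative, not the architecture we used):

```python
import torch.nn as nn

class SegGenerator(nn.Module):
    """Toy encoder-decoder generator for segmentation-as-translation.

    Maps a 4-band (R, G, B, NIR) 256x256 chip to a 6-class score map.
    Layer sizes are illustrative only.
    """
    def __init__(self, in_bands=4, n_classes=6):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_bands, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, n_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):                      # x: (batch, 4, 256, 256)
        return self.decoder(self.encoder(x))   # (batch, 6, 256, 256)
```

A pix2pix-style discriminator would then score the input chip concatenated channel-wise with either the real or the generated label map.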

While the original image-to-image translation paper demonstrated translating Google basemap imagery into Google Maps tiles, this is the first time we are applying this technique to 10m spatial resolution imagery from Sentinel-2 for a land cover (LC) classification problem.

Study Design

In this first phase of the project, we used Sentinel-2 imagery along with LC labels from the National Land Cover Database (NLCD) in the US to test our model. Since NLCD data is available for the entire US, it allows us to build separate training and test datasets and evaluate our GAN model’s performance against a CNN. Both models are tasked with estimating the LC class of each pixel in a 256 x 256 image, i.e., a segmentation problem. In the next phase of this project, we will use this technique to evaluate GANs in geographically distinct regions across Africa, and for crop type classification.

We created a training dataset from cloud-free Sentinel-2 images captured during May and June 2016 across the continental US and used the 2016 NLCD LC classes as labels. For this experiment, we used only the four 10m bands from Sentinel-2, namely red, green, blue, and near-infrared (NIR). Similarly, we created a test dataset from other regions within the continental US with the same LC classes. In total, ~16K images were included in the training dataset and ~7K in the test dataset.
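
As a rough illustration of the chip-building step, the four 10m bands can be windowed and stacked with rasterio along these lines. The file names are hypothetical, and the real pipeline also involves cloud screening and pairing each chip with its NLCD label:

```python
import numpy as np
import rasterio
from rasterio.windows import Window

# Hypothetical per-band GeoTIFFs for one Sentinel-2 scene (red, green, blue, NIR).
BAND_PATHS = ["B04.tif", "B03.tif", "B02.tif", "B08.tif"]

def read_chip(paths, col_off, row_off, size=256):
    """Read one size x size window from each band and stack into (bands, size, size)."""
    window = Window(col_off, row_off, size, size)
    bands = []
    for path in paths:
        with rasterio.open(path) as src:
            bands.append(src.read(1, window=window))
    return np.stack(bands).astype(np.float32)

chip = read_chip(BAND_PATHS, col_off=0, row_off=0)
print(chip.shape)  # (4, 256, 256)
```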

There are six classes in the dataset: open water, developed, forest, grassland, pasture, and cultivated. Both the training and test data have a class imbalance, but the class distributions of the two datasets are relatively similar. The open water and developed classes have the fewest samples, while forest and cultivated are the most populated.
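
That imbalance is easy to quantify from the label chips. The sketch below also derives inverse-frequency class weights, one common remedy for imbalance; note that `train_labels` is a placeholder, and the post does not state whether weighting was actually applied in our models.

```python
import numpy as np

CLASSES = ["open water", "developed", "forest", "grassland", "pasture", "cultivated"]

def class_frequencies(label_chips):
    """Fraction of pixels per class over a list of (256, 256) integer label arrays."""
    counts = np.zeros(len(CLASSES), dtype=np.int64)
    for chip in label_chips:
        counts += np.bincount(chip.ravel(), minlength=len(CLASSES))
    return counts / counts.sum()

# `train_labels` stands in for the list of NLCD label chips.
freqs = class_frequencies(train_labels)
weights = 1.0 / (freqs + 1e-8)  # inverse-frequency weights (illustrative)
```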

Can a GAN model generalize LC classification to unseen data?

It’s been shown in non-geospatial applications that GANs can generalize better than CNNs. This means that after training each of these models on the training data, GANs tend to have higher accuracy on test data that the model hasn’t seen during training.

To have a fair comparison between the GAN and CNN models, we designed their architectures with a similar number of parameters (44M and 42M, respectively). Both models were trained on the training dataset, and after training the CNN model had a higher F1-score than the GAN for every class except open water, where the two models scored very similarly.
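
As a side note, checking that two PyTorch models have comparable parameter budgets is a one-liner; this generic helper is our illustration, not code from the project:

```python
def count_params(model):
    """Total number of trainable parameters in a PyTorch nn.Module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# e.g., count_params(gan_generator) ~ 44M and count_params(cnn) ~ 42M here
```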

Predicting LC classes on the test data and comparing the models’ F1-scores, however, gives a very different picture. The GAN model performs better than the CNN in four classes: open water, developed, forest, and cultivated. The CNN performs better in the grassland and pasture classes. This result shows that our GAN model generalizes better to unseen data for LC classification. Figure 2 shows three examples of Sentinel-2 imagery with the NLCD labels as well as the GAN and CNN predictions.

Figure 2 — Three examples of LC classification using GAN and CNN. Each row is one 256 x 256 image from Sentinel-2 at 10m resolution. The first column is the RGB image, the second column is the NIR band, the third column shows the true label from NLCD, the fourth column shows the LC classes predicted by the GAN model, and the last column shows the LC classes predicted by the CNN model. (Class colors: red: developed, blue: open water, cyan: pasture, dark green: forest, light green: grassland, brown: cultivated.)
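
For readers who want to run this kind of comparison themselves, per-class F1-scores can be computed with scikit-learn along these lines. This is a sketch: `y_true`, `y_gan`, and `y_cnn` stand for flattened per-pixel labels and predictions over the test set, which we assume are already available.

```python
import numpy as np
from sklearn.metrics import f1_score

CLASSES = ["open water", "developed", "forest", "grassland", "pasture", "cultivated"]

# y_true, y_gan, y_cnn: 1-D arrays of per-pixel class indices (0-5) over the
# test set, assumed to come from running each model on the test chips.
f1_gan = f1_score(y_true, y_gan, labels=np.arange(len(CLASSES)), average=None)
f1_cnn = f1_score(y_true, y_cnn, labels=np.arange(len(CLASSES)), average=None)

for name, g, c in zip(CLASSES, f1_gan, f1_cnn):
    print(f"{name:>12}  GAN: {g:.3f}  CNN: {c:.3f}")
```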

Check out our paper, presented at the AI for Earth Workshop at NeurIPS 2020, for more details.

This research is funded by a grant awarded to Radiant Earth Foundation through the 2019 Grand Challenges Annual Meeting Call-to-Action from the Bill & Melinda Gates Foundation. The findings and conclusions contained within are those of the authors and do not necessarily reflect the positions or policies of the Bill & Melinda Gates Foundation.
