Down to Earth with AI Platform

Sep 16, 2019

By Nicholas Clinton, Developer Advocate, and Christopher Brown, Software Engineer, Google Earth Engine

Editor’s note: Earth Engine is now integrated with Google Vertex AI, as described in this 👉 newer post!

Last year, Google Earth Engine started down the path of integration with modern, neural-net powered machine learning by adding export and ingest of TensorFlow’s TFRecord data interchange format. This year, we’re announcing a new integration with Google Cloud’s AI Platform to connect your Earth Engine data directly to your TensorFlow models in real time. Predictions are no longer bottlenecked by exports and intermediate file formats, and scaling just became easier than ever. In this post, we’ll walk through a real world use case leveraging both the analytical capabilities of Earth Engine and the predictive power of TensorFlow.

The Problem

How much of the earth has been paved? Questions like that are easier to tackle with the new Earth Engine + AI tools. Here we'll talk about a specific example (impervious surface), but there are many other environmental phenomena we might like to map with the same method. The earth is (rather) big, so the satellite imagery in the Earth Engine data catalog is a good place to start. Manually marking up all those images could take a while, so how could we automate this process? As it turns out, getting information from satellite imagery is an ongoing challenge in the geosciences.

Humans do a decent job because we see context: spatial patterns that signify forest, desert, water, city, etc. However, to do that at scale (e.g., nationally or globally), we need machines to help. We'd like to train a machine learning (ML) algorithm to extract information from imagery, not just from the spectral values in a pixel, but from spatial patterns of pixels, i.e. the spatial context.

This becomes a problem for traditional machine learning algorithms (think decision trees, K-NN, etc.) because of the curse of dimensionality. If we have an image with many bands (if b is the number of image bands, then b >> 3) and we want the machine to "see" big patches of pixels, like 256x256, then the number of input features quickly becomes too large: even nine bands over a 256x256 patch is 9 × 256 × 256 = 589,824 raw inputs per sample. Furthermore, the spatial structure of the inputs is still not explicitly considered by the learning algorithm. Fully convolutional neural networks (FCNNs) are designed to handle these issues.

Can you tell whether a pixel is part of a road by looking at it in isolation? A patch of pixels provides critical spatial context that a single pixel cannot.

What is an FCNN?

An FCNN is a deep learning algorithm specifically designed for images, where the spatial configuration of pixels might be relevant to determining what's in an image. FCNNs learn from patches of data instead of single pixels, which enables them to identify spatial patterns in addition to spectral patterns. As such, training FCNNs requires patch data. Obtaining this patch-based training data can be difficult. Often it is generated by human image interpreters, but that is a time- and labor-intensive process. Fortunately, the USA has a program to produce high-quality national maps for free. These maps, collectively called the National Land Cover Database (NLCD), are well-characterized and accurate, benefitting from years of effort. Could we teach an ML algorithm to learn the patterns in these data in order to make predictions in other countries? Yes!

What Google tools will be used?

Fortunately, Google has all the tools we need to test this approach:

- Earth Engine, for the training imagery and labels, and for visualizing predictions
- TensorFlow, for building and training the FCNN
- AI Platform, for training and then hosting the model
- Cloud Storage, for the exported training data and the saved model
- Colab notebooks, to tie the workflow together

I'll describe each of these components in detail, but first: what are we predicting?

What is impervious surface?

One of the layers in the NLCD is human-built impervious surface percent. Think of this as the percentage of a pixel that's been covered by concrete, pavement, buildings, roads, etc. This is an important indicator of development, related to the measures of land consumption and disturbance described in the UN Sustainable Development Goals. Our objective here is to use the NLCD dataset, which we assume is an accurate representation of impervious surface in the USA, to train a model to predict this same variable globally. Because there is a lot of spatial structure in these data (roads, cities, and developments), in theory the FCNN will be able to extract the patterns useful for prediction.

Data from Earth Engine

The NLCD and Landsat 8 surface reflectance data that will be used for model training and prediction are already in the Earth Engine public data catalog. The impervious surface data are scaled from percent to [0, 1]. Although the NLCD dataset creators use all manner of fancy variables to create their map, we only use prediction features that will be available globally. Specifically, we use nine bands from a cloud-free composite of Landsat 8 surface reflectance data. Seven bands are reflectance (unitless, in [0, 1]). The other two are thermal bands: land surface temperature scaled to [0, 1] for temperatures between 0 and 100 °C. The most recent NLCD impervious surface data are from 2016, so the cloud-free composite is generated from imagery acquired in 2015–2017. Each training point is a 256x256 patch of scaled impervious surface stacked with the nine Landsat composite bands.
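For concreteness, here's a minimal sketch of building that stack with the Earth Engine Python API. The asset IDs and band scalings follow the 2019-era Collection 1 catalog, and the cloud-masking details are illustrative assumptions rather than a transcript of the example notebook:

```python
import ee
ee.Initialize()

def mask_clouds(image):
    """Mask clouds and shadows using the Collection 1 pixel_qa band."""
    qa = image.select('pixel_qa')
    # Bit 3 is cloud shadow; bit 5 is cloud.
    mask = (qa.bitwiseAnd(1 << 3).eq(0)
            .And(qa.bitwiseAnd(1 << 5).eq(0)))
    return image.updateMask(mask)

# Landsat 8 surface reflectance, 2015-2017, cloud-masked and composited.
l8sr = (ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')
        .filterDate('2015-01-01', '2017-12-31')
        .map(mask_clouds)
        .median())

# Seven optical bands: stored as reflectance * 10000, so rescale to [0, 1].
optical = l8sr.select(['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7']).divide(10000)

# Two thermal bands: stored as Kelvin * 10; rescale so 0-100 C maps to [0, 1].
thermal = l8sr.select(['B10', 'B11']).divide(10).subtract(273.15).divide(100)

# NLCD 2016 impervious percentage, rescaled to [0, 1] as the label.
impervious = ee.Image('USGS/NLCD/NLCD2016').select('impervious').divide(100)

# The stack that gets sampled into 256x256 training patches.
image_stack = ee.Image.cat([optical, thermal, impervious]).float()
```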

Where to get these patches? For this demo, I just sampled 256x256 (possibly overlapping) patches of pixels from some of my favorite cities. This is all pretty arbitrary, but the places represent a variety of ecoregions, climates, land covers, and development typologies. In total, I used 20,000 such patches for training and 8,000 for evaluation. Earth Engine makes it easy to export many patches to TFRecord files in Google Cloud Storage, which can be read by TensorFlow directly. See this example notebook for the complete sampling workflow.
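The export step looks roughly like the sketch below, assuming the image_stack from earlier. The region geometry, bucket, and file prefix are placeholders; the trick of attaching a fixed 256x256 neighborhood to each sampled point is the standard Earth Engine pattern for patch export:

```python
KERNEL_SIZE = 256
BANDS = ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B10', 'B11']
LABEL = 'impervious'

# Attach a 256x256 pixel neighborhood to every band of each sampled point.
weights = ee.List.repeat(ee.List.repeat(1, KERNEL_SIZE), KERNEL_SIZE)
kernel = ee.Kernel.fixed(KERNEL_SIZE, KERNEL_SIZE, weights)
arrays = image_stack.neighborhoodToArray(kernel)

# Sample patches within one city's bounding geometry at 30 m resolution.
city = ee.Geometry.Rectangle([-122.5, 37.6, -122.3, 37.9])  # placeholder
sample = arrays.sample(
    region=city, scale=30, numPixels=100, seed=1, tileScale=8)

# Export the patches as TFRecord to Cloud Storage for TensorFlow to read.
task = ee.batch.Export.table.toCloudStorage(
    collection=sample,
    description='training_patch_export',
    bucket='your-bucket',
    fileNamePrefix='impervious/training_patches',
    fileFormat='TFRecord',
    selectors=BANDS + [LABEL])
task.start()
```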

The FCNN model

Think of the model as a function that maps images to images. The model I used was a variant of U-Net, which can be tweaked to our use case without changing the network architecture. Specifically, I changed the output to a single node with a sigmoid activation function and used an RMS error cost function, which is appropriate for regression problems when our labels are bounded by, and often equal to, 0 and 1. The U-Net is implemented in TensorFlow using the Keras API and detailed in the example notebook.
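To give a feel for the shape of the model, here's a much-simplified Keras sketch. The notebook's network is deeper, and the layer widths here are illustrative assumptions; the single-node sigmoid output and RMS error setup match the description above:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    """Two 3x3 convolutions, the basic U-Net building block."""
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    return layers.Conv2D(filters, 3, padding='same', activation='relu')(x)

def build_unet(size=256, bands=9):
    inputs = layers.Input(shape=(size, size, bands))
    # Encoder: convolutions with downsampling, keeping skip connections.
    c1 = conv_block(inputs, 32)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 64)
    p2 = layers.MaxPooling2D()(c2)
    # Bottleneck.
    c3 = conv_block(p2, 128)
    # Decoder: upsample and concatenate the skip connections.
    u2 = layers.UpSampling2D()(c3)
    c4 = conv_block(layers.concatenate([u2, c2]), 64)
    u1 = layers.UpSampling2D()(c4)
    c5 = conv_block(layers.concatenate([u1, c1]), 32)
    # Single output node with sigmoid activation for bounded regression.
    outputs = layers.Conv2D(1, 1, activation='sigmoid')(c5)
    model = Model(inputs, outputs)
    # Minimizing MSE minimizes RMSE; track RMSE explicitly as a metric.
    model.compile(optimizer='adam', loss='mse',
                  metrics=[tf.keras.metrics.RootMeanSquaredError()])
    return model

model = build_unet()
```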

Training the model

With a relatively complex model and a largish (we could always use more!) training data set, it takes a long time to train. AI Platform provides a service for training such models in the background. The training data generated in Earth Engine, the code to read and parse those data, and the FCNN model code are all that's needed to submit a training job to AI Platform. It takes about a day to train this model. Once the training is complete, you have a serialized version of the trained model sitting in a Cloud Storage bucket. This example notebook contains all the code necessary to train the model on AI Platform.
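The reading-and-parsing piece is worth seeing: each exported example is a record of fixed-size band arrays. A minimal sketch, assuming the band and label names used earlier and a placeholder bucket path:

```python
import tensorflow as tf

BANDS = ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B10', 'B11']
LABEL = 'impervious'
PATCH_SHAPE = [256, 256]

# Every band and the label arrive as fixed-size float arrays per example.
FEATURES = {name: tf.io.FixedLenFeature(PATCH_SHAPE, dtype=tf.float32)
            for name in BANDS + [LABEL]}

def parse_example(proto):
    parsed = tf.io.parse_single_example(proto, FEATURES)
    # Stack the inputs to (256, 256, 9); keep the label as (256, 256, 1).
    features = tf.stack([parsed[b] for b in BANDS], axis=-1)
    label = tf.expand_dims(parsed[LABEL], axis=-1)
    return features, label

# Earth Engine writes gzipped TFRecords; the glob pattern is a placeholder.
files = tf.io.gfile.glob('gs://your-bucket/impervious/training_patches*')
dataset = (tf.data.TFRecordDataset(files, compression_type='GZIP')
           .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
           .shuffle(512)
           .batch(16)
           .repeat())
```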

Hosting the model

Once you’ve got that trained model sitting in Cloud Storage, the real fun begins. We want to aim a firehose of pixels from Earth Engine at the model. Previously, you had to export imagery, perform inference, write the result, and import it back to Earth Engine. That is a painful process for large areas. The new way: host the model on AI Platform, then point Earth Engine at the hosted model. That new functionality is represented in the API by ee.Model.fromAiPlatformPredictor. The new ee.Model package means you can get predictions from your TensorFlow model interactively, anywhere there’s input imagery, using either the Code Editor or the Folium library in Colab. The example notebook contains a complete example.
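Connecting to the hosted model looks roughly like this. The project, model, and version names are placeholders, and the tile sizes and output band specification are assumptions that must match how the model was saved for hosting:

```python
# Sketch of pointing Earth Engine at a model hosted on AI Platform.
model = ee.Model.fromAiPlatformPredictor(
    projectName='your-project',
    modelName='impervious_model',
    version='v1',
    inputTileSize=[144, 144],
    inputOverlapSize=[8, 8],
    proj=ee.Projection('EPSG:4326').atScale(30),
    fixInputProj=True,
    outputBands={'impervious': {'type': ee.PixelType.float()}})

# Run interactive inference over the nine-band composite; the hosted model
# typically expects an array-valued input, hence toArray().
predictions = model.predictImage(image_stack.select(BANDS).toArray())
```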

What’s great is that now you can go to places you care about to see how the model does. For example:

[Predictions in Bangkok, Kinshasa, São Paulo, Bangalore, London, and Beijing.]

Use your own models

The example model I present here is for demonstration purposes only. It hasn't been optimized in any serious way (though AI Platform does offer hyperparameter tuning for that purpose). It hasn't had a rigorous accuracy assessment, so it's not obvious how valid the predictions are anywhere there are no better reference data. The good news is that the framework is general, can be used right out of the box with your own data, and takes a lot of the pain out of scaling your TensorFlow models to make global predictions. There's a comprehensive set of docs that illustrates the integration of Earth Engine with TensorFlow and AI Platform. Happy coding!
