Querying and Downloading Satellite Images with Google Earth Engine

David Nagy
Published in Project Canopy
4 min read · Jan 16, 2021
Congo Basin custom Project Canopy mosaic, captured with Google Earth Engine

(co-authored by Zhenya Warshavsky)

Searching for, downloading and displaying 2.5 million square kilometers of satellite imagery is no small task. But Project Canopy, a new NGO focusing on evidence-based policymaking for the Congo Basin rainforest, believes that organizations need a comprehensive overview of the rainforest in order to make effective decisions. As the NGO’s data scientists, we have been working over the last nine months to create a prototype of such an overview: a near-real-time logging road detection and notification system, powered by machine learning, and available across the entire Congo Basin rainforest.

Our previous article described how we went about deciding which platform to use in order to (a) query for specific Sentinel-2 satellite image products and (b) actually download those images. We ended up rejecting Google Earth Engine because we could not find a way to easily and reliably accomplish both those tasks. However, after further research, we found out we were mistaken; though somewhat counter-intuitive, there is indeed a way to (relatively) easily both query and download satellite images using Google Earth Engine (Google EE). This short post will explain how.

Querying

While there is no explicit “query” function in the Google EE API, there is a way to replicate it: by filtering an ImageCollection object.

An ImageCollection, as the name implies, is a group of ee.Image objects. While you can directly create a collection from a list of images, it’s also possible to pass in the name of a folder in the Google EE database. When you do that, you’ll get a collection of every image in that folder. Specifically, if you want Sentinel-2 images, create one of the following ee.ImageCollection objects, depending on whether you want L1C or L2A (see our first post for more on this difference):

collection = ee.ImageCollection("COPERNICUS/S2")  # L1C
collection = ee.ImageCollection("COPERNICUS/S2_SR")  # L2A

Of course, you’ll likely not want all the Sentinel-2 images. But after creating the collection, you can then filter it, to drill down to the specific images you want. For example, you can pull out only images in a specific timeframe like so:

start_date = '2020-01-15'  # can be any date
end_date = '2020-02-15'  # can be any date
collection = collection.filterDate(ee.Date(start_date), ee.Date(end_date))

You can filter the collection by a Region of Interest (“ROI”) and get only images that lie within or intersect some polygon like so:

# Have the array of points saved as "roi"
roi_ee = ee.Geometry.Polygon(roi)
collection = collection.filterBounds(roi_ee)
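To make the "array of points" concrete, here is a minimal sketch of one way to build it from a bounding box. The helper name bbox_to_ring and the example coordinates are hypothetical and chosen only for illustration (they are not Project Canopy's actual ROI). The one detail worth noting is that Earth Engine expects each point as [longitude, latitude], in that order.

```python
def bbox_to_ring(min_lon, min_lat, max_lon, max_lat):
    """Return a closed ring of [longitude, latitude] pairs for a bounding box."""
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("min values must be smaller than max values")
    return [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],  # repeat the first point to close the ring
    ]


# Illustrative one-degree box; replace with your own area of interest.
roi = bbox_to_ring(15.0, -2.0, 16.0, -1.0)

# With the earthengine-api package installed and initialized, this list
# is what gets passed to the geometry constructor:
# roi_ee = ee.Geometry.Polygon(roi)
```

For irregular regions, the same list-of-points shape applies; only the number of points changes.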

Finally, you can filter for a specific metadata value by using filterMetadata. (You can also just use the generic “filter” to plug in any filter you’d like.) While this may not be a traditional search/query system, it has the same results in the end: you end up with an ee.ImageCollection object containing all the images that meet your filter criteria. In this way, you can search/query for Sentinel-2 products using Google EE.
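Putting the filters together, a "query" might look roughly like the sketch below. The function name filter_sentinel2, the 20% default threshold, and the example polygon are our own illustrative assumptions; CLOUDY_PIXEL_PERCENTAGE is the cloud-cover property on Sentinel-2 images in the Earth Engine catalog. The main() function needs the earthengine-api package and an authenticated account, so it is defined but not called here.

```python
def filter_sentinel2(collection, roi_geometry, start_date, end_date, max_cloud_pct=20):
    """Chain date, bounds, and metadata filters on an ee.ImageCollection."""
    return (
        collection
        .filterDate(start_date, end_date)  # start is inclusive, end is exclusive
        .filterBounds(roi_geometry)        # keep images intersecting the ROI
        .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", max_cloud_pct)
    )


def main():
    # Imported here so the helper above can be used (and tested) without
    # Earth Engine credentials. Depending on your setup, ee.Initialize()
    # may also need a Cloud project argument.
    import ee

    ee.Initialize()
    # Illustrative polygon only; points are [longitude, latitude].
    roi_ee = ee.Geometry.Polygon(
        [[15.0, -2.0], [16.0, -2.0], [16.0, -1.0], [15.0, -1.0]]
    )
    collection = ee.ImageCollection("COPERNICUS/S2_SR")
    filtered = filter_sentinel2(collection, roi_ee, "2020-01-15", "2020-02-15")
    print("Matching images:", filtered.size().getInfo())
```

Because each filter returns a new collection, the order of the calls does not change which images come back.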

Downloading

While it is possible to (essentially) query images, there is no way to directly download images straight from Google EE to your local machine. However, this is not in fact a huge downside, because you can export images to a Google Cloud Storage bucket. (It’s also possible to export them to Google Drive, which may be preferable if you’re only exporting a few images at a time.) We will discuss details on this in a future article, since some complicated issues arise specifically when you’re trying to export big images or a large number of images, but the basic idea is straightforward.

First, you create an export object with ee.batch.Export.image.toCloudStorage:

# "your-bucket-name" is a placeholder for your own Cloud Storage bucket
export = ee.batch.Export.image.toCloudStorage(image=img, bucket="your-bucket-name")

img, here, is your Google EE Image object. Then, you simply run export.start() to begin the export. You can then print export.status() to see the status of your export, or you can use the following code to get continual updates of your status:

while export.active():
    print(export.status(), end="\r", flush=True)

Alternatively, you can go to the Google Earth Engine Code Editor and click on the “Tasks” tab to see all your current exports.
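If you would rather stay in Python, the export and status steps can be wrapped up along the lines of the sketch below. The helper names build_export_kwargs and wait_for_task, the task name, and the 10 m default scale are illustrative assumptions rather than our production code. The keyword names (description, bucket, fileNamePrefix, region, scale) are parameters of toCloudStorage. The loop pauses between polls and returns the final state so that a failed export does not go unnoticed.

```python
import time


def build_export_kwargs(image, bucket, name, region=None, scale=10):
    """Collect keyword arguments for ee.batch.Export.image.toCloudStorage."""
    kwargs = {
        "image": image,
        "description": name,     # task name shown in the Tasks tab
        "bucket": bucket,        # your own Cloud Storage bucket
        "fileNamePrefix": name,  # object name prefix inside the bucket
        "scale": scale,          # metres per pixel
    }
    if region is not None:
        kwargs["region"] = region  # e.g. the ROI geometry
    return kwargs


def wait_for_task(task, poll_seconds=10, sleep=time.sleep):
    """Poll a task until it is no longer active, then return its final state."""
    while task.active():
        print(task.status().get("state"), end="\r", flush=True)
        sleep(poll_seconds)  # avoid querying the API in a tight loop
    return task.status().get("state")


def main(img, bucket):
    # Needs the earthengine-api package and an initialized session,
    # so this function is defined but not called here.
    import ee

    kwargs = build_export_kwargs(img, bucket, "sentinel2_demo_export")
    task = ee.batch.Export.image.toCloudStorage(**kwargs)
    task.start()
    state = wait_for_task(task)
    if state != "COMPLETED":
        raise RuntimeError(f"Export ended in state {state}: {task.status()}")
```

The sleep argument is only there so the loop can be exercised without actually waiting; in normal use the default is fine.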

Conclusion

By solving both our issues with Google Earth Engine, we were able to benefit from a major advantage: all the image processing takes place on the Google Earth Engine platform, not on your local machine! In our old method, we had to download all the raw products and then save a new image each step of the way as we processed it. With Google Earth Engine, on the other hand, the only thing you need to download is the final image at the end of your processing pipeline. And while Google Earth Engine is hardly the only cloud-based geospatial image processing platform out there, it (a) is free to use and (b) has a large community of active users. Regardless, shifting to cloud-based processing saved us a good deal of up-front infrastructure costs.

In our next article, we will compare and contrast different ways of removing clouds from images, with special focus on the method we ended up going with, a novel one that in practice relies on Google Earth Engine: cloudfree merging. We hope to see you then!
