(co-authored by Zhenya Warshavsky)
Searching for, downloading, and displaying 2.5 million square kilometers of satellite imagery is no small task. But for Project Canopy, a new NGO focusing on evidence-based policymaking for the Congo Basin rainforest, it is a necessary one: organizations need a comprehensive overview of the rainforest in order to make effective decisions. As the NGO’s data scientists, we have been working over the last nine months to create a prototype of such an overview: a near-real-time logging road detection and notification system, powered by machine learning, and available across the entire Congo Basin rainforest.
Our previous article described how we went about deciding which platform to use in order to (a) query for specific Sentinel-2 satellite image products and (b) actually download those images. We ended up rejecting Google Earth Engine because we could not find a way to easily and reliably accomplish both those tasks. However, after further research, we found out we were mistaken; though somewhat counter-intuitive, there is indeed a way to (relatively) easily both query and download satellite images using Google Earth Engine (Google EE). This short post will explain how.
While there is no explicit “query” function in the Google EE API, there is a way to replicate it: by filtering an ee.ImageCollection.
An Image Collection, as the name implies, is a group of ee.Image objects. While you can create a collection directly from a list of images, it’s also possible to pass in the ID of a dataset in the Google EE catalog, which behaves like a folder of images. When you do that, you’ll get a collection of every image in that dataset. Specifically, if you want Sentinel-2 images, create one of the following ee.ImageCollection objects, depending on whether you want L1C or L2A (see our first post for more on this difference):
collection = ee.ImageCollection("COPERNICUS/S2")  # L1C
collection = ee.ImageCollection("COPERNICUS/S2_SR")  # L2A
Of course, you’ll likely not want all the Sentinel-2 images. But after creating the collection, you can then filter it, to drill down to the specific images you want. For example, you can pull out only images in a specific timeframe like so:
start_date = '2020-01-15'  # can be any date
end_date = '2020-02-15'  # can be any date
collection = collection.filterDate(ee.Date(start_date), ee.Date(end_date))
You can filter the collection by a Region of Interest (“ROI”) and get only images that lie within or intersect some polygon like so:
# Have the array of points saved as "roi"
roi_ee = ee.Geometry.Polygon(roi)
collection = collection.filterBounds(roi_ee)
Finally, you can filter for a specific metadata value by using filterMetadata. (You can also use the generic filter method to plug in any filter you’d like.) While this may not be a traditional search/query system, the result is the same: you end up with an ee.ImageCollection object containing all the images that meet your filter criteria. In this way, you can search/query for Sentinel-2 products using Google EE.
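Putting the three filters together, a query might look like the sketch below. The helper name filter_sentinel2, the 20% cloud threshold, and the small rectangle used as the ROI are illustrative assumptions, not part of the Google EE API; CLOUDY_PIXEL_PERCENTAGE is the cloud-cover property attached to Sentinel-2 images in the Google EE catalog. Running example_query requires the earthengine-api package and an authenticated account.

```python
def filter_sentinel2(collection, roi_geometry, start_date, end_date, max_cloud_pct=20):
    """Chain the date, bounds, and metadata filters on an image collection.

    `collection` is expected to be an ee.ImageCollection and `roi_geometry`
    an ee.Geometry. The 20% default cloud threshold is an arbitrary example.
    """
    return (
        collection
        # filterDate accepts date strings; the end date is exclusive.
        .filterDate(start_date, end_date)
        # Keep only images that intersect the region of interest.
        .filterBounds(roi_geometry)
        # Keep only images whose cloud-cover property is below the threshold.
        .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", max_cloud_pct)
    )


def example_query():
    """Hypothetical usage: count the matching L2A images over a small box."""
    import ee  # imported here so the helper above has no hard dependency

    ee.Initialize()  # depending on your setup, this may need a project argument

    # Hypothetical ROI: one ring of [longitude, latitude] pairs.
    roi = [[[18.0, 0.0], [18.5, 0.0], [18.5, 0.5], [18.0, 0.5], [18.0, 0.0]]]
    collection = ee.ImageCollection("COPERNICUS/S2_SR")
    filtered = filter_sentinel2(
        collection, ee.Geometry.Polygon(roi), "2020-01-15", "2020-02-15"
    )
    # size() is evaluated on Google's servers; getInfo() fetches the number.
    return filtered.size().getInfo()
```

Because every filter returns a new collection, the calls can be chained in any order; nothing is actually computed until you request a result, such as with getInfo().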
While it is possible to (essentially) query images, there is no straightforward way to download full images from Google EE directly to your local machine. However, this is not in fact a huge downside, because you can export images to a Google Cloud Storage bucket in your Google Cloud account. (It’s also possible to export them to Google Drive, which may be preferable if you’re only exporting a few images at a time.) We will discuss the details in a future article, since some complicated issues arise when you’re trying to export big images or a large number of images, but the basic idea is straightforward.
First, you create an export object with:
export = ee.batch.Export.image.toCloudStorage(image=img, bucket='your-bucket-name')  # the bucket name is a placeholder
img, here, is your Google EE Image object. Then, you simply run export.start() to begin the export. You can then print export.status() to see the status of your export, or you can use the following code to get continual updates of your status:
while export.active():
    print(export.status(), end="\r", flush=True)
Alternatively, you can go to the Google Earth Engine Code Editor and click on the “Tasks” tab to see all your current exports.
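For a slightly fuller picture, here is a minimal sketch of starting an export and waiting for it to finish. The function names, the task description, and the 10-second polling interval are illustrative assumptions; bucket, prefix, and region are values you supply yourself. The pause between status checks is our addition, so the loop does not query the server on every iteration.

```python
import time


def start_export(image, bucket, prefix, region, scale=10):
    """Start a Cloud Storage export for an ee.Image and return the task."""
    import ee  # imported here; requires earthengine-api and ee.Initialize()

    task = ee.batch.Export.image.toCloudStorage(
        image=image,
        description="canopy_export",  # hypothetical task name
        bucket=bucket,                # name of your own Cloud Storage bucket
        fileNamePrefix=prefix,        # object name prefix inside the bucket
        region=region,                # ee.Geometry bounding the export
        scale=scale,                  # meters per pixel
    )
    task.start()
    return task


def wait_for_export(task, poll_seconds=10, sleep=time.sleep):
    """Print the task state until the task is no longer active.

    Works with any object exposing active() and status(), as export tasks do.
    Returns the final status dictionary.
    """
    while task.active():
        print(task.status().get("state"), end="\r", flush=True)
        sleep(poll_seconds)
    final_status = task.status()
    print(final_status.get("state"))
    return final_status
```

You would call start_export with an image and then pass the returned task to wait_for_export. A final state of COMPLETED means the file should be in your bucket; a state such as FAILED means the export did not finish and the status dictionary is worth inspecting.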
By solving both our issues with Google Earth Engine, we were able to benefit from a major advantage: all the image processing takes place on the Google Earth Engine platform, not on your local machine! In our old method, we had to download all the raw products and then save a new image each step of the way as we processed it. With Google Earth Engine, on the other hand, the only thing you need to download is the final image at the end of your processing pipeline. And while Google Earth Engine is hardly the only cloud-based geospatial image processing platform out there, it (a) is free to use and (b) has a large community of active users. Regardless, shifting to cloud-based processing saved us a good deal of up-front infrastructure costs.
In our next article, we will compare and contrast different ways of removing clouds from images, with special focus on the method we ended up going with, a novel one that in practice relies on Google Earth Engine: cloudfree merging. We hope to see you then!