Monitoring Inland Surface Water Area using Google Earth Engine & Python

Kavyajeet Bora
8 min readMay 23, 2024

--

Introduction

In this article, we’ll explore the process of developing a powerful tool to monitor inland surface water areas using the Google Earth Engine (GEE) platform and Python. By leveraging geospatial data and advanced analysis techniques, we’ll unlock insights into water dynamics, contributing to environmental research and sustainable resource management.

Outline

We’ll create two essential components to visualize and analyse the inland surface water area over time:

Time-Series Plot:

  • We’ll generate a time-series plot that showcases the variation in total water surface area (measured in km2) across different years.
  • This plot will allow us to observe trends, seasonal patterns, and any significant changes in surface water over time.

GIF Animation:

  • To enhance the visual impact, we’ll create a GIF animation.
  • The animation will depict the dynamic shifts in surface water over the years, providing a captivating way to explore the data.

By combining these elements, your tutorial will empower readers to understand and interpret the temporal dynamics of inland water bodies. Let’s get started! 🌊📈🎥.

Tools Used

For this comprehensive tutorial, we will use Google Colab Notebook. It provides a cloud-based Jupyter notebook environment, allowing seamless collaboration, access to powerful hardware resources, and integration with Google Drive. Apart from that we will be using the following python library (which are pre-installed):

  1. Google Earth Engine Python API: It allows users to access, manipulate, and visualize geospatial data. With this API, you can explore the Earth Engine Data Catalog, handle collections, create plots and maps, and export results as GeoTIFFs1. It’s a powerful tool for working with large-scale geospatial datasets and conducting analyses related to climate change, weather prediction, and more.
  2. Geemap: It is a Python package for interactive geospatial analysis and visualization with Google Earth Engine (GEE). It enables users to analyze and visualize GEE datasets interactively within a Jupyter-based environment. Geemap is particularly useful for students, researchers, and existing GEE users transitioning from the GEE JavaScript API to the Python API

Note: All the above libraries area preinstalled on a google colab notebook instance so no need to install any additional packages.

Without any further delays, launch a new Google Colab instance and begin writing your code 🚀🔥

Prerequisites

Before you dive into this tutorial, you will require an account in google earth engine. If you don’t have any account, you can signup for free by following these steps:

You may need to create a new project if you don’t have any. Here is a step by step guide:

Set up your Earth Engine enabled Cloud Project | Google Earth Engine | Google for Developers

Import Python Packages

As a first step, we will import all the necessary packages:

import geemap
import ee ## google earth engine python api
import matplotlib.pyplot as plt
import pandas as pd
import subprocess

ee.Authenticate()
ee.Initialize(project='YOUR-GOOGLE-PROJECT-NAME')

Note: Before you can make requests to Earth Engine through a client library, you must authenticate and use the resultant credentials to initialize the Earth Engine client. Read more on: Authentication and Initialization

Area of Interest

You can either upload a geometry in GeoJSON format or directly choose the desired area of interest using this map tool:

Keene Map Tool

Create the polygon by placing points using a right-click of your mouse. Once you’ve digitized the last point, close the polygon as demonstrated above. After creating the polygon, copy the GeoJSON text as shown and paste it into a variable in your Colab notebook 🌐🖱️📏. Later convert this geojson data to earth engine geometry type (ee.Geometry) as shown:

geojson  = {
"coordinates": [
[
[
77.1200409,
11.5324541
],
[
76.9941101,
11.5006165
],
[
76.9701462,
11.4243672
],
[
77.0055084,
11.3198347
],
[
77.1529998,
11.4261642
],
[
77.1200409,
11.5324541
]
]
],
"type": "Polygon"
}
## Convert GeoJSON to ee.Geometry
geometry = ee.Geometry(geojson)

This geometric shape will serve as the clipping boundary for the raster image that we’ll download in the upcoming section.

Download the Data

To map the surface water we will download the “global surface water data” from google earth engine catalogue

JRC Global Surface Water Data: This dataset contains maps of the location and temporal distribution of surface water from 1984 to 2021 and provides statistics on the extent and change of those water surfaces.

To download the data for our area of interest:

image_collection = ee.ImageCollection("JRC/GSW1_4/YearlyHistory")\
.filter(ee.Filter.bounds(geometry))

This dataset has a “waterClass” property that classifies a pixel whether it is a surface water or not. It has 4 classes:

JRC Yearly Water Classification History, v1.4 available on Earth Engine Data Catalog

For this scenario, to classify a pixel to be water, we will consider both class 2 and 3 to be as water. So wherever we find water based on this criterion, we will set the pixel value to 1 and rest will be set to 0. To mask the image we will use the following function:

def maskSurfaceWater(image):
'''set the pixel value to 1 wherever water is found and rest will be 0'''
new_image = image.eq(2).Or(image.eq(3))
return new_image.copyProperties(image, image.propertyNames())

To apply the masking to every image in the collection, we will map this function to the image collection we previously downloaded:

## mask the area which are not surface water
masked_image_collection = image_collection.map(maskSurfaceWater)

Once the surface water pixels are classified we will visualize the results.

Visualize the results on interactive map

For visualizing the surface water data, we will use the geemap python library:

## Choose the first image of the year 2000 for visualizing the results
image_2000 = masked_image_collection.filter(ee.Filter.eq('year', 2000)).first()
isWaterImage = image_2000.selfMask()

Map = geemap.Map()
visParams = {'palette': ['blue']}
Map.addLayer(isWaterImage.clip(geometry), visParams)
Map.centerObject(geometry, zoom=11)
Map

Calculate the total surface water area

We will calculate the total surface water area in km² for each image in the image collection. There might be few images with no data which we will ignore it in later stage. Once we have calculate the area for each image, we will store the area value as a property in each image. Here is the code to calculate the area of each image:

def getWaterSurfaceArea(image):

areaImage = image.multiply(ee.Image.pixelArea())
totalArea = areaImage.reduceRegion(
reducer = ee.Reducer.sum(),
geometry = geometry,
scale=30,
maxPixels = 1e10
)

totalArea = ee.Number(totalArea.get('waterClass')).divide(1e6).round()

return image.set('area_km2', totalArea)

Map this function to calculate the area in each image:

## calculate the area of each image
areaImages = masked_image_collection.map(getWaterSurfaceArea)

Plot the timeseries

Once the area is calculated, we will now extract the area and the year values from each image to plot the time series. To extract the values, I have created a function:

def extract_time_series(image_collection, stat='area_km2'):
'''
Extract timeseries values, the area values and corresponding the year
'''

image_list = image_collection.getInfo()['features']
properties = [img['properties'] for img in image_list]

xs, ys = [], []
for prop in properties:
x,y = prop['year'], prop[stat]
xs.append(x), ys.append(y)

return xs, ys

Extract the values from the image collection and remove the invalid data points (if any). Since there were some images with no data, the function has area as 0 for those image, we will remove those images further in our analysis:

## Extract the timeseries data from image collection
xs,ys = extract_time_series(areaImages)
## Converting the data to a dataframe
df = pd.DataFrame(zip(xs,ys), columns=['year','area_km2'])

## Removing the null area values
df = df[df['area_km2']>0]

After extracting the area and year values, we will plot the timeseries using matplotlib. Here you can use any other python visualization library like seaborn, plotly etc:

def plot_time_series(x,y):
'''
Plot the time series given the x - year, y - area for example
and return a matplotlib figure
'''

fig,ax = plt.subplots(figsize=(15,3))
ax.plot(x,y)
ax.set_xlabel("Year")
ax.set_ylabel('Surface Water Area (km2)')
ax.set_title("Total Surface Water Area")
ax.set_ylim(0,max(y)*1.3)
ax.grid(axis = 'x')
return fig

Plot the timeseries:

## plotting
fig = plot_time_series(df['year'],df['area_km2'])

Export the plot as image:

fig.savefig('water_surface_area.png', bbox_inches='tight')

Create an animation and export as GIF

To create an animation from an image collection, we will use the function:ee.ImageCollection.getVideoThumbnail()

## Remove images with 0 area if any
image_collection = areaImages.filter(ee.Filter.gt('area_km2',0))

## Visualization parameters
visParams = {
'bands': 'waterClass',
'palette': ['blue']
}

## Create the image visuals
images = image_collection.map(lambda image: image.visualize(min=0, max=1, palette=['black','blue']).selfMask())

## Define GIF visualization parameters.
gifParams = {
'region': geometry,
'dimensions': 600,
'framesPerSecond': 1
}

## Download the gif
url = images.getVideoThumbURL(gifParams)
subprocess.run(["wget", url, "-O", "surface_water.gif"])

This function will return a gif:

While the outcome appears satisfactory, there is room for improvement. For example, labelling each image with the year it was taken will help to provide a better understanding when monitoring the surface water. To label the images, we can use the geemap function:geemap.add_text_to_gif

To do this, a list of text with each image’s label on it is needed. Keep in mind that the text sequence’s length must match the length of the collection of images. The code to create labelled gifs is provided here:

text_sequence=[f"Year: {year}" for year in df.year]
new_file_name = "new_surface_water.gif"

geemap.add_text_to_gif(
"surface_water.gif",
new_file_name,
xy=("3%", "5%"),
text_sequence=text_sequence,
font_size=30,
font_color="#ffffff",
duration=1000 ## in milisecond
)

This will generate this output:

Conclusion

In this article we have learnt how we can develope a monitoring tool to visualize the surface water area for any given area. Here we use the JRC global surface water dataset available on google earth engine data catalogue where the dataset contains pixels classified as either surface water or no water. Alternatively we can also use remote sensing indices like NDWI to detech surface water. Here is a notebook showing the results using NDWI. This method is not very accurate in detecting surface water but there are more advanced indices that are more accurate. Here is a list of those indices: Water spectral indices.

Here is the link to the github repository: monitoring_water_surface_area

You can also run the notebook on 🖥️google colab

Please click the “clap” button to show your appreciation for this article if you found it to be helpful.

References

  1. Spatial Thoughts: Surface Water Mapping
  2. geemap: Adding Text to GIFs generated from google earth engine

--

--