Scaling Satellite Imagery Analysis for Site Viability on Google Cloud

Published in Contino Engineering · 5 min read · Jul 28, 2021

This is the first of a two-part blog post series I’m writing with my colleague Simon Darr, so please read on and stay tuned for part two.

Flood and moisture analysis of wind farm pads using NDWI derived from Sentinel-2 imagery.

Scaling satellite imagery analysis comes in two flavours: scaling to multiple use cases and scaling to navigate billions of pixels. But why would you use satellite imagery to begin with?

The availability of satellite images has changed over the last several years. More satellites are in orbit, which means more frequent revisits as well as more high-resolution sources and spectral bands to choose from. The cost of acquiring satellite imagery has also decreased thanks to smaller satellites, reusable rockets, and the general evolution of the space industry from a government-funded program to a commercial enterprise. In short, this data source has become more accessible, more voluminous, and more meaningful to a wider range of use cases.

With this in mind, specialised tools are needed… but are they really? We certainly challenge this point. Specialist tools like QGIS should be used when needed, but they are not needed as often as one might think. Moreover, they can become blockers to sharing insights beyond the specialists who uncover them, and they can neither easily aggregate temporal data nor handle large volumes.

We put together a pattern and demo that primarily leverages commodity tools like BigQuery and Looker to address, for example, site viability based on historical flooding.

Baseline model

Site viability of any asset is a challenging, multi-faceted endeavour. One key area for consideration is flooding. Determining the propensity for a site to flood, or where on a site is likely to flood, may aid in improving the construction, operation, financing, insuring and leasing of a site for industrial assets.

While topographic maps aid in assessing where flood zones are, they do not necessarily answer the question of where rainfall actually accumulated, or even persisted, above and below ground. So, while topographic maps are useful for geographic information system (GIS) specialists to derive insights that are passed along to construction teams and business decision-makers, they are still limited in their scope and utility. In contrast, remote sensing can be invaluable in answering questions that traditional maps can’t.

A baseline model that conveys an understanding of flood risk through satellite imagery is the Normalised Difference Water Index (NDWI) of a given area of interest (AOI).

NDWI analysis over a three-year period during the summer months, showing increased moisture in 2018.

NDWI combines two spectral bands, green and near-infrared (NIR), to infer water content. When NDWI is calculated, each pixel within an image takes a value between -1.0 and 1.0, and higher values indicate more water content. For example, values above 0.3 would indicate open water (e.g. large lakes and oceans, generally where the water is 20m or deeper), while lower values can be used to express water content in vegetation and even soil.
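To make the calculation concrete, here is a minimal sketch of the commonly used NDWI formula, (green - NIR) / (green + NIR), applied pixel by pixel. The reflectance values are made up for illustration, the function names are our own, and treating a zero denominator as 0.0 is an assumption rather than a standard. In practice you would run this over whole raster bands with an array library or, as in our demo, in BigQuery; for Sentinel-2 the 10m green and NIR bands are B3 and B8.

```python
def ndwi(green: float, nir: float) -> float:
    """NDWI for a single pixel: (green - nir) / (green + nir)."""
    total = green + nir
    if total == 0:
        # Assumption: treat pixels with no signal in either band as neutral.
        return 0.0
    return (green - nir) / total


def ndwi_grid(green_band, nir_band):
    """Apply NDWI to two equally sized 2D grids of reflectance values."""
    return [
        [ndwi(g, n) for g, n in zip(green_row, nir_row)]
        for green_row, nir_row in zip(green_band, nir_band)
    ]


# Threshold for open water taken from the text above.
OPEN_WATER_THRESHOLD = 0.3

# Illustrative 2x2 reflectance values (not real Sentinel-2 data).
green = [[0.10, 0.08], [0.06, 0.05]]
nir = [[0.02, 0.04], [0.30, 0.05]]

grid = ndwi_grid(green, nir)
water_mask = [[value > OPEN_WATER_THRESHOLD for value in row] for row in grid]
```

In this toy example the top row (high green, low NIR) is flagged as open water, while the bottom-left pixel, with its strong NIR response typical of vegetation, comes out strongly negative.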

Improving the baseline

In the baseline model, we used Sentinel-2 data that is publicly available in a Google Cloud bucket maintained by Google. The resolution of these raster images is 10m per pixel for the green and NIR bands, and the revisit is about every 5 days. This is by no means the remote sensing industry’s highest resolution or highest revisit rate.

For example, Planet can provide the same spectral bands at a higher resolution of 3–5m per pixel with a daily revisit rate. Improving the baseline in this way adds value because finer and more frequent observations support better insights.
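A quick back-of-the-envelope sketch shows what these numbers mean for data volume. It uses only the nominal resolutions and revisit rates quoted above, takes 3m as the best case for Planet, and ignores cloud cover, overlapping tiles and missed acquisitions, so treat the results as upper bounds per spectral band. The function names are our own.

```python
def pixels_per_km2(resolution_m: float) -> float:
    """Number of pixels covering one square kilometre at a given resolution."""
    return (1000.0 / resolution_m) ** 2


def observations_per_year(revisit_days: float) -> float:
    """Nominal number of revisits in a year for a given revisit interval."""
    return 365.0 / revisit_days


def pixel_observations_per_km2_year(resolution_m: float, revisit_days: float) -> float:
    """Pixel values per km2 per year, per spectral band (nominal upper bound)."""
    return pixels_per_km2(resolution_m) * observations_per_year(revisit_days)


# Sentinel-2 baseline: 10m pixels, roughly every 5 days.
sentinel = pixel_observations_per_km2_year(resolution_m=10, revisit_days=5)

# Higher-resolution daily source, best case of the 3-5m range.
daily_3m = pixel_observations_per_km2_year(resolution_m=3, revisit_days=1)

print(f"Sentinel-2: {sentinel:,.0f} pixel observations per km2 per year")
print(f"3m daily:   {daily_3m:,.0f} pixel observations per km2 per year")
```

For the Sentinel-2 baseline this works out to 10,000 pixels per square kilometre and 73 revisits a year, or 730,000 pixel observations per band. The 3m daily case is roughly 40 million, which is why the pipeline has to scale along with the data source.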

Example of a PlanetScope product (left) alongside Google Earth imagery (right). Source: Blurry Satellite Images of Palestine and Israel Make Rebuilding Harder

Another lever for improving the baseline is calibration. The definition of ‘flooding’ can be fluid (pun alert!) in the context of remote sensing. For example, water turbidity can blur the boundary between land and water, and soil inundated by flooding or heavy rainfall can be a costly problem even when it is not strictly defined as flooding. With this in mind, there is a strong need for calibration, propensity scoring or additional metrics to enable an intelligent and automated site viability process.

This requires combined GIS, data science and data engineering smarts.
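As one illustration of what a simple propensity score could look like, the sketch below computes, for each site, the fraction of cloud-free observations in which NDWI exceeded a threshold. This is not the method from our demo. The site names, the NDWI values and the use of `None` for cloud-masked dates are all hypothetical, and the 0.3 threshold is just the open-water value mentioned earlier; calibrating that threshold against ground truth is exactly the kind of work described above.

```python
def wet_frequency(ndwi_series, threshold):
    """Fraction of valid observations where NDWI exceeds the threshold.

    None entries stand for dates with no usable observation (an assumption
    made here to represent cloud-masked pixels). Returns None if there are
    no valid observations at all.
    """
    valid = [value for value in ndwi_series if value is not None]
    if not valid:
        return None
    return sum(value > threshold for value in valid) / len(valid)


# Hypothetical NDWI time series for two sites, one value per acquisition date.
series_by_site = {
    "pad_a": [0.35, 0.10, None, 0.40, -0.20],
    "pad_b": [-0.30, -0.25, -0.10, None, -0.40],
}

scores = {
    site: wet_frequency(values, threshold=0.3)
    for site, values in series_by_site.items()
}
```

Here `pad_a` is above the threshold in two of its four valid observations and `pad_b` in none. A score like this is easy to aggregate over years or seasons, and lowering the threshold would shift it from detecting standing water towards detecting saturated soil.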

Beyond the baseline

It goes without saying, but is still worth saying, that there are many use cases beyond this one. Site viability might include something like vegetation encroachment assessment. Similar spectral analysis may be used for crop monitoring and yield prediction, and even for environmental accountability. Object detection, which is separate from basic spectral analysis, can be used for predictive maintenance and more. If we choose to empower data scientists through non-specialist tools, this further broadens the possibilities.

Example of object detection with remote sensing data.

Moreover, data sources of varying resolution can be daisy-chained together to build more robust monitoring systems through techniques like tip-and-cue. And spectral satellite data is not the only tool in the shed: synthetic aperture radar (SAR) and other sensors offer yet more important data sources.

Scaling the data pipeline

As the volume of remote sensing data grows, so does the number of potential use cases, each with its own nuances and challenges. Scaling to new and multiple use cases is becoming more of a reality. Moreover, it is further enabled by the ability to scale data pipelines to handle billions of pixels with minimal effort.

Part two, the follow-up blog post by my colleague Simon Darr, will discuss how we processed Sentinel-2 satellite raster images with Google Cloud’s Cloud Build and BigQuery, and how we visualised the output in Looker (a common, non-GIS-specialist business intelligence tool). Serverless and commodity pipelines and tools such as these help to handle the scale of this data source while ensuring reusability across various user roles.

Stay tuned for the details!

Want to know how we can help you identify and implement a satellite imagery analytics use case? Let’s get in touch.

Written by Byron Allen