Extracting Canopy Height with GEDI Data

Andi Brinn Thomas
Feb 21, 2020


By Andi Thomas, Emil Cherrington, and Kel Markert

The force is with us! The Global Ecosystem Dynamics Investigation (GEDI) instrument launched in 2018 for a 2-year mission, and forest researchers around the world have waited patiently for the data to be released. Until now, it has been relatively difficult to estimate global forest biomass without expensive LiDAR instruments and massive amounts of time in the field capturing ground truth data. GEDI’s high-resolution LiDAR data includes canopy height, which lets us record how tall forests are and, from that, estimate how much biomass and carbon they store.

Figure 1. GEDI lasers observe the 3D structure of Earth. Credit: gedi.umd.edu

On January 22, 2020, GEDI data was released to the public, and a few weeks later, NASA's LP DAAC provided a service that allows data filtering. This is when our team at SERVIR became real GEDI Knights. There was just one problem: to access the precise measurements of forest canopy height, canopy vertical structure, and surface elevation, we needed the data in a usable format. In the released files, the canopy height estimates come bundled with 1,145 separate fields. Some of us also prefer to use shapefiles or GeoTIFFs rather than HDF5, the native file format for GEDI data. And one more thing before we get started: this Algorithm Theoretical Basis Document may be a helpful read for understanding GEDI on a more intimate level.

Figure 2. Here is an example of using the gedifinder to grab data based on a location.

How we processed the data to retrieve canopy height:

Please note you will need Python 3 installed on your computer along with these packages:

● fire

● h5py

● glob (part of the Python standard library, so there is nothing to install)

● tqdm

● numpy

● pandas

● geopandas
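Before running anything, it can save time to confirm that the packages above are actually visible to your Python installation. The snippet below is a minimal sketch of one way to check; the helper name `missing_packages` is our own invention for illustration and is not part of the gedi_to_vector.py script.

```python
from importlib.util import find_spec

# Third-party packages from the list above (glob ships with Python itself).
REQUIRED = ["fire", "h5py", "tqdm", "numpy", "pandas", "geopandas"]


def missing_packages(names):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Install these before running the script: " + ", ".join(missing))
    else:
        print("All required packages are available.")
```

Anything reported as missing needs to be installed with your usual package manager before moving on.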

A. Discover and download GEDI data:

  1. Access: https://lpdaac.usgs.gov/news/release-gedi-finder-web-service/
  2. You will need coordinates that define your region ahead of time

i. Use coordinates for the exact region you need rather than a country-level extent. Keeping the area small will cut down processing time.

  3. Make sure you are signed in to Earthdata in order to save the data
  4. Save the data to a folder on your desktop

i. The download may take a while depending on your internet speed
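If your region is defined by a handful of corner points rather than a ready-made box, a few lines of Python can turn them into the west, south, east, north order used for filtering later in this post. This is only a sketch: the function names are hypothetical, the corner points are taken from the Ghana example further down, and it assumes the region does not cross the antimeridian.

```python
def bounding_box(points):
    """Return [W, S, E, N] for an iterable of (longitude, latitude) pairs."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return [min(lons), min(lats), max(lons), max(lats)]


def as_filter_bounds(bbox):
    """Format a [W, S, E, N] box as a bracketed, comma-separated string."""
    return "[" + ",".join(str(value) for value in bbox) + "]"


# Corner points (lon, lat) matching the Ghana example used in this post.
corners = [(-3.684, 4.6), (2.109, 4.6), (2.109, 11.738), (-3.684, 11.738)]
bbox = bounding_box(corners)
print(as_filter_bounds(bbox))
```

The printed string has the same shape as the bounds in the command-line example in section B.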

B. Open and visualize GEDI data:

  1. Save and open “gedi_to_vector.py” script from https://gist.github.com/KMarkert/c68ccf53260d7b775b836bf2e11e2ec3
  2. From the command line, run: python gedi_to_vector.py <path to directory> --variables [<var1>,<var2>,…,<varN>] --outFormat <extension> --filterBounds [<W>,<S>,<E>,<N>] --verbose

i. For example, using GEDI_02b data: python gedi_to_vector.py /users/abthoma/Desktop/GEDI_files/ --variables [height_bin0] --outFormat .shp --filterBounds [-3.684,4.6,2.109,11.738] --verbose  # Extracting data over Ghana

  3. Note in the script: each file contains 8 beams (separate scans), and the script combines all of them into one output table
  4. You should now have tables with latitude, longitude, and canopy height
  5. Remove No Data values in your favorite GIS software, or perhaps with Python
  6. Create a feature class from the x,y table in your favorite GIS software

i. This step will generate a shapefile (.shp) for you
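To make the beam-merging and No Data steps above more concrete, here is a minimal sketch of the idea in plain Python. It is not the code from gedi_to_vector.py. It assumes that each beam lives in a top-level group whose name starts with "BEAM", that -9999 marks No Data, and that the dataset names "lat", "lon", and "height" stand in for the real ones; please check all three against the product documentation. A small dictionary stands in for an opened file so the example runs without any downloads.

```python
NODATA = -9999  # assumed fill value; confirm against the product documentation


def beams_to_rows(gedi_file, lat_name, lon_name, height_name):
    """Collect one row per shot from every BEAM* group into a single list."""
    rows = []
    for beam_name in gedi_file.keys():
        if not beam_name.startswith("BEAM"):
            continue  # skip groups that are not beams, such as metadata
        beam = gedi_file[beam_name]
        lats = beam[lat_name][:]
        lons = beam[lon_name][:]
        heights = beam[height_name][:]
        for lat, lon, height in zip(lats, lons, heights):
            rows.append({
                "beam": beam_name,
                "latitude": float(lat),
                "longitude": float(lon),
                "height": float(height),
            })
    return rows


def keep_valid(rows, bounds, nodata=NODATA):
    """Drop No Data rows and rows outside the [W, S, E, N] bounds."""
    west, south, east, north = bounds
    return [
        row for row in rows
        if row["height"] != nodata
        and west <= row["longitude"] <= east
        and south <= row["latitude"] <= north
    ]


# Stand-in for an opened file: two made-up beams plus a non-beam group.
fake_file = {
    "METADATA": {},
    "BEAM0000": {"lat": [5.0, 6.0], "lon": [-1.0, -2.0], "height": [12.5, -9999]},
    "BEAM0001": {"lat": [20.0], "lon": [-1.5], "height": [30.0]},
}

rows = beams_to_rows(fake_file, "lat", "lon", "height")
valid = keep_valid(rows, [-3.684, 4.6, 2.109, 11.738])
print(len(rows), "rows in total,", len(valid), "kept after filtering")
```

Because h5py files and groups behave like dictionaries, the same functions should work on a real file opened with `h5py.File(path, "r")`, as long as you pass the actual dataset names for latitude, longitude, and height.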

Time to analyze! (See Figure 3 below for an example of GEDI data over Ghana.)

Figure 3. Final result for canopy height extraction over Ghana.

Why is this important to SERVIR Global?

SERVIR helps developing countries use satellite data for better environmental decision-making. One of our focus themes is Land Cover and Land Use Change. All five SERVIR focus regions are studying forests, and with our new Applied Sciences Teams (ASTs) coming on board, there is a continued need to update and enhance our methods for land cover mapping. We have 20 AST projects, 7 of which focus on land cover and biomass estimation. In addition, these LiDAR datasets are extremely relevant to our joint efforts with SilvaCarbon to enhance the capacities of countries to estimate biomass and monitor forests. GEDI data will be useful to complement methods available in the joint SERVIR-SilvaCarbon SAR Handbook effort, particularly to leverage SAR datasets to estimate Forest Stand Height over large areas.

Before GEDI and ICESat-2 data were available, Hansen et al. (2013) released a tree cover dataset that researchers have used widely for more than 6 years as a tool for quantifying global forest change. The dataset shows users how closed or open the canopy is, but it does not provide the tree height information we need. Tree height is difficult to sample without field work, and most researchers do not have the resources needed to travel to remote forests. Thankfully, with the new GEDI dataset open to the public, we now have a new tool to monitor our forests as the climate changes. May the force be with all of our fellow forest researchers.

Special thanks to Kel Markert for your timely writing of the GEDI to vector script. Without you, this process would have taken way more time.



Andi Brinn Thomas

My opinions are my own and not the views of my employer. She/Her/Hers