Radiant Earth Foundation Releases the Benchmark Training Data “LandCoverNet” for Africa

LandCoverNet is an open-access global land cover classification training dataset with satellite image pixels labeled for seven land cover classes.

Radiant Earth
Radiant Earth Insights
Jul 28, 2020

Radiant Earth Foundation is proud to announce the release of “LandCoverNet,” a human-labeled global land cover classification training dataset. Available for download on Radiant MLHub, the open geospatial library, LandCoverNet will enable accurate and regular land cover mapping allowing for timely insights into natural and anthropogenic impacts on the Earth. This release contains data across Africa, which accounts for ~1/5 of the global dataset.

Global land cover maps derived from Earth observations are not new. The European Union’s CORINE land cover map was initiated in 1985, while the United States Geological Survey has been issuing a land cover map based on satellite imagery every five years since 1992. These maps provide reliable information about the Earth’s surface and categorize any land changes that might occur. But the influx of open-access, high-spatial-resolution Earth observations, such as those from the European Space Agency’s Sentinel missions, coupled with improved computing power, has encouraged the development of advanced algorithms. Machine learning models applied to high-resolution remotely sensed imagery can classify land cover more accurately and faster, given the availability of high-quality training data. As a result, applications that extract intelligence on agricultural productivity, urban structures, maritime monitoring, and other insights have emerged in the last decade. These successful efforts underscore the enormous potential of using machine learning to solve global development and humanitarian challenges, including the need for regularly updated land cover maps, which are essential to monitor and measure progress toward several Sustainable Development Goals (SDGs).

LandCoverNet is an annual land cover classification training dataset with labels for the multi-spectral high-quality satellite imagery from Sentinel-2 satellites, covering Africa, Asia, Australia, Europe, North America, and South America. Seven land cover class types are identified: water, natural bare ground, artificial bare ground, woody vegetation, cultivated vegetation, (semi) natural vegetation, and permanent snow/ice. The annual land cover classes are labeled based on 24 scenes of Sentinel-2 for each tile throughout 2018.

To generate the training data, Radiant Earth’s technology team selected 300 geographically diverse tiles of Sentinel-2 imagery spanning all continents to capture the diversity of land covers globally. Next, 30 image chips of 256 x 256 pixels at 10-meter spatial resolution were generated in each tile, resulting in 9,000 global chips of Sentinel-2 L2A observations. This amounts to 589 million pixels for the global land cover classification training dataset (130 million pixels for the Africa portion of the dataset). The team then built machine learning algorithms for each Sentinel-2 tile to generate a “guess land cover label” that was, in turn, independently validated by three different individuals using Sinergise’s Classification App.
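The dataset-size figures quoted above follow from simple arithmetic. The short Python sketch below reproduces them from the numbers stated in this announcement; the variable names are illustrative only, and the Africa chip count and share are rough estimates derived from the rounded "130 million pixels" figure rather than official counts.

```python
# Figures taken from the announcement text.
TILES = 300            # Sentinel-2 tiles selected globally
CHIPS_PER_TILE = 30    # image chips generated in each tile
CHIP_SIDE_PX = 256     # each chip is 256 x 256 pixels
PIXEL_SIZE_M = 10      # spatial resolution in meters

total_chips = TILES * CHIPS_PER_TILE               # 9,000 chips
pixels_per_chip = CHIP_SIDE_PX ** 2                # 65,536 pixels
total_pixels = total_chips * pixels_per_chip       # 589,824,000 pixels
chip_side_km = CHIP_SIDE_PX * PIXEL_SIZE_M / 1000  # ground footprint of one chip side

# Africa portion: the article quotes a rounded value, so the
# numbers derived from it below are estimates, not official counts.
AFRICA_PIXELS_APPROX = 130_000_000
africa_chips_approx = AFRICA_PIXELS_APPROX / pixels_per_chip
africa_share = AFRICA_PIXELS_APPROX / total_pixels

print(f"Global chips: {total_chips:,}")
print(f"Global pixels: {total_pixels:,}")
print(f"Chip footprint: {chip_side_km} km x {chip_side_km} km")
print(f"Approx. Africa chips: {africa_chips_approx:.0f}")
print(f"Approx. Africa share of pixels: {africa_share:.0%}")
```

The computed share of roughly 22% is consistent with the statement that the Africa release accounts for about one fifth of the global dataset.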

Sinergise is a Slovenia-based company that builds large-scale cloud-based geospatial systems for Earth observation products and GIS, among other tools. They provided in-kind technology support to the development of LandCoverNet by adding customized features to their open-source Classification App that allowed our users to parse through Sentinel-2 images across time and then validate the labels. They also deployed the campaign dashboard based on the image chips that the Radiant Earth team provided. TaQadam, a social enterprise that employs refugees in Lebanon and Syria, ran the labeling campaign by deploying their trained users.

This first version of LandCoverNet, which contains image chips across Africa, provides high-quality training data for pixel-wise land cover classification and a consensus score to indicate the uncertainty in human interpretation of each class. Data scientists and practitioners can use LandCoverNet to develop new land cover classification models or validate their own models’ accuracy. Land cover maps created with LandCoverNet can also identify underrepresented areas where more data are needed.
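This announcement does not specify how the consensus score is computed. As an illustration only, the sketch below assumes one plausible reading: for each pixel, the score is the fraction of the three independent validators who agree on the most common label. The function name `consensus` and the example labels are hypothetical and are not taken from the dataset documentation.

```python
from collections import Counter

def consensus(labels):
    """Return (majority_label, agreement_fraction) for one pixel.

    ASSUMPTION: the score is the share of validators who chose the
    most common label. The actual LandCoverNet definition may differ.
    """
    counts = Counter(labels)
    # most_common(1) returns the label with the highest vote count;
    # on a tie, the label seen first in `labels` is returned.
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)

# Hypothetical votes from three validators for two pixels.
unanimous = consensus(["water", "water", "water"])
split = consensus(["woody vegetation", "cultivated vegetation", "woody vegetation"])

print(unanimous)
print(split)
```

Under this assumed definition, a score of 1.0 means all three validators agreed, while lower values flag pixels where human interpretation was less certain and which a model trainer might choose to down-weight or exclude.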

Hamed Alemohammad, chief data scientist of Radiant Earth Foundation who leads the technology team, called LandCoverNet “a benchmark training dataset, which is necessary for developing and validating accurate and scalable classification algorithms across diverse geographies. Our focus on Africa adds to the geodiversity of global land cover models, a feat that only leads to balanced results.”

Schmidt Futures funded the development of LandCoverNet.

“We are incredibly grateful to Schmidt Futures for investing in this project,” said Anne Hale Miglarese, founder and CEO of Radiant Earth Foundation. “Their investment solidified the need to diversify training data geographically, leading Radiant Earth to focus its efforts on advancing the curation and sharing of geospatial training datasets to address complex challenges like food security and climate change through Radiant MLHub.”

“We want to congratulate the Radiant Earth Foundation for the progress that they have made in using satellite imagery and machine learning to tackle global development challenges,” said Thomas Kalil, Chief Innovation Officer for Schmidt Futures. “We hope that other foundations and philanthropists will support projects like LandCoverNet that harness machine learning to achieve the Sustainable Development Goals.”

LandCoverNet is distributed under the Creative Commons Attribution 4.0 license (CC BY 4.0) on Radiant MLHub. You can read the dataset documentation and access the example Jupyter notebook on the Radiant MLHub registry page. Radiant Earth Foundation is planning for the development and release of the data for the rest of the world.
