Cloud Score+ in Action: Land Cover Mapping in Ecuador

Published in Google Earth and Earth Engine, Mar 27, 2024 (9 min read)

By Andréa P. Nicolau, Geospatial Data Scientist, Spatial Informatics Group, Earth Engine Google Developer Expert

Monitoring the Earth’s surface has always depended on the availability of high-quality, cloud-free imagery. Now imagine trying to map and monitor land cover and land use in some of the world’s cloudiest spots. Nightmare, right? But fear not: with the recent launch of Cloud Score+ for Sentinel-2, the clouds are parting to make way for clearer composites. In this post, we’ll dive into a case study that shines a light on how Cloud Score+ is revolutionizing land cover mapping in Imbabura, Ecuador.

Study area: Imbabura, Ecuador

The beautiful province of Imbabura, located in the northern Andes of Ecuador (Figure 1), is known throughout the country as the Province of the Lakes due to its abundance of lakes, lagoons, and water resources. Building upon this fame, the entire province was designated a UNESCO Global Geopark in 2019 for its unique geological features, including volcanic complexes, geological faults, and mines, as well as its rich natural, cultural, and heritage assets. It is currently the only geopark in the country. Imbabura has a large Indigenous population, so a large share of its residents speak Quichua, including authorities in the provincial government.

Figure 1: Imbabura Province, Ecuador, and some highlighted areas.

The Decentralized Autonomous Government (GAD, in Spanish) of Imbabura has long relied on moderate-resolution, optical-based land use land cover (LULC) maps produced by the national government to measure and monitor the province’s landscape. Yet getting a clear view of Imbabura is more challenging than one might think. Like many equatorial countries, Ecuador faces persistent cloud cover throughout the year (Figure 2), which greatly reduces the number of usable observations from satellite imagery. Undetected clouds and shadows have also led to misclassified LULC classes in some areas of the existing national maps. All of this has caused problems for land and resource management in certain areas of the province, where the lack of information hinders the implementation of programs and projects that are in line with the territorial reality.

Figure 2: One year of imagery (2019) from an area of interest in Imbabura (79.1599 W, 0.3299 N).

Land cover mapping in Imbabura

Seeking a solution, GAD Imbabura has teamed up with EcoCiencia and Spatial Informatics Group through a collaboration with SERVIR-Amazonia to leverage the cloud-computing power of Earth Engine and improve existing capacities to measure and monitor the province’s land cover and land use for forest management, risk management, and governance. The pilot piece of the project was to produce a 2019 LULC map using the same class schema as the national maps (Figure 3), employing a traditional Random Forest model to serve as a baseline. The typology adopted by the national authorities includes the following LULC classes (translated from Spanish):

Forest
Natural (Water bodies)
Shrublands
Croplands and Pasturelands
Moorlands
Urban areas (buildings)
Bareland
Forest plantations
Infrastructure (e.g. asphalt)

Figure 3: 2019 Land Use Land Cover Classification of Imbabura (the final output of the process described in the following sections).

A cloud cover conundrum

Seeing through the clouds?

The initial idea for mitigating the cloud cover issue was to use Synthetic Aperture Radar (SAR) data, given SAR’s ability to penetrate clouds. But because Imbabura also has highly variable topography, with some areas above 4,000 m of elevation, we knew that the Sentinel-1 GRD data available in Google Earth Engine, even combined with speckle filters and terrain normalization algorithms (Vollrath et al., 2020; Hoekman & Reiche, 2015), could not completely solve the problem. Because SAR sensors are side-looking, images of mountainous regions show many geometric distortions from layover, foreshortening, and shadowing. After our pre-processing, the resulting images presented more gaps, shadows, and other artifacts than we anticipated.

Cloud Score+ to the rescue!

Luckily, we got early access to the new Cloud Score+ QA dataset for Sentinel-2 (read more about it here) and could test it out for our area of interest. The first time we laid eyes on a 2019 Cloud Score+ composite of the province, generated using the default threshold value of 0.6 for the cs band, we were shocked to see how clear it was! Compared to other composites we had created using the Sentinel-2 Quality Assessment (QA60) band and the s2cloudless algorithm, the Cloud Score+ output was clearly the big winner (Figure 4). For one of the landmarks of the province, the Cotacachi volcano, only Cloud Score+ was able to retrieve usable pixels. Both the QA60 band and s2cloudless masked out pixels over this important landscape, where the provincial government has to monitor croplands and forest plantations in the volcano’s vegetated surroundings (Figure 3).

Figure 4: 2019 composites (QA60 band, s2cloudless, Cloud Score+). The zoomed-in area is over the Cotacachi volcano. Note the differences in pixel quality and masking across the three methods, with Cloud Score+ providing the clearest views.

Classification improvements

Interested in understanding the impact of the cloud masking approach on the classification results, we compared the three masking approaches (QA60 band, s2cloudless, and Cloud Score+ with a 0.4 cs band threshold) by running our mapping workflow (Figure 5) and swapping only the Sentinel-2 composite used. The workflow is a simple Random Forest approach using multisensor data, including local Digital Terrain Models, and local reference data provided by the Ecuadorian Ministry of Agriculture. The reference data was split into 80% for training and 20% for testing. We compared overall, user’s, and producer’s accuracy across the three Sentinel-2 composites to quantify how much the choice of cloud masking approach changes model performance.

Figure 5: Mapping workflow. Cloud masking was done using either QA60 band, s2cloudless, or Cloud Score+.
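To make the 80/20 split concrete, here is a minimal sketch in plain Python. It assumes a simple seeded random split; the post does not say whether the actual split was random, stratified by class, or done inside Earth Engine, and the function name and sample records are hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_reference(
    samples: Sequence[T], train_fraction: float = 0.8, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle samples reproducibly and split them into (train, test)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(samples)
    # Use a local random generator so the global random state is untouched.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical reference points: (point id, LULC class label).
reference = [(i, "Forest" if i % 2 else "Bareland") for i in range(100)]
train, test = split_reference(reference)
print(len(train), len(test))
```

Fixing the seed keeps the same training and testing points across runs, which matters when the only thing meant to change between experiments is the Sentinel-2 composite.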

Comparing accuracy results, we noticed a 2-percentage-point improvement in overall accuracy between the QA60 band composite (71%) and the CS+ composite (73%), and a 1-point improvement between the s2cloudless composite (72%) and the CS+ composite. The user’s and producer’s accuracy results are summarized in Table 1. In green are the results where the Cloud Score+ output performed better than both other outputs; in yellow, where it performed better than at least one other output; in red, where it performed worse than the other two; and in black, where there was no difference between the three. Most noticeable is the improvement of 5% or more in producer’s accuracy for the bareland class. In general, just by swapping your cloud masking approach, you can improve the quality of your classification.

These results are based solely on testing the model against reference data provided by the local government “as is”, without any quality control or quality assessment protocol. This year, we plan to conduct an independent accuracy assessment with a selected sampling design, visual interpretation of high-resolution imagery, and a field campaign to assess what is actually on the ground. This will give us a better understanding of the “real” accuracy of each map, and perhaps of why we see the differences in Table 1 for specific classes.

Table 1: User’s and producer’s accuracy results. Red: the Cloud Score+ output performed worse than the other two outputs. Yellow: it performed better than at least one of the other two outputs. Green: it performed better than both other outputs. Black: no difference between the three outputs.
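For readers less familiar with the three accuracy measures compared above, the sketch below shows how they are conventionally derived from a confusion matrix. The class subset and all counts are hypothetical and are not the values behind Table 1; the function name is also just illustrative.

```python
from typing import Dict, List


def accuracy_metrics(matrix: List[List[int]], classes: List[str]) -> Dict[str, object]:
    """Compute overall, producer's, and user's accuracy.

    matrix[i][j] is the number of test samples whose reference class is
    classes[i] and whose mapped (predicted) class is classes[j].
    """
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    producers: Dict[str, float] = {}
    users: Dict[str, float] = {}
    for i, name in enumerate(classes):
        reference_total = sum(matrix[i])                   # row sum
        mapped_total = sum(matrix[r][i] for r in range(n))  # column sum
        # Producer's accuracy: share of reference samples mapped correctly.
        producers[name] = matrix[i][i] / reference_total if reference_total else float("nan")
        # User's accuracy: share of mapped samples that match the reference.
        users[name] = matrix[i][i] / mapped_total if mapped_total else float("nan")
    return {"overall": correct / total, "producers": producers, "users": users}


# Hypothetical confusion matrix for three of the nine LULC classes.
classes = ["Forest", "Croplands and Pasturelands", "Bareland"]
matrix = [
    [45, 5, 0],
    [8, 40, 2],
    [2, 3, 15],
]
metrics = accuracy_metrics(matrix, classes)
print(round(metrics["overall"], 3))
print(metrics["producers"]["Bareland"])
```

Producer's accuracy answers "how much of the real bareland did the map find?", while user's accuracy answers "how much of what the map calls bareland really is bareland?". That distinction is why a masking change can move one measure for a class without moving the other.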

Cloud Score+ threshold selection

You might be wondering why we ended up using a threshold of 0.4 for the cs band. Before selecting this threshold for our Cloud Score+ composite, we conducted a sensitivity analysis in which we tested different thresholds not only for this band but also for the cs_cdf band. We then compared the cs-based composite against the cs_cdf-based composite to understand the advantages and disadvantages of each.

We started by visually comparing the composites created with cs thresholds of 0.4, 0.5, 0.6, and 0.7 (example in Figure 6). As expected, the lower the threshold, the more cloud artifacts appeared in the composite; the higher the threshold, the more pixels were masked, resulting in a loss of useful information. The 0.5 and 0.6 thresholds both seemed a good balance between masking the correct pixels and losing useful information, so we compared these two composites with a 0.55 threshold composite. There was no single winner among the three: the differences were subtle, and each composite was better than the others in some areas and worse in others. We could have kept going down this rabbit hole, looking further into the [0.50–0.60] range, but for our particular area of interest, we concluded that the 0.6 composite could be sufficient in terms of these tradeoffs.

In the end, however, we selected the 0.4 composite because our goal was to cover as much of the province as possible: we accepted some cloud artifacts in the composite in exchange for almost complete wall-to-wall coverage of the province for the land use land cover classification.

Figure 6: Cloud Score+ cs band-based composites over Las Golondrinas and the northern part of Imbabura
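The tradeoff between threshold and coverage can be illustrated with a toy example. The sketch below is plain Python rather than Earth Engine code: each pixel has a handful of hypothetical (reflectance, cs score) observations, observations scoring below the threshold are dropped, and the composite value is the median of what remains. It assumes cs scores run from 0 to 1 with higher meaning clearer, and that the composite is a median; the post does not state which reducer was used.

```python
from statistics import median
from typing import List, Optional, Tuple

# One observation of one pixel: (reflectance, cs score).
Observation = Tuple[float, float]


def composite_pixel(observations: List[Observation], threshold: float) -> Optional[float]:
    """Median of the observations whose cs score passes the threshold.

    Returns None when every observation is masked, i.e. a gap in the composite.
    """
    kept = [value for value, score in observations if score >= threshold]
    return median(kept) if kept else None


def coverage(pixels: List[List[Observation]], threshold: float) -> float:
    """Fraction of pixels that keep at least one usable observation."""
    filled = sum(1 for obs in pixels if composite_pixel(obs, threshold) is not None)
    return filled / len(pixels)


# Hypothetical time series for three pixels.
pixels = [
    [(0.08, 0.90), (0.10, 0.70), (0.45, 0.20)],  # mostly clear
    [(0.12, 0.55), (0.30, 0.45), (0.50, 0.10)],  # frequently hazy
    [(0.35, 0.42), (0.55, 0.15)],                # persistently cloudy
]

for threshold in (0.4, 0.6):
    values = [composite_pixel(obs, threshold) for obs in pixels]
    print(threshold, coverage(pixels, threshold), values)
```

In this toy data, the lower threshold fills every pixel but lets brighter, likely contaminated values into the composite, while the higher threshold leaves the hazy and cloudy pixels empty. That is the same pattern described above for the 0.4 and 0.6 composites.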

Band choices: cs or cs_cdf?

We did the same for the cs_cdf band. To compare across the two bands, we visually analyzed a composite created with a 0.6 cs threshold against one created with a 0.6 cs_cdf threshold. As expected, cs is more sensitive to haze and cloud edges, whereas cs_cdf is less sensitive to these low-magnitude spectral changes as well as to terrain shadows. In fact, the cs composite showed a staggering ~180% difference in masked pixels compared to the cs_cdf composite (example in Figure 7). Overall, masking by cs was the “correct choice” given the quality of the pixels left unmasked by cs_cdf, which showed many cloud and cloud shadow artifacts. This held for most of the province, with one exception: the southeastern part, which behaved the opposite way because it combines two extremes, high cloud coverage and high elevation relative to the rest of the province. In this area, the cs_cdf composite looked visually better than the cs composite, preserving useful information that cs had masked.

Figure 7: cs composite vs. cs_cdf composite over the northern part of Imbabura.
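The post does not say how the ~180% figure was calculated, so the following is only a sketch of one plausible way to compare masking between the two bands: count the pixels each band masks at the same threshold, then express the difference relative to the cs_cdf count. The scores are hypothetical, and the relative-difference formula is an assumption, not the authors' stated method.

```python
from typing import List


def masked_count(scores: List[float], threshold: float) -> int:
    """Number of pixels whose score falls below the threshold (i.e. masked)."""
    return sum(1 for score in scores if score < threshold)


def relative_difference_pct(count_a: int, count_b: int) -> float:
    """Difference of count_a over count_b, as a percentage of count_b."""
    if count_b == 0:
        raise ValueError("count_b must be non-zero")
    return (count_a - count_b) / count_b * 100.0


# Hypothetical scores for the same ten pixels under each band.
cs_scores = [0.90, 0.80, 0.55, 0.50, 0.45, 0.30, 0.20, 0.58, 0.70, 0.52]
cs_cdf_scores = [0.95, 0.90, 0.80, 0.75, 0.70, 0.40, 0.30, 0.85, 0.90, 0.80]

threshold = 0.6
masked_cs = masked_count(cs_scores, threshold)
masked_cdf = masked_count(cs_cdf_scores, threshold)
print(masked_cs, masked_cdf, relative_difference_pct(masked_cs, masked_cdf))
```

Under this definition, a value of 180% would mean cs masks 2.8 times as many pixels as cs_cdf. A symmetric percent difference would imply a very different ratio, so it is worth stating which formula is used whenever a figure like this is reported.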

What’s next

In December of last year, GAD Imbabura hosted a launch event in Ibarra, the capital of Imbabura province, where the first results of this work were presented. The event was attended by government officials, Indigenous communities, and members of civil society interested in spatial information management for decision-making (a recording of the event is available here). The scripts and instructions have been added to a GitHub repository, and the final 2019 LULC map can be viewed in an Earth Engine app.

With the repository and the complete Cloud Score+ Sentinel-2 archive, the provincial government will continue the work independently from now on. GAD Imbabura has three objectives: first, to improve the proposed model through observations and data collection in the field; second, to generate annual maps and identify changes in land use at the provincial level; and finally, to train officials from other local governments and strengthen their work in generating and using this type of information.

Within SERVIR-Amazonia, we have started to test Cloud Score+ in other projects, and together with Google, we invite all readers to do the same in their own workflows. We are confident that Cloud Score+ can continue to contribute to accuracy improvements simply through better cloud and shadow detection.

Check out this Earth Engine app to explore the final 2019 LULC map.

Acknowledgments

This blog post was reviewed by Fernanda Avellaneda (Deputy Director of Territorial Planning of Imbabura’s Prefecture) and Rodrigo Torres (Coordinator Geography Unit, EcoCiencia).