Land use/land cover classification using Google Earth Engine, Random Forest Algorithm, Landsat and Sentinel data: the significance of image resolution

Laxmi Goparaju

Email: laxmigoparaju@vindhyabachao.org; goparajulaxmi@yahoo.com

Satellite data has for decades offered the opportunity to analyze real-world problems spatially and temporally. Still, there has been a need for rapid computation, large datasets, and accurate high-resolution satellite data. Cloud computing of satellite data and analysis has eased this herculean, time-intensive task. Google Earth Engine (GEE) provides a cloud computing platform, with many ready-to-use datasets, to address the challenges associated with big data.

Expanding urbanization and population growth have transformed the trajectory of land use and land cover. Infrastructure development and anthropogenic pressure threaten natural resources and the environment. Forest boundaries have been encroached upon, and fringes are converted to various land uses. It is therefore vital that such changes be monitored at large spatial and temporal scales with minimal analysis time. Google Earth Engine (GEE) enables fast and accurate analysis and saves time.

Objectives:

1. To classify Landsat 8 data into major land use and land cover classes using the Random Forest classifier.

2. To classify Sentinel-1 data into major land use and land cover classes using the Random Forest classifier.

3. To classify the combined (Landsat 8 and Sentinel-1) data using the Random Forest classifier, and to determine the level of accuracy in all three cases.

4. To clip the classified land use and land cover using the state and district boundaries.

5. To calculate area statistics for each land use and land cover class in each district.

In this exercise, you will learn how to acquire Landsat and Sentinel images from Google Earth Engine, collect training data for Random Forest, perform the Random Forest classification and compute its accuracy, and use the state and district boundaries to clip the land use and land cover data. Finally, you will export the image and the statistics to Google Drive as TIFF and CSV files.

A workflow for the exercise

  1. The first step is to load the Sentinel-1 image collection. Filter for images acquired in Interferometric Wide Swath (IW) mode, on a descending pass, at 10-meter resolution, with VV polarization. Then repeat the same step with “VH polarization” instead.

2. In this step we filter the images by date range. Use “Run” in the top menu bar to execute the script. Add the “VV” and “VH” images identified in the previous step to the “Layers” bar for convenient visualization.
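Steps 1 and 2 might be sketched as follows. The area-of-interest variable `geometry` and the date range are assumptions, not values from the original script:

```javascript
// Load Sentinel-1 GRD and filter for IW mode, descending pass,
// 10 m resolution, and VV / VH polarisation (geometry = assumed AOI).
var s1 = ee.ImageCollection('COPERNICUS/S1_GRD')
  .filter(ee.Filter.eq('instrumentMode', 'IW'))
  .filter(ee.Filter.eq('orbitProperties_pass', 'DESCENDING'))
  .filter(ee.Filter.eq('resolution_meters', 10))
  .filterBounds(geometry);

var vv = s1.filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV'))
           .select('VV');
var vh = s1.filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VH'))
           .select('VH');

// Restrict to a date range (assumed here) and display mean composites.
var vvImage = vv.filterDate('2019-01-01', '2019-12-31').mean();
var vhImage = vh.filterDate('2019-01-01', '2019-12-31').mean();
Map.addLayer(vvImage, {min: -25, max: 0}, 'VV');
Map.addLayer(vhImage, {min: -25, max: 0}, 'VH');
```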

3. Extract the images from the Landsat 8 Surface Reflectance (SR) Tier 1 collection and mask clouds using its Quality Assessment (QA) band. Then calculate the NDVI and append it as an extra band to the selected Landsat image. Add the Landsat image to the “Layers” bar in order to visualize it.
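A minimal sketch of this step, assuming the same `geometry` AOI and date range as before (the QA bit masking shown is the standard recipe for the Collection 1 `pixel_qa` band):

```javascript
// Mask clouds and cloud shadows using the pixel_qa band of Landsat 8 SR.
function maskL8sr(image) {
  var cloudShadowBitMask = 1 << 3;  // bit 3: cloud shadow
  var cloudsBitMask = 1 << 5;       // bit 5: cloud
  var qa = image.select('pixel_qa');
  var mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0)
      .and(qa.bitwiseAnd(cloudsBitMask).eq(0));
  return image.updateMask(mask);
}

var l8 = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')
  .filterDate('2019-01-01', '2019-12-31')
  .filterBounds(geometry)
  .map(maskL8sr)
  .median();

// Compute NDVI from NIR (B5) and red (B4) and append it as a band.
var ndvi = l8.normalizedDifference(['B5', 'B4']).rename('NDVI');
var optical = l8.addBands(ndvi);
Map.addLayer(optical, {bands: ['B4', 'B3', 'B2'], min: 0, max: 3000}, 'Landsat 8');
```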

4. Apply a speckle filter to the SAR image and then display it. Use Map.addLayer to add the filtered images to the map panel; they then appear in the “Layers” bar.
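One common choice is a focal-median filter; the 50 m radius and the variable `vhImage` (the VH composite from step 2) are assumptions:

```javascript
// Reduce SAR speckle with a circular focal-median filter.
var vhFiltered = vhImage.focal_median(50, 'circle', 'meters');
Map.addLayer(vhFiltered, {min: -25, max: 0}, 'VH filtered');
```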

5. In order to do the supervised classification, one must collect training data to “train” the classifier. This process involves collecting representative samples of backscatter for each land cover class. Display only the VH image of the SAR data. Go to the Geometry Imports box beside the geometry drawing tools and click “+ new layer”. Select it and draw polygons; one can draw as many polygons as needed to define a particular class. Rename the geometry layer after the class, e.g. open_water. Configure the open_water geometry import by clicking its cog-wheel icon (top of the script, in the Imports section) and change “Import as” from Geometry to FeatureCollection. Add a property named landcover and set its value to 1 (subsequent classes will be 2, 3, 4, etc.). The next step is to merge the classes into a single collection, called a FeatureCollection.
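The merging and sampling might look like this. The class-layer names (`open_water`, `forest`, `agriculture`, `builtup`), the property name `landcover`, and `vhFiltered` (the speckle-filtered VH image from step 4) are illustrative assumptions:

```javascript
// Merge the per-class geometry imports into one FeatureCollection.
var classes = open_water.merge(forest).merge(agriculture).merge(builtup);

// Sample the backscatter values inside the training polygons.
var training = vhFiltered.sampleRegions({
  collection: classes,
  properties: ['landcover'],
  scale: 10
});
```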

6. Classify the SAR image first.

7. Then display the classification: use Map.addLayer to add the classified image to the map panel. Create a confusion matrix to assess the accuracy.
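Steps 6 and 7 together might be sketched as below, assuming `training` and `vhFiltered` from the previous steps; the number of trees (50) and the palette are assumptions, and the confusion matrix shown here is a resubstitution (training) accuracy:

```javascript
// Train a Random Forest on the SAR training sample and classify the image.
var classifier = ee.Classifier.smileRandomForest(50).train({
  features: training,
  classProperty: 'landcover',
  inputProperties: ['VH']
});
var classified = vhFiltered.classify(classifier);
Map.addLayer(classified,
  {min: 1, max: 4, palette: ['blue', 'green', 'yellow', 'red']},
  'SAR classification');

// Confusion matrix and overall accuracy on the training data.
var trainAccuracy = classifier.confusionMatrix();
print('Confusion matrix', trainAccuracy);
print('Overall accuracy', trainAccuracy.accuracy());
```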

8. The next step is to train the classifier on the Landsat data, that is, the optical data.

9. Display the classification layer using Map.addLayer and compute the confusion matrix.
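Steps 8 and 9 repeat the same pattern on the optical stack. The band list, `optical` (the Landsat-plus-NDVI image from step 3), and `classes` (the merged training FeatureCollection from step 5) are assumptions:

```javascript
// Train and apply a Random Forest on the optical bands plus NDVI.
var opticalBands = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'NDVI'];
var opticalTraining = optical.select(opticalBands).sampleRegions({
  collection: classes,
  properties: ['landcover'],
  scale: 30
});
var opticalClassifier = ee.Classifier.smileRandomForest(50).train({
  features: opticalTraining,
  classProperty: 'landcover',
  inputProperties: opticalBands
});
var opticalClassified = optical.select(opticalBands).classify(opticalClassifier);
Map.addLayer(opticalClassified,
  {min: 1, max: 4, palette: ['blue', 'green', 'yellow', 'red']},
  'Optical classification');
print('Optical confusion matrix', opticalClassifier.confusionMatrix());
```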

10. Combine the optical data and the SAR data, and train the classifier on the stacked bands.

11. Display the classified (optical + SAR) data: use Map.addLayer to add the classified image to the map panel. Compute a confusion matrix and export the image as a TIFF.
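A sketch of steps 10 and 11, assuming `optical`, `vhFiltered`, `classes`, and `geometry` from the earlier steps:

```javascript
// Stack the optical bands and the filtered SAR band into one image.
var combined = optical.addBands(vhFiltered.rename('VH'));
var combinedTraining = combined.sampleRegions({
  collection: classes,
  properties: ['landcover'],
  scale: 30
});
var combinedClassifier = ee.Classifier.smileRandomForest(50).train({
  features: combinedTraining,
  classProperty: 'landcover',
  inputProperties: combined.bandNames()
});
var combinedClassified = combined.classify(combinedClassifier);
Map.addLayer(combinedClassified,
  {min: 1, max: 4, palette: ['blue', 'green', 'yellow', 'red']},
  'Optical + SAR classification');
print('Combined confusion matrix', combinedClassifier.confusionMatrix());

// Export the classified image to Google Drive as a GeoTIFF.
Export.image.toDrive({
  image: combinedClassified,
  description: 'lulc_optical_sar',
  region: geometry,
  scale: 30,
  maxPixels: 1e10
});
```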

12. To overlay the administrative boundaries: the FAO GAUL: Global Administrative Unit Layers 2015, Second-Level Administrative Units dataset was utilized for this part of the analysis. This global dataset contains boundaries down to Level 2 (districts/counties/…). In this case, I filtered it to the district boundaries of Uttar Pradesh State in India.
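The filter on the GAUL dataset might look like this (the `ADM0_NAME`/`ADM1_NAME` properties are the dataset's standard country and state name fields):

```javascript
// District (Level 2) boundaries of Uttar Pradesh from FAO GAUL 2015.
var districts = ee.FeatureCollection('FAO/GAUL/2015/level2')
  .filter(ee.Filter.eq('ADM0_NAME', 'India'))
  .filter(ee.Filter.eq('ADM1_NAME', 'Uttar Pradesh'));
Map.addLayer(districts, {}, 'UP districts');
```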

13. Area calculation of each class in each district

In this part, we need to calculate areas by class for the whole state (incidentally, the scene classified earlier did not cover the entire state). One can achieve this by applying map() on the FeatureCollection to obtain the values within each district geometry. Not all classes may be present in every district, so each feature will have a different number of attributes, corresponding to the classes present.

Some post-processing is needed to extract the individual dictionaries and get the results into a usable format. An important GEE function called flatten() plays a key role here: it takes a nested list and transforms it into a flat list. After this, we can take the results of the grouped reducer and map a function over them to extract the individual class areas and convert them into a single dictionary. It is important to note that the dictionary keys must be of type string; since our keys correspond to class numbers, we use the format() method to convert each number to a string.
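A sketch of this step, adapted from the Spatial Thoughts area-calculation recipe cited in the Acknowledgments; `combinedClassified` (the classified image from step 11) and `districts` (the GAUL FeatureCollection from step 12) are assumed:

```javascript
// Pair a per-pixel area band with the class band, then run a grouped
// sum reducer per district: each group is one land cover class.
var areaImage = ee.Image.pixelArea().addBands(combinedClassified);

var districtAreas = districts.map(function(feature) {
  var areas = areaImage.reduceRegion({
    reducer: ee.Reducer.sum().group({groupField: 1, groupName: 'class'}),
    geometry: feature.geometry(),
    scale: 30,
    maxPixels: 1e10
  });
  var classAreas = ee.List(areas.get('groups'));
  // Build [classNumber, area] pairs; format() makes the key a string.
  var classAreaLists = classAreas.map(function(item) {
    var areaDict = ee.Dictionary(item);
    var classNumber = ee.Number(areaDict.get('class')).format();
    var area = ee.Number(areaDict.get('sum')).divide(1e6);  // m² → km²
    return ee.List([classNumber, area]);
  });
  // flatten() turns the nested pairs into one flat key/value list,
  // which ee.Dictionary() then converts into a single dictionary.
  var result = ee.Dictionary(classAreaLists.flatten());
  return ee.Feature(feature.geometry(),
      result.set('district', feature.get('ADM2_NAME')));
});
```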

14. Export the classified image as a GeoTIFF to your Google Drive folder using the Export.image.toDrive() function, and export the statistics (district-wise land use and land cover) as a CSV file using Export.table.toDrive(). One thing to note is that each district may or may not contain all of the 7 classes, so each feature will have a different number of attributes depending on which classes are present. We can explicitly set the expected fields in the output using the selectors argument of the Export.table.toDrive() function. Because the list of output fields is needed in the Export function, we have to call getInfo() to get the list values on the client side.
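The table export might be sketched as follows, assuming `districtAreas` from step 13 and seven classes numbered 1–7:

```javascript
// Build the expected column names: 'district' plus the class numbers
// as strings, matching the dictionary keys created in step 13.
var classNames = ee.List.sequence(1, 7).map(function(n) {
  return ee.Number(n).format();
});
// getInfo() brings the list to the client so Export can use it.
var outputFields = ee.List(['district']).cat(classNames).getInfo();

Export.table.toDrive({
  collection: districtAreas,
  description: 'district_lulc_areas',
  fileFormat: 'CSV',
  selectors: outputFields
});
```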

15. The result will be a homogeneous table with all classes. Once done, we can export the results to a CSV file.

16. The CSV file can be downloaded and opened in Microsoft Excel, where it can be converted into an Excel file for further analysis and computation.

The results of this analysis were presented at the FLARE (Forests & Livelihoods: Assessment, Research, and Engagement) Network Twitter conference (October 26th to 29th, 2020). The slides can be accessed at @FLAREglobal #FLARETC20, http://bit.ly/FLARETC20 and https://twitter.com/FLAREglobal/status/1321458295568257029. Figures 1, 2 and 3 are the outcomes of this exercise.

Figure 1

Figure 2

Figure 3

Acknowledgments

· NASA ARSET Training Programme

· https://spatialthoughts.com/2020/06/19/calculating-area-gee/
