A Quick Guide to LiDAR: Part 3

4 min read · Jun 25, 2022

Learn to perform robust landcover classification using the NxN neighborhood around each pixel.

Introduction

In the last part, we learned how to do landcover visualization and classification using height and intensity. In this part, we will explore how to perform a more robust classification by taking into account an N x N neighborhood around each pixel. The neighborhood region gives us a better idea of how height and intensity vary around that point. If the pixel belongs to grass or a road, the height variation within the neighborhood will be small. However, if the pixel belongs to trees, the height variation will be much larger. This way we can better estimate which class a particular pixel belongs to.
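To make the intuition concrete, here is a minimal sketch on synthetic data (not the MUUFL data). The height map, the helper name local_height_std, and the chosen pixel positions are all made up for illustration; the point is only that the spread of heights inside a window separates flat ground from canopy.

```python
import numpy as np

def local_height_std(height, row, col, n=11):
    """Standard deviation of height in the n x n window centred on (row, col).

    Assumes the window lies fully inside the image (no border handling).
    """
    half = n // 2
    window = height[row - half: row + half + 1, col - half: col + half + 1]
    return float(window.std())

rng = np.random.default_rng(0)

# Hypothetical 40 x 40 height map in metres:
# left half is nearly flat (grass or road), right half is tree canopy.
height = np.zeros((40, 40))
height[:, :20] = 0.1 * rng.random((40, 20))
height[:, 20:] = 10.0 * rng.random((40, 20))

flat_std = local_height_std(height, 20, 8)    # window entirely in the flat half
tree_std = local_height_std(height, 20, 31)   # window entirely in the tree half
print(f"flat: {flat_std:.3f} m, trees: {tree_std:.3f} m")
```

A classifier that sees the whole window can pick up this difference, while a classifier that sees only the centre pixel cannot.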

Data

We use the MUUFL Gulfport dataset, which is distributed as a .mat file. We can read .mat files in Python using the SciPy library. Let’s get started!

Implementation

Step 1: Import Libraries

Step 2: Read the file and extract LiDAR data and Ground Truth

The .mat file stores its contents as MATLAB structs with several fields. The entry called ‘hsi’ contains the ground truth, the LiDAR data, etc. The description of the .mat file can be read here. We will extract the LiDAR data, ground truth, and RGB image from their respective MATLAB struct fields.
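As a rough sketch of this step, the snippet below loads the file with scipy.io.loadmat and lists the fields of the ‘hsi’ struct so the right ones can be picked. The helper names and the file name are hypothetical, and the field names in the comments are assumptions that should be checked against the dataset description rather than taken as given.

```python
from scipy.io import loadmat

def load_hsi_struct(mat_path):
    """Load the .mat file and return its 'hsi' entry.

    squeeze_me drops singleton dimensions and struct_as_record=False turns
    MATLAB structs into objects whose fields are plain attributes.
    """
    mat = loadmat(mat_path, squeeze_me=True, struct_as_record=False)
    return mat["hsi"]

def list_fields(struct):
    """Return the field names of a loaded MATLAB struct."""
    return list(struct._fieldnames)

# Usage (file name and field names below are assumptions, not verified):
# hsi = load_hsi_struct("muufl_gulfport.mat")
# print(list_fields(hsi))            # inspect what is actually in the file
# rgb = hsi.RGB                      # assumed name of the RGB image field
# labels = hsi.sceneLabels.labels    # assumed location of the ground truth
```

Printing the field list first avoids guessing: nested structs can be inspected the same way by calling list_fields on them.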

Step 3: Find the NxN neighborhood around each pixel
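One possible way to build the neighborhoods is sketched below. It assumes the LiDAR raster is an array of shape (H, W, C), with the channels holding height and intensity, and it pads the borders by reflection so that edge pixels also get a full patch. The function name extract_patches and the padding choice are assumptions made for illustration; the original code may handle borders differently.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_patches(image, n=11):
    """Return an (H, W, n, n, C) array holding the n x n patch around every pixel.

    image: array of shape (H, W, C). n must be odd so each patch has a centre.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so the patch has a centre pixel")
    half = n // 2
    # Reflect-pad the two spatial axes only, so border pixels get full patches.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="reflect")
    # The window axes are appended last, giving shape (H, W, C, n, n).
    windows = sliding_window_view(padded, (n, n), axis=(0, 1))
    # Move the channel axis to the end: (H, W, n, n, C).
    return np.moveaxis(windows, 2, -1)

# Small synthetic raster with two channels (stand-ins for height and intensity).
lidar = np.arange(20 * 15 * 2, dtype=float).reshape(20, 15, 2)
patches = extract_patches(lidar, n=11)
print(patches.shape)
```

The result is a read-only view into the padded array, so it costs little memory until a subset of patches is copied out. Because n is a parameter, other patch sizes can be tried without changing the function.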

Step 4: Modify the ground truth

We take an 11x11 neighborhood region around each pixel. The ground truth contains the labels -1, 1, 2, …, 11, where ‘-1’ marks unlabelled data. We only need the labels 1, 2, …, 11, encoded as 0 to 10. To do this, we simply subtract 1 from the ground truth labels.
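A minimal sketch of this relabelling, assuming the patches are stored as an (H, W, n, n, C) array and the ground truth as an (H, W) array. The function name select_labelled is hypothetical.

```python
import numpy as np

def select_labelled(patches, labels):
    """Drop unlabelled pixels (label -1) and shift labels 1..11 down to 0..10.

    patches: (H, W, n, n, C) array, labels: (H, W) integer array.
    Returns (K, n, n, C) patches and (K,) labels for the K labelled pixels.
    """
    mask = labels != -1
    return patches[mask], labels[mask].astype(np.int64) - 1

# Tiny synthetic example: a 2 x 3 label map with two unlabelled pixels.
demo_labels = np.array([[-1, 1, 11],
                        [2, -1, 5]])
demo_patches = np.zeros((2, 3, 11, 11, 2))
x, y = select_labelled(demo_patches, demo_labels)
print(x.shape, y)
```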

Step 5: Split the data into train-test, save in npz file and load the file

We split the data into train and test sets and shuffle it, so that we get a random set of train and test samples.

Later on, we will perform a Monte Carlo experiment and report the average accuracy at the end. For that, we need different random train and test samples in each experiment.
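The split, save, and reload cycle could look roughly like the sketch below. The 20% test fraction, the function names, and the array keys are assumptions, since the article does not state them. Passing a different seed (or none) for each Monte Carlo run gives a different random split.

```python
import numpy as np

def split_and_save(x, y, path, test_fraction=0.2, seed=None):
    """Shuffle the samples, split them into train and test, and save to an .npz file.

    path should end in '.npz'. test_fraction is an assumed value.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))            # random order of sample indices
    n_test = int(round(test_fraction * len(x)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    np.savez_compressed(
        path,
        x_train=x[train_idx], y_train=y[train_idx],
        x_test=x[test_idx], y_test=y[test_idx],
    )

def load_split(path):
    """Load the four arrays written by split_and_save."""
    with np.load(path) as data:
        return data["x_train"], data["y_train"], data["x_test"], data["y_test"]

# Usage (hypothetical file name):
# split_and_save(x, y, "split_run0.npz", seed=0)
# x_train, y_train, x_test, y_test = load_split("split_run0.npz")
```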

Step 6: Normalize train and test data
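The article does not say which normalization is used, so the following is only one reasonable option: per-channel min-max scaling, with the minimum and maximum computed on the train set and then applied to both sets. The function name normalize is hypothetical.

```python
import numpy as np

def normalize(x_train, x_test):
    """Scale each channel to [0, 1] using statistics from the train set only.

    Arrays are expected to have the channel as the last axis,
    for example (K, n, n, C).
    """
    axes = tuple(range(x_train.ndim - 1))        # every axis except the channel
    lo = x_train.min(axis=axes, keepdims=True)
    hi = x_train.max(axis=axes, keepdims=True)
    scale = np.where(hi > lo, hi - lo, 1.0)      # avoid dividing by zero
    return (x_train - lo) / scale, (x_test - lo) / scale

rng = np.random.default_rng(0)
# Synthetic patches whose two channels live on very different scales.
raw_train = rng.random((30, 11, 11, 2)) * np.array([50.0, 200.0])
raw_test = rng.random((10, 11, 11, 2)) * np.array([50.0, 200.0])
norm_train, norm_test = normalize(raw_train, raw_test)
```

Using train-set statistics for the test set keeps information from the test samples out of the preprocessing, which matters when accuracy is reported on them.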

Step 7: Balance the train data

Since the data is unbalanced, we first print every label together with its number of samples. We then set a threshold of 4000. For each label with fewer than 4000 samples, we repeat its sample indices, making sure the total does not exceed 4000, and add them to useful_indices. For each label that already has 4000 or more samples, we simply add its indices to useful_indices unchanged.
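The balancing rule can be sketched as follows. This version repeats the indices of a rare class cyclically until the class has exactly threshold samples, which is one reading of the description above; the function name balanced_indices is hypothetical, while useful_indices follows the name used in the text.

```python
import numpy as np

def balanced_indices(labels, threshold=4000):
    """Return sample indices in which every rare class is oversampled.

    Classes with fewer than `threshold` samples have their indices repeated
    cyclically up to exactly `threshold`; larger classes are kept unchanged.
    """
    useful_indices = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        print(f"label {cls}: {len(idx)} samples")
        if len(idx) < threshold:
            # np.resize fills the larger array with repeated copies of idx.
            idx = np.resize(idx, threshold)
        useful_indices.append(idx)
    return np.concatenate(useful_indices)

# Tiny example with a threshold of 5 instead of 4000.
demo_y = np.array([0] * 3 + [1] * 10)
demo_idx = balanced_indices(demo_y, threshold=5)
balanced_y = demo_y[demo_idx]
```

The returned indices are grouped by class, so it is worth shuffling them before training.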

Step 8: One-hot encode the labels
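One-hot encoding turns each integer label in 0 to 10 into a vector of length 11 with a single 1. A minimal NumPy sketch (the function name one_hot is hypothetical; deep learning libraries offer equivalent helpers):

```python
import numpy as np

def one_hot(labels, num_classes=11):
    """Map integer labels 0..num_classes-1 to rows of the identity matrix."""
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot(np.array([0, 3, 10]))
print(encoded.shape)
```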

Step 9: Define a CNN model for classification

Step 10: Train the model, predict on test data and report accuracy

We ran the Monte Carlo experiment with 7 iterations of the same code, each time with a random set of train and test samples. We achieved an average accuracy of 87.65% ± 3.08%. The confusion matrix for the run with the best accuracy is shown below:
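For reference, here is a small sketch of how a confusion matrix and a mean ± standard deviation summary can be computed with NumPy. The function names and the toy labels are made up for illustration, and whether the reported spread is a population or a sample standard deviation is not stated in the article; this sketch uses NumPy's default (population).

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix with true classes as rows and predicted classes as columns."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)   # handles repeated (true, pred) pairs
    return cm

def summarize(accuracies):
    """Mean and standard deviation of the accuracies from the Monte Carlo runs."""
    acc = np.asarray(accuracies, dtype=float)
    return acc.mean(), acc.std()

# Toy example with 3 classes (not results from the article).
toy_true = np.array([0, 0, 1, 2])
toy_pred = np.array([0, 1, 1, 2])
cm = confusion_matrix(toy_true, toy_pred, num_classes=3)
toy_accuracy = np.trace(cm) / cm.sum()   # correct predictions lie on the diagonal
```

Collecting one accuracy per run in a list and passing it to summarize gives the kind of "mean ± std" figure reported above.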

The complete code is available on GitHub here.

Conclusion

In this part, we learned to classify landcover classes using an 11x11 neighborhood region. The neighboring pixels give a better idea of the height and intensity variation around the pixel of interest. We can experiment with different patch sizes, since the patch extraction function works for any patch size. Taking patches into consideration gives better accuracy than using single pixels. We still see some confusion between ‘Mostly grass’ and ‘Mixed ground’, ‘Building shadow’ and ‘Trees’, ‘Sidewalk’ and ‘Road’, etc. However, these confusions are small.

I hope you find this article useful!
Go Gators! 🐊

References

P. Gader, A. Zare, R. Close, J. Aitken, G. Tuell, “MUUFL Gulfport Hyperspectral and LiDAR Airborne Data Set,” University of Florida, Gainesville, FL, Tech. Rep. REP-2013–570, Oct. 2013.

X. Du and A. Zare, “Technical Report: Scene Label Ground Truth Map for MUUFL Gulfport Data Set,” University of Florida, Gainesville, FL, Tech. Rep. 20170417, Apr. 2017.

GitHub - GatorSense/MUUFLGulfport (github.com): MUUFL Gulfport Hyperspectral and LIDAR Data: This data set includes HSI and LIDAR data, Scoring Code, Photographs of…
