A Quick Guide to LiDAR: Part 2

Learn how to visualize LiDAR data with Python and perform classification using height and intensity.

Namrata Dutt
Mar 10, 2022

This part demonstrates how to use LiDAR data to classify different landcover classes. We will use the MUUFL Gulfport dataset to visualize LiDAR data and perform classification based on height and intensity. The dataset is a .mat file, which we can read in Python using the SciPy library.

Step 1: Import Libraries

Step 2: Read the file and extract LiDAR data and Ground Truth

The .mat file stores its contents in MATLAB struct fields. The field called ‘hsi’ contains the ground truth, the LiDAR data, and more. The description of the .mat file can be read here. We will extract the LiDAR data, ground truth, and RGB image from their respective MATLAB struct fields.
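As a minimal sketch of the reading step, the snippet below writes a tiny stand-in .mat file and reads it back with scipy.io.loadmat. The file name and the field names inside ‘hsi’ (height, intensity, labels, rgb) are placeholders invented for this illustration, not the real MUUFL Gulfport layout, so check the dataset description for the actual names.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Build a tiny stand-in file so the sketch runs without the real dataset.
# Every field name under "hsi" here is a hypothetical placeholder.
rng = np.random.default_rng(0)
demo = {
    "hsi": {
        "height": rng.random((4, 5)),
        "intensity": rng.random((4, 5)),
        "labels": rng.integers(1, 12, size=(4, 5)),
        "rgb": rng.random((4, 5, 3)),
    }
}
path = os.path.join(tempfile.mkdtemp(), "demo_lidar.mat")
savemat(path, demo)

# struct_as_record=False exposes struct fields as attributes;
# squeeze_me=True drops the singleton dimensions MATLAB adds.
mat = loadmat(path, squeeze_me=True, struct_as_record=False)
hsi = mat["hsi"]
print(hsi._fieldnames)  # list the fields that are actually in the struct

height = np.asarray(hsi.height, dtype=float)
intensity = np.asarray(hsi.intensity, dtype=float)
labels = np.asarray(hsi.labels)
rgb = np.asarray(hsi.rgb)
print(height.shape, intensity.shape, labels.shape, rgb.shape)
```

With the real file, only the path and the attribute names after `hsi.` should need to change.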

Step 3: Plot Intensity and Height (Remove noise, if any)

Now we have extracted the height and intensity of the LiDAR data. When we plot the intensity, the image is hard to read because of noise. To remove the noise, we first plot the histogram of intensity to find out which values are causing it.

In the histogram, we can see that only a very small number of pixels fall towards the right end of the plot. So we have to check where this long tail starts.

In our case, the long tail starts at around 223, so we treat anything above 223 as noise. We can replace the noisy values with the mean, the median, or zero; replacing them with the mean is the better option. After noise removal, we get a much clearer plot of intensity. The height does not contain any noise.
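The noise removal described above can be sketched roughly as follows. The synthetic intensity array and the function name remove_intensity_noise are made up for this illustration; only the threshold of 223 and the replace-with-mean idea come from the text.

```python
import numpy as np


def remove_intensity_noise(intensity, threshold=223.0):
    """Replace values above `threshold` with the mean of the remaining values."""
    cleaned = np.asarray(intensity, dtype=float).copy()
    noise_mask = cleaned > threshold
    # Only replace when there is noise and at least one clean value to average.
    if noise_mask.any() and not noise_mask.all():
        cleaned[noise_mask] = cleaned[~noise_mask].mean()
    return cleaned, noise_mask


# Synthetic stand-in for the intensity raster, with two extreme outliers.
rng = np.random.default_rng(0)
intensity = rng.uniform(0, 200, size=(50, 50))
intensity[3, 7] = 4000.0
intensity[20, 41] = 9000.0

# The histogram shows where the long tail begins (nearly empty bins on the right).
counts, edges = np.histogram(intensity, bins=50)

cleaned, noise_mask = remove_intensity_noise(intensity, threshold=223.0)
print("noisy pixels:", int(noise_mask.sum()), "| max after cleaning:", cleaned.max())
```

Computing the mean from the clean pixels only keeps the outliers from inflating the replacement value.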

LiDAR Intensity before noise removal (Image by Author)
Histogram of Intensity showing noise (Image by Author)
LiDAR Intensity after noise removal (Image by Author)
LiDAR Height (Image by Author)
Histogram of Height (Image by Author)

Step 4: 3D Visualization of Intensity and Height

3D Visualization of Intensity (Image by Author)
3D Visualization of Height (Image by Author)
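One way to produce a 3D view like the ones above is a matplotlib surface plot, sketched below. The helper name plot_surface_3d and the random demo raster are assumptions for this example; the original figures may have been made differently.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_surface_3d(raster, title):
    """Draw a 2D raster as a 3D surface, with the pixel value as the z axis."""
    raster = np.asarray(raster, dtype=float)
    rows, cols = raster.shape
    x, y = np.meshgrid(np.arange(cols), np.arange(rows))
    fig = plt.figure(figsize=(8, 6))
    ax = fig.add_subplot(projection="3d")
    surf = ax.plot_surface(x, y, raster, cmap="viridis", linewidth=0, antialiased=False)
    fig.colorbar(surf, ax=ax, shrink=0.6, label=title)
    ax.set_xlabel("column")
    ax.set_ylabel("row")
    ax.set_zlabel(title)
    ax.set_title("3D Visualization of " + title)
    return fig


# Synthetic stand-in for the height raster.
rng = np.random.default_rng(0)
demo_height = rng.random((40, 40)) * 10.0
fig = plot_surface_3d(demo_height, "Height")
```

Calling fig.savefig("height_3d.png") writes the figure to disk; the same function can be reused for the cleaned intensity raster.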

Step 5: Plot Height and Intensity Histograms of the classes

We can plot the intensity and height of different classes and observe how each class differs. Some classes are not distinguishable using only height or only intensity, but using a combination of the two, we can correctly classify most of the samples.

To plot every histogram on the same scale, we set the ticks using the global minimum and maximum across all the classes combined, for both height and intensity. There are many classes in this dataset, but for this tutorial we use only a few:

Trees (label=1), Mostly Grass (label=2), Road (label=5), Building (label=8), Cloth Panels (label=11).
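Here is a rough sketch of how per-class histograms can share one scale: the bin edges are computed once from the global minimum and maximum of the selected classes and then reused for every class. The function name class_histograms and the synthetic rasters are invented for this example.

```python
import numpy as np

CLASS_NAMES = {1: "Trees", 2: "Mostly Grass", 5: "Road", 8: "Building", 11: "Cloth Panels"}


def class_histograms(values, labels, class_ids, n_bins=30):
    """Histogram `values` per class, using the same bin edges for all classes."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    selected = np.isin(labels, list(class_ids))
    # Global range over the selected classes, so every histogram shares one x-axis.
    lo, hi = values[selected].min(), values[selected].max()
    edges = np.linspace(lo, hi, n_bins + 1)
    counts = {c: np.histogram(values[labels == c], bins=edges)[0] for c in class_ids}
    return edges, counts


# Synthetic stand-ins for the height raster and the ground-truth label map.
rng = np.random.default_rng(0)
labels = rng.choice(list(CLASS_NAMES), size=(60, 60))
height = rng.normal(5.0, 2.0, size=(60, 60))

edges, counts = class_histograms(height, labels, CLASS_NAMES)
for class_id, name in CLASS_NAMES.items():
    print(name, "pixels:", int(counts[class_id].sum()))
```

The returned edges can be passed as the bins argument of matplotlib's hist, one subplot per class, to draw the actual figures.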

Histograms of Height for different classes (Image by Author)
Histograms of Intensity for different classes (Image by Author)

Step 6: Classification using KNN

Now comes the classification part. We will use KNN (k-nearest neighbors) for the classification task, with K set to 9. You can experiment with different values of K and select the optimal one. We achieved an accuracy of 80.17%, and the confusion matrix is displayed below.
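A minimal sketch of this step with scikit-learn might look like the following. The two-feature samples are synthetic and the per-class centers are invented, so the accuracy it prints says nothing about the real dataset; only K=9 and the class labels come from the text.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
class_ids = [1, 2, 5, 8, 11]
# Invented (height, intensity) centers per class, for illustration only.
centers = {1: (10.0, 60.0), 2: (0.2, 120.0), 5: (0.0, 40.0), 8: (9.0, 45.0), 11: (0.5, 200.0)}
n_per_class = 200

# One row per pixel, two columns: height and intensity.
X = np.vstack(
    [rng.normal(loc=centers[c], scale=(2.0, 15.0), size=(n_per_class, 2)) for c in class_ids]
)
y = np.repeat(class_ids, n_per_class)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=9)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=class_ids)
print("accuracy:", round(acc, 4))
print(cm)
```

With the real data, X would be built by stacking the height and intensity values of the labeled pixels. Since KNN relies on distances, it may also be worth standardizing the two features first; that is a suggestion, not something the results above are stated to use.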

Confusion Matrix (Image by Author)

In this confusion matrix, we can see that many buildings are classified as trees because of their similar height. Asphalt is sometimes used on buildings as well as on roads, so some buildings are classified as roads due to their similar intensities.

The complete code is available on GitHub here.

Conclusion

In this tutorial, we demonstrated how to visualize LiDAR data and classify it using KNN. We observed how height and intensity vary for different landcover classes. You can also use more sophisticated classifiers to get better accuracy.

In Part 3, I will demonstrate how to also use the information from neighboring pixels for better classification. Right now, we are classifying single pixels, so the classifier cannot distinguish between some building and tree samples that have the same height. But if we take an (NxN) region into account, we can also use the standard deviation as an important feature for classification. In the case of buildings, the standard deviation is almost zero, whereas in the case of trees it is large because of the variability in height.
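To give a feel for the idea, here is a small sketch of a local standard deviation over an NxN window using NumPy. The function name local_std, the window size of 5, and the synthetic "roof" and "canopy" patches are assumptions for this example, not the approach Part 3 will necessarily take.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_std(raster, n=5):
    """Standard deviation of each pixel's n x n neighborhood (n must be odd)."""
    if n % 2 == 0:
        raise ValueError("n must be odd so the window is centered on a pixel")
    pad = n // 2
    # Reflect the borders so the output has the same shape as the input.
    padded = np.pad(np.asarray(raster, dtype=float), pad, mode="reflect")
    windows = sliding_window_view(padded, (n, n))
    return windows.std(axis=(-2, -1))


rng = np.random.default_rng(0)
roof = np.full((20, 20), 8.0)                   # flat surface: constant height
canopy = 8.0 + rng.normal(0.0, 2.0, (20, 20))   # rough surface: variable height

roof_std = local_std(roof, n=5)
canopy_std = local_std(canopy, n=5)
print("roof:", roof_std.mean(), "| canopy:", canopy_std.mean())
```

Both patches have the same average height, yet the local standard deviation separates them, which is why it is a promising extra feature.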

I hope you find this article useful!

Go Gators! 🐊

References

P. Gader, A. Zare, R. Close, J. Aitken, G. Tuell, “MUUFL Gulfport Hyperspectral and LiDAR Airborne Data Set,” University of Florida, Gainesville, FL, Tech. Rep. REP-2013–570, Oct. 2013.

X. Du and A. Zare, “Technical Report: Scene Label Ground Truth Map for MUUFL Gulfport Data Set,” University of Florida, Gainesville, FL, Tech. Rep. 20170417, Apr. 2017.

Namrata Dutt

Ph.D. Student at University of Florida | Interested in Image Processing, Machine Learning and Remote Sensing | Poetry Enthusiast