K-Means Clustering for Surface Segmentation of Satellite Images

Maxfield Green
4 min read · Jul 9, 2020


Photo by USGS on Unsplash

In this story, I’ll be sharing an example use case of KMeans clustering for image segmentation tasks. The example is implemented in Python using three common packages.

With an abundance of available satellite imagery and a shortage of pixel-wise labelled semantic classes, unsupervised learning is an awesome way to leverage recent advances in remote sensing. In my work, I build learnable tools to help GIS analysts automate their ArcPro and eCognition workflows. The scientists and analysts that I work with are interested in questions such as how tree canopy shape and density change over time, or how homeowners protect their property from wildfire damage. Much of this comes down to knowing how land type and use vary spatially. This can be easily formulated as a machine learning problem by asking which class each pixel in an image belongs to.

My contribution to this work often comes down to training models to perform semantic segmentation on imagery from all over the world. In cases with richly labelled data, a deep fully convolutional network (FCN), such as U-Net, will usually do the trick. However, more often than not, imagery is not richly labelled, and if it were, I wouldn't have a job!

KMeans clustering is a simple and potentially very effective way to make a first pass at segmenting an image into k different classes, such as water, street, or building. However, because the algorithm clusters pixels into unknown classes, the classes in question are not controllable and are highly context specific.

In the generic clustering problem, we are given m examples, in this case m pixels, whose classes are unknown. Each pixel is composed of several channels. Here we will stick with NAIP imagery, which contains red, green, blue, and near-infrared spectral bands. Our goal is to group the pixels based on how similar the channel values are between different pixels of an image. The KMeans clustering algorithm is as follows:

Adapted from Deep Learning by Ian Goodfellow

Which is to say that, given k groups, we want to assign every pixel in an image to one of the k groups. This is done by iteratively assigning each pixel to a group based on the pixel's distance to the group's center in feature space. In this case, the feature space is simply made up of the floating-point values stored in each pixel's bands. Each group's “centroid” is then updated to be the average position of all of the pixels assigned to the group. The end result is an image in which every pixel is assigned to one of k groups. The algorithm repeats until the group centers remain constant.
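That assign-then-update loop can be sketched from scratch with NumPy. This is an illustrative implementation of the algorithm described above, not the code from my actual workflow; the function name and initialization scheme are my own choices:

```python
import numpy as np

def kmeans(pixels, k, n_iters=100, seed=0):
    """Cluster rows of `pixels` (m samples x n bands) into k groups."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling k distinct pixels.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each pixel to its nearest centroid in feature space.
        dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the pixels assigned to it
        # (keeping the old centroid if a group ends up empty).
        new_centroids = np.array([
            pixels[labels == i].mean(axis=0) if (labels == i).any() else centroids[i]
            for i in range(k)
        ])
        # Stop once the group centers remain constant.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```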

Let's walk through a simple implementation of this problem using Python.
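Here is a minimal sketch of the load-reshape-cluster steps. A synthetic random array stands in for the NAIP raster (in practice the tile would be read from disk, e.g. with a library such as rasterio), and the file-free setup here is my own simplification:

```python
import numpy as np
from sklearn.cluster import KMeans

# In practice the 4-band NAIP tile would be loaded from a raster file,
# e.g.: with rasterio.open("naip_tile.tif") as src: img = src.read()
# Here a small synthetic array stands in for the imagery.
rng = np.random.default_rng(0)
img = rng.random((4, 64, 64)).astype(np.float32)  # (bands, height, width)

bands, height, width = img.shape

# Reshape so each row is one pixel and each column one spectral band.
X = img.reshape(bands, -1).T                      # (height*width, 4)

# Compute k=2 clusters with the Scikit-Learn KMeans implementation.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
```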

This chunk of code performs three steps to produce two usable clusters in the four-band feature space. First, the data is loaded into memory; then the data is reshaped so that each column is a flat array of the values belonging to a single band. Finally, clusters are computed using the Scikit-Learn implementation of the canonical KMeans clustering algorithm.

Now that the positions of the two clusters have been calculated, we can use the trained model to classify each pixel of the original image, or of any other image whose spectral distribution the training imagery represents well.

Use clusters to predict the pixel classes and then compare output to input imagery
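That prediction step might look like the following sketch; as before, a synthetic array stands in for the imagery, and the variable names are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 4-band image standing in for a NAIP tile (real data would be
# loaded from a raster file).
rng = np.random.default_rng(0)
img = rng.random((4, 64, 64)).astype(np.float32)
bands, height, width = img.shape

# Flatten to (pixels, bands) and fit k=2 clusters.
X = img.reshape(bands, -1).T
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Predict a class for every pixel, then reshape the labels back into
# image form so the mask can be compared to the input imagery.
class_mask = model.predict(X).reshape(height, width)
```

The resulting `class_mask` can be displayed side by side with a single input band (for instance with matplotlib's `imshow`) to compare output to input.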

The interesting but potentially frustrating aspect of KMeans unsupervised learning is that one cannot specify the classes the algorithm is working to cluster. KMeans will find positions within the feature space that separate pixels into different classes. However, the “classes” are not guaranteed to be impervious or non-impervious. The clusters will simply represent groups that are easily separable in feature space. For example, in the figures below, I present the first band from the imagery and the predicted class mask from KMeans.

When I look at this, I see that the algorithm has been able to detect some impervious surfaces, such as a few long stretches of road and sprawling concrete areas. The clustering becomes less consistent in areas with heavy vegetation or tertiary classes, such as bodies of water. I've found KMeans to be a great starting point for working with unlabeled imagery data. In the past it has shown the promise of feature-based separability: that is, it hints at whether, if the analyst were to put time into labeling data from this environment, a more powerful supervised model could succeed at performing high-grade semantic segmentation. While simple and perhaps naive, KMeans can help guide decision making when thinking about using label-hungry learning methods.

To gain a good intuition for what each step of the algorithm does, I recommend playing around with this online interface:

https://www.naftaliharris.com/blog/visualizing-k-means-clustering/

I’ll be writing a few more posts on different problems, solutions and stories that I’ve stumbled upon while working with the University of Vermont Spatial Analysis Lab!
