K-Means Clustering for Image Segmentation using OpenCV in Python

Ali Hassan
Towards Singularity
Nov 16, 2020

Image segmentation is the process of dividing an image into segments based on the characteristics of its pixels. It helps us analyze and understand images more meaningfully. Image segmentation has a wide range of use cases; for example, it is used in the medical industry for faster, more efficient diagnosis and disease detection.

We use various unsupervised image segmentation algorithms to group sets of pixels that possess a certain similarity. In effect, we assign a label to each pixel, and pixels with the same label fall under the same category.

Fig 1 : Segmented image with k=5

Some image segmentation use cases

  • Medical imaging: Image segmentation is considered one of the most essential medical imaging processes, as it extracts the region of interest (ROI) through a semiautomatic or automatic process. It divides an image into areas based on a specified description, such as segmenting body organs or tissues in medical applications for border detection, tumor detection/segmentation, and mass detection.
  • Object detection: Object detection is another interesting area in computer vision, which helps us detect objects of a certain class in digital images and video. Well-researched domains of object detection include face detection and pedestrian detection.
  • Traffic control system: Controlling traffic is a really challenging task for governments today. With the increasing demand for and production of vehicles, vehicle density increases day by day. Using computer vision we can automate a lot in this area, such as analyzing traffic, monitoring traffic violations, controlling traffic and more.
  • Video surveillance: Monitoring every camera in a large security surveillance system is a really time-consuming process. Video surveillance with computer vision capability helps us detect unusual patterns or activities recorded in CCTV footage.

K-Means Clustering

K-means clustering is a method that partitions data points or vectors by assigning each one to the cluster with the nearest mean (centroid). This results in a partitioning of the data space into Voronoi cells. When we apply the k-means clustering algorithm to an image, it treats each pixel as a vector and builds k clusters of pixels. Let’s go through the algorithm in pseudocode.

  1. Choose the number of clusters (K) and obtain the pixels
  2. Initialize the K centroids with randomly chosen pixels
  3. Repeat steps 4 and 5 until convergence or until a fixed number of iterations is reached
  4. For each pixel Pi:
    1. Find the nearest centroid
    2. Assign the pixel to that cluster
  5. For each cluster Ci:
    1. centroid = mean of all pixels assigned to that cluster
  6. End
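
The steps above can be sketched in plain NumPy. This is only a minimal illustration of the pseudocode, not how cv2.kmeans is implemented; the function name kmeans_sketch, its parameters, and the demo_pixels array are assumptions made up for this example.

```python
import numpy as np

def kmeans_sketch(pixels, k, max_iter=100, seed=0):
    """Cluster an (N, 3) float array of pixels into k groups (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct pixels as the initial centroids
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    labels = np.zeros(len(pixels), dtype=np.int64)
    for _ in range(max_iter):  # Step 3: bounded number of iterations
        # Step 4: distance from every pixel to every centroid, then pick the nearest
        distances = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 5: move each centroid to the mean of the pixels assigned to it
        new_centroids = centroids.copy()
        for i in range(k):
            members = pixels[labels == i]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[i] = members.mean(axis=0)
        # Step 3: stop early once the centroids no longer move (convergence)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: two dark pixels and two bright pixels
demo_pixels = np.array(
    [[0, 0, 0], [10, 10, 10], [250, 250, 250], [255, 255, 255]], dtype=np.float32
)
demo_labels, demo_centroids = kmeans_sketch(demo_pixels, k=2)
```

On this toy input the two dark pixels end up in one cluster and the two bright pixels in the other, whichever pixels are drawn as the initial centroids.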

Now It’s Play Time!

First of all, we need to import the libraries required for our image segmentation task. The problem involves a lot of vector calculation, and the default Python data structures are not designed for high-performance vector computation. So we use numpy, the core library for scientific computing in Python, which provides a high-performance multidimensional array object and tools for working with these arrays. For plotting images we use matplotlib, which provides an object-oriented API for embedding plots in our notebook. The most important library here is cv2 (OpenCV), a library designed to solve computer vision problems. Please install these libraries if they are not already available on your system.

# Loading required libraries
import numpy as np
import matplotlib.pyplot as plt
import cv2

Now we need to pre-process our image so that the input pixels are compatible with the cv2.kmeans algorithm. We reshape the image pixels into a 2D array of RGB values and convert each color value to float32.

# load image from images directory
image = cv2.imread('images/lamborghini.jpg')

# Change color to RGB (from BGR)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Reshape the image into a 2D array with one row per pixel and 3 color values (RGB);
# -1 lets numpy infer the number of rows
pixel_vals = image.reshape((-1, 3))

# Convert to float32, as required by cv2.kmeans
pixel_vals = np.float32(pixel_vals)
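
To see what the reshape and conversion do without loading a file, here is a small sketch on a synthetic array. The name tiny_image and its contents are hypothetical stand-ins for a real image.

```python
import numpy as np

# Hypothetical stand-in for a loaded image: 2 rows x 3 columns, 3 channels, uint8
tiny_image = np.arange(18, dtype=np.uint8).reshape((2, 3, 3))

# Same two steps as above: one pixel per row, then convert to float32
tiny_pixels = np.float32(tiny_image.reshape((-1, 3)))

print(tiny_image.shape)   # (2, 3, 3)
print(tiny_pixels.shape)  # (6, 3)
print(tiny_pixels.dtype)  # float32
print(tiny_pixels[0])     # [0. 1. 2.]
```

Each of the 2 x 3 = 6 pixels becomes one row of three color values, which is the layout the clustering step expects.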

Here we use a built-in function, kmeans, available in cv2 to cluster the pixels of our image. Let’s go through the arguments of cv2.kmeans. It takes five main arguments, described below, plus a bestLabels argument that we pass as None. It returns three values: compactness, labels and centers.

CV2.KMEANS Parameters

  • samples : It should be of np.float32 data type, and each feature should be put in a single column.
  • nclusters(K) : Number of clusters required at the end.
  • criteria : The iteration termination criteria. When these criteria are satisfied, the algorithm stops iterating. It should be a tuple of 3 parameters, ( type, max_iter, epsilon ):
    - type : Type of termination criteria. It has 3 flags:
      1. cv2.TERM_CRITERIA_EPS : stop the algorithm iteration if the specified accuracy, epsilon, is reached.
      2. cv2.TERM_CRITERIA_MAX_ITER : stop the algorithm after the specified number of iterations, max_iter.
      3. cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER : stop the iteration when either of the above conditions is met.
    - max_iter : An integer specifying the maximum number of iterations.
    - epsilon : Required accuracy.
  • attempts : Number of times the algorithm is executed using different initial labellings. The algorithm returns the labels that yield the best compactness, and this compactness is returned as output.
  • flags : Specifies how the initial centers are chosen. Normally two flags are used for this: cv2.KMEANS_PP_CENTERS and cv2.KMEANS_RANDOM_CENTERS.

CV2.KMEANS Return Value

  • compactness : The sum of squared distances from each point to its corresponding center.
  • labels : The label array (i.e. labels which denote which pixel belongs to which cluster).
  • centers : The array of cluster centers.
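
To make the compactness definition concrete, the sketch below computes the sum of squared distances by hand with NumPy. The arrays samples, toy_labels and toy_centers are hypothetical toy values; they are shaped the way I would expect the cv2.kmeans outputs to be (labels as an (N, 1) integer array, centers with one row per cluster), which is an assumption of this example.

```python
import numpy as np

# Hypothetical toy inputs, not real cv2.kmeans output
samples = np.array(
    [[0, 0, 0], [2, 0, 0], [10, 10, 10], [10, 12, 10]], dtype=np.float32
)
toy_labels = np.array([[0], [0], [1], [1]], dtype=np.int32)          # one label per sample
toy_centers = np.array([[1, 0, 0], [10, 11, 10]], dtype=np.float32)  # one row per cluster

# Look up the center assigned to each sample, then sum the squared distances
assigned = toy_centers[toy_labels.flatten()]
toy_compactness = float(np.sum((samples - assigned) ** 2))
print(toy_compactness)  # 4.0
```

Every sample here lies exactly one unit from its center, so each contributes 1 to the sum. A lower compactness means the points sit closer to their centers.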

# Termination criteria: stop after 100 iterations or once the required accuracy (epsilon = 0.85) is reached
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.85)

# Choose the number of clusters
k = 5

# Run k-means with 10 attempts and random initial centers (bestLabels is passed as None)
retval, labels, centers = cv2.kmeans(pixel_vals, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

# convert data into 8-bit values
centers = np.uint8(centers)

segmented_data = centers[labels.flatten()] # Mapping labels to center points( RGB Value)

# reshape data into the original image dimensions
segmented_image = segmented_data.reshape((image.shape))

plt.imshow(segmented_image)

We have now carried out the clustering process. Each pixel is assigned to one of the five clusters, and we can get each pixel’s cluster number from the labels array. To view the segmented image, we need to construct it from centers and labels. The centers are in float32 form, so we first convert them back to 8-bit integers. We also need to convert each label to the pixel value of its respective center. Luckily, numpy has an operation for this: indexing an array with a list of indices builds a new array populated from the selected positions. For example, if we have a numpy array a = [1 2 3] and need to construct the array [1 2 3 2 1], we can populate it with the expression a[[0, 1, 2, 1, 0]]. We use the same concept to create pixels from the center values and assigned labels (segmented_data = centers[labels.flatten()]). The last step is reshaping the newly created numpy array to the required format; we get the shape of the original image simply from its shape attribute.
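
The indexing trick can be tried on its own. In this small sketch, palette and toy_labels are hypothetical stand-ins for centers and labels, and the 2 x 2 shape is an assumed image size.

```python
import numpy as np

# The example from the text: index an array with a list of positions
a = np.array([1, 2, 3])
print(a[[0, 1, 2, 1, 0]])  # [1 2 3 2 1]

# Same idea with colors: each label picks a row from a palette of centers
palette = np.array([[255, 0, 0], [0, 0, 255]], dtype=np.uint8)  # red, blue
toy_labels = np.array([[0], [1], [1], [0]])                     # one label per pixel

# Map labels to colors, then restore an assumed 2 x 2 image shape
toy_segmented = palette[toy_labels.flatten()].reshape((2, 2, 3))
print(toy_segmented.shape)  # (2, 2, 3)
```

The result is a 2 x 2 image in which every pixel takes the color of its cluster center, which is what the segmented image above does at full size.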

Links:

Jupyter notebook : Image_segmentation.ipynb
