JOURNEY INTO THE WORLD OF IMAGE PROCESSING (CHAPTER 5 OUT OF 9)

Color Image Segmentation using Python (Part 1)

7 min read · May 9, 2023

In this part of the journey, we will discuss how objects in an image can be segmented according to their color. This powerful preprocessing technique separates an image’s foreground from its background. The topic is split into two parts; this first part covers Thresholding, the RGB Color Space, and the HSV Color Space.

Throughout the discussion, we will use the following libraries.

import numpy as np
import matplotlib.pyplot as plt
from skimage.io import imread, imshow
from skimage.color import rgb2gray, rgb2hsv
from skimage.filters import threshold_otsu

Thresholding

As shown in the previous blog post, we can measure different properties of the objects in an image. To calculate them, however, we first need to binarize the image (convert its values to 0 or 1).

The most crucial part of binarization is the selection of the threshold. From the grayscale image, we need to determine the boundary at which values equal to or above it are set to 1 and values below it are set to 0. Previously, we found this threshold by trying several values until we were satisfied that the foreground object was separated from the background. But imagine having to repeat this for many images. The process would be time-consuming.
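As a minimal sketch of this binarization rule, here it is applied to a made-up 2×3 grayscale array. The pixel values and the threshold are illustrative only and do not come from any of the images in this post.

```python
import numpy as np

# Made-up grayscale values in the range [0, 1]
gray = np.array([[0.10, 0.45, 0.90],
                 [0.80, 0.79, 0.81]])
th = 0.80

# Values equal to or above the threshold become 1, the rest become 0
binary = (gray >= th).astype(np.uint8)

# For a dark object on a bright background, flip the comparison
# so that the object (not the background) is marked True
dark_foreground = gray < th

print(binary)
print(dark_foreground)
```

Which comparison to use depends on whether the object is brighter or darker than its background.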

Otsu’s Method removes the need for this manual trial and error. Conceptually, it automatically tests different threshold values and returns the one that best splits the pixels into two classes. The optimal threshold maximizes the between-class variance (how far apart the mean pixel values of the object and the background are), which is equivalent to minimizing the within-class variance (how spread out the pixel values are within the object and within the background). This is implemented in the skimage.filters module as the function threshold_otsu.
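To make the idea concrete, below is a simplified NumPy sketch of the search that Otsu’s method performs. It is not the skimage source code; the function name `otsu_threshold_sketch` and the synthetic test image are made up for illustration. In practice, `threshold_otsu` is the one to use.

```python
import numpy as np

def otsu_threshold_sketch(gray, nbins=256):
    """Return the bin center that maximizes the between-class variance."""
    counts, edges = np.histogram(gray.ravel(), bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    counts = counts.astype(float)

    # Number of pixels at or below each bin (w0) and at or above it (w1).
    # The first and last bins hold the min and max pixel, so neither is 0.
    w0 = np.cumsum(counts)
    w1 = np.cumsum(counts[::-1])[::-1]

    # Mean pixel value of each class for every possible split
    m0 = np.cumsum(counts * centers) / w0
    m1 = (np.cumsum((counts * centers)[::-1]) / w1[::-1])[::-1]

    # Between-class variance (up to a constant factor) when the
    # split falls between bin i and bin i + 1
    var_between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(var_between)]

# Synthetic image: a dark 10x10 square on a bright 20x20 background
img = np.full((20, 20), 0.9)
img[5:15, 5:15] = 0.2

t = otsu_threshold_sketch(img)
mask = img < t  # dark object marked True
print(f"threshold = {t:.3f}, object pixels = {mask.sum()}")
```

Because every candidate split is scored in one pass over the histogram, no manual tuning is needed.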

# Load Original Image
img_test1 = imread('img_test.jpg')
img_test1_gs = rgb2gray(img_test1)

# After several rounds of trial and error, this was the best threshold.
# Pixels darker than the threshold are marked True (foreground).
th = 0.80
img_test1_bn = img_test1_gs < th

# Using Otsu's Method
th_otsu = threshold_otsu(img_test1_gs)
img_test1_otsu = img_test1_gs < th_otsu

# Plot
fig, ax = plt.subplots(1, 3, figsize=(18, 6))
fig.subplots_adjust(wspace=-0.5)
ax[0].set_title("Original Image")
ax[0].imshow(img_test1)
ax[0].set_axis_off()
ax[1].set_title(f"Trial and Error (Threshold = {th:.2f})")
ax[1].imshow(img_test1_bn, cmap='gray')
ax[1].set_axis_off()
ax[2].set_title(f"Otsu's Method (Threshold = {th_otsu:.2f})")
ax[2].imshow(img_test1_otsu, cmap='gray')
ax[2].set_axis_off()

**Comparison of Trial and Error with Otsu’s Method in Image Binarization.** (Pineapple image by Fernando Andrade on Unsplash)

In the output, we can see that, without any trial and error, the threshold found by Otsu’s Method already separates the foreground from the background. But this is not always the case.

# Load Original Image
img_test2 = imread('BP_ls.png')
img_test2_gs = rgb2gray(img_test2[:, :, :3])

# After several rounds of trial and error, this was the best threshold
th = 0.95
img_test2_bn = img_test2_gs < th

# Using Otsu's Method
th_otsu = threshold_otsu(img_test2_gs)
img_test2_otsu = img_test2_gs < th_otsu

# Plot
fig, ax = plt.subplots(1, 3, figsize=(18, 6))
fig.subplots_adjust(wspace=-0.3)
ax[0].set_title("Original Image")
ax[0].imshow(img_test2)
ax[0].set_axis_off()
ax[1].set_title(f"Trial and Error (Threshold = {th:.2f})")
ax[1].imshow(img_test2_bn, cmap='gray')
ax[1].set_axis_off()
ax[2].set_title(f"Otsu's Method (Threshold = {th_otsu:.2f})")
ax[2].imshow(img_test2_otsu, cmap='gray')
ax[2].set_axis_off()

**Comparison of Trial and Error with Otsu’s Method in Image Binarization.** (Image of Blackpink light stick by Weverse in their official store)

Otsu’s method cannot always binarize an image reliably. In the case shown above, the head of the hammer is pink, while the handle is black. In grayscale, the pink is closer to the white background than to the black handle, so Otsu’s method groups the pink hammer head with the background rather than with the object. We know this is wrong, so a manually tuned threshold works better here.
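This failure can be reproduced with made-up numbers. The sketch below assumes three gray levels (a black handle, a pink head, and a white background) with invented pixel counts, and uses a brute-force, Otsu-style search written only for this example. The name `best_split` is hypothetical.

```python
import numpy as np

def best_split(values):
    """Brute-force Otsu-style search: maximize between-class variance."""
    levels = np.unique(values)
    best_t, best_score = None, -1.0
    # Try a threshold halfway between each pair of neighboring gray levels
    for lo, hi in zip(levels[:-1], levels[1:]):
        t = (lo + hi) / 2
        below, above = values[values < t], values[values >= t]
        score = below.size * above.size * (below.mean() - above.mean()) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Assumed gray levels and pixel counts (illustrative only)
handle = np.full(200, 0.05)       # black handle
head = np.full(300, 0.75)         # pink head, fairly bright in grayscale
background = np.full(1500, 1.00)  # white background

t = best_split(np.concatenate([handle, head, background]))
head_is_foreground = bool((head < t).all())
print(f"threshold = {t:.2f}, pink head kept as object: {head_is_foreground}")
```

With these numbers the chosen threshold falls between the black and the pink, so the pink pixels land on the background side, just as in the figure above.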

Another limitation of this type of color image segmentation is that it can only be applied to single objects or multiple objects with similar colors. To address such issues, other methods could be used.

RGB Color Space

Remember from the first part of this Image Processing journey that an image comprises three channels (Red, Green, and Blue). We can therefore segment the image according to the values of these color channels, adjusting the thresholds until the desired segments are highlighted.

We will use an image of Blackpink from their Shutdown Music Video. The code below attempts to segment the outfits in the image.

# Load Image
img = imread('BP_img.jpg')[:, :, :3]
img_gs_1c = rgb2gray(img)

# Plot
fig, ax = plt.subplots(1, 1, figsize=(8, 8))
ax.set_title("Original Image")
ax.imshow(img)
ax.set_axis_off()
plt.show()

**Original Image.** (Image from Blackpink’s Shutdown Music Video)

If we want to highlight the red pixels in this image, we cannot simply keep the pixels with high red values. If we did, we would also get white pixels (high red, high green, high blue) and other colors that contain red. Even when a pixel does not look red, it can still have a high value in its red channel, for example because it is close to white. We must therefore also set thresholds on the other channels to segment the image properly. The thresholds can be adjusted by trial and error until the color segmentation is satisfactory.
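A tiny example shows why one channel is not enough. The three pixels below are made up (a red, a near-white, and a dark gray); the cutoffs match the red mask used in this post.

```python
import numpy as np

# One row of three made-up RGB pixels: red, near-white, dark gray
pixels = np.array([[[220, 30, 40],
                    [250, 250, 250],
                    [60, 60, 60]]], dtype=np.uint8)

# Thresholding the red channel alone also selects the near-white pixel
red_only = pixels[:, :, 0] > 150

# Adding limits on green and blue keeps only the pixel that looks red
red_strict = (red_only &
              (pixels[:, :, 1] < 100) &
              (pixels[:, :, 2] < 200))

print(red_only)    # red and near-white are both selected
print(red_strict)  # only the red pixel remains
```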

# Grayscale image with 3 channels (the value is triplicated)
img_gs = ((np.stack([img_gs_1c] * 3, axis=-1) * 255)
          .astype('int').clip(0, 255))

# Red mask
red_mask = ((img[:, :, 0] > 150) &
            (img[:, :, 1] < 100) &
            (img[:, :, 2] < 200))
img_red = img_gs.copy()
img_red[red_mask] = img[red_mask]

# Green mask
green_mask = ((img[:, :, 0] < 190) &
              (img[:, :, 1] > 190) &
              (img[:, :, 2] < 190))
img_green = img_gs.copy()
img_green[green_mask] = img[green_mask]

# Blue mask
blue_mask = ((img[:, :, 0] < 80) &
             (img[:, :, 1] < 85) &
             (img[:, :, 2] > 50))
img_blue = img_gs.copy()
img_blue[blue_mask] = img[blue_mask]

# Plot
fig, ax = plt.subplots(1, 3, figsize=(21, 7))
ax[0].set_title("Red Segment")
ax[0].imshow(img_red)
ax[0].set_axis_off()
ax[1].set_title("Green Segment")
ax[1].imshow(img_green)
ax[1].set_axis_off()
ax[2].set_title("Blue Segment")
ax[2].imshow(img_blue)
ax[2].set_axis_off()
plt.show()

**RGB Color Segmentation overlaid on the grayscale image.** (Image from Blackpink’s Shutdown Music Video)

We can see here that the result is not perfect: we do not capture all of the blue and all of the green in our image. However much we adjust the thresholds, it is hard to segment the image cleanly this way. To address this, we can also look at the image in the HSV color space.

HSV Color Space

Aside from RGB, images can also be represented in the HSV color space. A quick review of its meaning: Hue is the type of color, Saturation is the purity of the color, and Value is the intensity (brightness) of the color.
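To get a feel for the three channels, the sketch below converts a few made-up colors with Python’s standard-library `colorsys` module. It is used here only so that the example runs without an image; like the output of `rgb2hsv`, all three of its values are scaled to the range 0 to 1.

```python
import colorsys

# Made-up RGB colors, each channel scaled to [0, 1]
samples = {
    "pure green": (0.0, 1.0, 0.0),
    "dark green": (0.0, 0.4, 0.0),
    "pale green": (0.6, 1.0, 0.6),
    "white": (1.0, 1.0, 1.0),
}

# Convert each color to (hue, saturation, value)
hsv = {name: colorsys.rgb_to_hsv(*rgb) for name, rgb in samples.items()}

for name, (h, s, v) in hsv.items():
    print(f"{name:>10}: hue={h:.2f}  saturation={s:.2f}  value={v:.2f}")
```

All three greens share the same hue of about 0.33 even though their saturation and value differ, while white has zero saturation. This is why a hue range can pick up a color that a saturation cutoff alone would miss.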

# Convert to HSV
img_hsv = rgb2hsv(img)

# Plot
fig, ax = plt.subplots(1, 3, figsize=(21, 7))
ax[0].set_title("Hue Channel")
ax[0].imshow(img_hsv[:, :, 0], cmap='gray')
ax[0].set_axis_off()
ax[1].set_title("Saturation Channel")
ax[1].imshow(img_hsv[:, :, 1], cmap='gray')
ax[1].set_axis_off()
ax[2].set_title("Value Channel")
ax[2].imshow(img_hsv[:, :, 2], cmap='gray')
ax[2].set_axis_off()
plt.show()

**HSV Channel representation of the Image.** (Image from Blackpink’s Shutdown Music Video)

We can notice here that the background and skin have low saturation values, so we can use saturation as a filter later. What’s special in this image is that the green outfit does not show high saturation, yet we also want to include it. We can add another mask, on the hue channel, to pick up the greens.

# Plot Hue Channel with Colorbar
plt.imshow(img_hsv[:, :, 0], cmap='hsv')
plt.title('Hue Channel with Colorbar')
plt.colorbar()
plt.show()

**Hue Channel of the Image with Colorbar.** (Image from Blackpink’s Shutdown Music Video)

In this image, the green outfit is clearly visible. By combining a green hue mask with the saturation mask, we can now isolate all of the outfits in the image.

# Saturation mask
sat_mask = img_hsv[:, :, 1] > 0.35
img_hsv_mask = img_gs.copy()
img_hsv_mask[sat_mask] = img[sat_mask]

# Green mask (in Hue Channel)
lower_mask = img_hsv[:, :, 0] > 0.25
upper_mask = img_hsv[:, :, 0] < 0.59
green_mask = lower_mask & upper_mask
img_hsv_mask[green_mask] = img[green_mask]

# Plot
fig, ax = plt.subplots(1, 2, figsize=(15, 15))
ax[0].set_title("Original Image")
ax[0].imshow(img)
ax[0].set_axis_off()
ax[1].set_title("Image Segmented on Outfit")
ax[1].imshow(img_hsv_mask)
ax[1].set_axis_off()
plt.show()

**Comparison between Image Segmented on the Outfit vs. Original Image.** (Image from Blackpink’s Shutdown Music Video)

Lastly, we combine those two masks to overlay our final color-segmented image on top of the grayscale image. The final preprocessed image now clearly shows the different outfits simultaneously, something that is harder to achieve in the RGB color space, although it is still doable by applying segmentation multiple times and then combining the results.
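Applying the two masks one after the other, as in the code above, gives the same result as merging them with a logical OR and applying the merged mask once. Here is a minimal sketch on four made-up pixels. It uses `colorsys` and a plain channel mean as stand-ins for `rgb2hsv` and `rgb2gray` so that it runs without an image file; the cutoffs are the ones used in this post.

```python
import colorsys
import numpy as np

# Made-up 1x4 image: saturated red, pale green, mid gray, white
rgb = np.array([[[200, 20, 20],
                 [150, 200, 150],
                 [128, 128, 128],
                 [255, 255, 255]]], dtype=np.uint8)

# HSV per pixel, each channel in [0, 1] (stand-in for rgb2hsv)
hsv = np.array([[colorsys.rgb_to_hsv(*(px / 255.0)) for px in row]
                for row in rgb])

# Grayscale copy with 3 channels (channel mean as a stand-in for rgb2gray)
gray = rgb.mean(axis=-1).astype(np.uint8)
gray3 = np.stack([gray] * 3, axis=-1)

# Same cutoffs as in the post
sat_mask = hsv[:, :, 1] > 0.35
green_mask = (hsv[:, :, 0] > 0.25) & (hsv[:, :, 0] < 0.59)

# A pixel keeps its color if either mask selects it
combined = sat_mask | green_mask
result = np.where(combined[..., None], rgb, gray3)

print(combined)
```

The pale green pixel fails the saturation cutoff but is recovered by the hue range, which is the same role the green mask plays for the outfit in the image.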

The techniques discussed here have limitations: they are sensitive to lighting and less robust when an object contains a combination of colors. This will be addressed by using the RG Chromaticity color space, to be discussed in the second part of this topic.

Key Takeaways

An image’s object and background can be separated by color segmentation. Trial and error in thresholding can yield satisfactory results, but it does not scale to many images. Otsu’s method automates the choice of threshold, but it can sometimes misclassify the object and the background. When multiple objects are present in the image, we can use the RGB and HSV color spaces. Both work similarly: you must find the right mask to isolate the object and then apply it to the image.