Image Processing and Machine Learning

Ralph Caubalejo
Published in Analytics Vidhya
5 min read · Jan 30, 2021

Using Image Processing Techniques to create a Machine Learning Dataset

(Image by Author)

Image processing has been used in several applications, from computer vision to text detection, object detection, and many more.

One of its applications is that you can leverage image processing techniques to create a sample dataset for your machine learning algorithm.

For this article, we will show how we can use simple image processing techniques as a pipeline for our machine learning models.

Suppose you have a random image containing an object that you want to predict. How can we extract the features and information needed as training data for our machine learning models?

To show this, let us load some simple user-generated images.

For this article, I used the following libraries:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from skimage.io import imread, imshow
from skimage.color import rgb2gray
from skimage.measure import label, regionprops
from skimage.filters import threshold_otsu
import warnings
warnings.filterwarnings('ignore')

IMAGE PROCESSING PART

We load our sample images:

#read image
sample1 = imread('number2.png')
sample2 = imread('number3.png')
#convert to gray scale
sample1_g = rgb2gray(sample1)
sample2_g = rgb2gray(sample2)
fig, ax = plt.subplots(1, 2, figsize=(15,15))
ax[0].imshow(sample1_g, cmap='gray')
ax[0].set_title('Sample 1',fontsize=15)
ax[1].imshow(sample2_g, cmap='gray')
ax[1].set_title('Sample 2',fontsize=15)
plt.show()
Figure 1: Sample Images (Image by Author)

Figure 1 shows our sample images, which are handwritten images of the numbers 2 and 3. We will try to segment each of these numbers and extract different parameters that we can feed to a machine learning model.

To start our pipeline and make feature extraction easier, we should binarize the whole image. We can use Otsu's method to do this:

#binarizing using otsu
thresh1 = threshold_otsu(sample1_g)
sample1_b = sample1_g < thresh1
thresh2 = threshold_otsu(sample2_g)
sample2_b = sample2_g < thresh2
fig, ax = plt.subplots(1, 2, figsize=(15,5))
ax[0].imshow(sample1_b,cmap='gray')
ax[0].set_title('Binarized Image of Sample 1',fontsize=15)
ax[1].imshow(sample2_b, cmap='gray')
ax[1].set_title('Binarized Image of Sample 2',fontsize=15)
plt.show()
Figure 2: Binarized Image (Image by Author)

Notice that the binarization cleanly separates the numbers from the background: only the numbers remain as white objects in the image.

The next step is to label each of these white objects. Different blob detection methods could be used for this step. To make it much easier, we can use a connected component function to isolate each number from the others. scikit-image provides a predefined labeling function for this.
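To give some intuition for what Otsu's method does, here is a minimal NumPy sketch of the idea: try every candidate threshold and keep the one that maximizes the variance between the dark and bright classes. The function name otsu_threshold and the synthetic pixel values are assumptions made for illustration only; in the pipeline itself we simply call threshold_otsu from scikit-image.

```python
import numpy as np

def otsu_threshold(gray, nbins=256):
    """Pick the threshold that maximizes the between-class variance."""
    counts, edges = np.histogram(gray.ravel(), bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2

    # Pixel count of the darker class (bins 0..i) and the brighter class (bins i..end)
    weight_dark = np.cumsum(counts)
    weight_bright = np.cumsum(counts[::-1])[::-1]

    # Mean intensity of each class for every candidate split
    mean_dark = np.cumsum(counts * centers) / weight_dark
    mean_bright = (np.cumsum((counts * centers)[::-1]) / weight_bright[::-1])[::-1]

    # Between-class variance when the split is placed after bin i
    variance = weight_dark[:-1] * weight_bright[1:] * (mean_dark[:-1] - mean_bright[1:]) ** 2
    return centers[np.argmax(variance)]

# Hypothetical grayscale values: 50 dark "ink" pixels and 150 bright "paper" pixels
ink = np.linspace(0.1, 0.3, 50)
paper = np.linspace(0.7, 0.9, 150)
gray_demo = np.concatenate([ink, paper])

t = otsu_threshold(gray_demo)
binary_demo = gray_demo < t  # same convention as above: dark pixels become True
print('Threshold:', t, '| foreground pixels:', binary_demo.sum())
```

Because the two groups of pixel values are well separated, the selected threshold falls between them, which is why dark handwriting on a light background binarizes so cleanly.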

#using label to isolate each image
sample1_la = label(sample1_b)
sample2_la = label(sample2_b)
fig, ax = plt.subplots(1, 2, figsize=(15,5))
ax[0].imshow(sample1_la)
ax[0].set_title('Labelled Image of Sample 1',fontsize=15)
ax[1].imshow(sample2_la)
ax[1].set_title('Labelled Image of Sample 2',fontsize=15)
plt.show()
Figure 3: Labelled Image (Image by Author)

Notice in Figure 3 that we were able to label and detect each connected component in the image. The changing color means that it was able to detect each unique number.
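To see what the labeling step does on a small scale, here is a toy sketch using a hand-made binary array instead of our sample images. The array binary_demo is made up for illustration; label is the same scikit-image function used above, which by default treats diagonal neighbors as connected in 2D.

```python
import numpy as np
from skimage.measure import label

# Hypothetical binary image with three separate white blobs
binary_demo = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
], dtype=bool)

# Each connected group of True pixels receives its own integer id; background stays 0
labelled_demo = label(binary_demo)
n_objects = labelled_demo.max()

print(labelled_demo)
print('Objects found:', n_objects)
```

Every pixel of a blob shares one integer, so plotting the labelled array gives each object its own color, as in Figure 3.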

To check this, we can use the regionprops function to count the number of objects detected. By visual inspection, each sample should contain 30 numbers.

sample1_r=regionprops(sample1_la)
sample2_r=regionprops(sample2_la)
print('Number of Data in Sample 1: ',len(sample1_r))
print('Number of Data in Sample 2: ',len(sample2_r))

Number of Data in Sample 1: 30

Number of Data in Sample 2: 30

Indeed! We were successful in detecting every number in each sample image.

We can also visually show how we were able to detect and isolate them. Some useful code is as follows:

fig, ax = plt.subplots(1, 4, figsize=(15,10))
ax[0].imshow(sample1_la,cmap='gray')
ax[0].set_title('Labeled Images',fontsize=15)
for x, y in enumerate(sample1_r[:3]):
    ax[x+1].imshow(y.image, cmap='gray')
    ax[x+1].set_title('Sample 1_' + str(x+1), fontsize=15)
plt.show()
Figure 4: Extracted Sample from Sample 1 (Image by Author)
Figure 5: Extracted Sample from Sample 2 (Image by Author)

Figures 4 and 5 show that we can detect and isolate each numbered sample cleanly, with no overlap between them.

Now that we were able to segment the objects/numbers, it is time to extract features!

We can use the different properties returned by regionprops to extract measurements from the image. We can extract the area, perimeter, and other measurements from each segmented image and put them in a Pandas DataFrame. A sample of the code is as follows:

# Note: newer scikit-image releases rename some of these
# properties (e.g. convex_area -> area_convex)
properties = ['area', 'convex_area', 'bbox_area', 'major_axis_length',
              'minor_axis_length', 'perimeter', 'equivalent_diameter',
              'solidity', 'eccentricity', 'target']
df1 = pd.DataFrame(columns=properties)
count = 0

# Sample 1 holds the number 2, encoded as target 0
y = 0
for x in sample1_r:
    df1.loc[count] = [x.area, x.convex_area, x.bbox_area, x.major_axis_length,
                      x.minor_axis_length, x.perimeter, x.equivalent_diameter,
                      x.solidity, x.eccentricity, y]
    count += 1

# Sample 2 holds the number 3, encoded as target 1
y = 1
for x in sample2_r:
    df1.loc[count] = [x.area, x.convex_area, x.bbox_area, x.major_axis_length,
                      x.minor_axis_length, x.perimeter, x.equivalent_diameter,
                      x.solidity, x.eccentricity, y]
    count += 1

df1.head(5)
Figure 6: Sample Head of Dataframe (Image by Author)

We can see in the sample head in Figure 6 that we were able to extract numerical measurements for each image. These measurements are based on the pixel values and bounding boxes derived from regionprops. We also added a target, which is 0 for the number 2 and 1 for the number 3.
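As an alternative sketch (not the approach used in this article), scikit-image also provides regionprops_table, which returns the measurements as a dictionary of columns that can be passed straight to Pandas. The helper features_from_binary, the synthetic images, and the choice of properties below are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from skimage.measure import label, regionprops_table

def features_from_binary(binary, target):
    """Label a binary image and return one row of measurements per object."""
    props = regionprops_table(
        label(binary),
        properties=('area', 'perimeter', 'eccentricity', 'extent'),
    )
    df = pd.DataFrame(props)
    df['target'] = target  # class label shared by every object in this image
    return df

# Hypothetical stand-ins for the binarized sample images
img_a = np.zeros((10, 20), dtype=bool)
img_a[2:6, 2:6] = True     # 4x4 square
img_a[2:4, 10:16] = True   # 2x6 rectangle
img_b = np.zeros((10, 10), dtype=bool)
img_b[1:9, 1:4] = True     # 8x3 rectangle

df_demo = pd.concat(
    [features_from_binary(img_a, 0), features_from_binary(img_b, 1)],
    ignore_index=True,
)
print(df_demo)
```

This avoids the manual loop and counter, at the cost of being a little less explicit about which measurement goes into which column.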

MACHINE LEARNING PART

Now that we have a simple data frame containing the measurements of each image and its target, we can use it as the dataset for our machine learning model. The machine learning code is as follows:

We set the Data and Target as variables:

X = df1.drop('target', axis=1)
y = df1['target'].astype(int)

Since the image measurements from regionprops vary widely in scale from one parameter to another, we should pass them through a scaler. For this, we can use StandardScaler.

from sklearn.preprocessing import StandardScaler
ssscaler = StandardScaler()
X_scaled = ssscaler.fit_transform(X)
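For intuition, standard scaling subtracts each column's mean and divides by its standard deviation, so every feature ends up with mean 0 and unit variance. The following NumPy sketch reproduces that computation by hand on a made-up feature matrix (X_demo is hypothetical, standing in for columns such as area and solidity).

```python
import numpy as np

# Hypothetical features on very different scales (e.g. area vs. solidity)
X_demo = np.array([
    [100.0, 0.2],
    [200.0, 0.4],
    [300.0, 0.9],
])

col_mean = X_demo.mean(axis=0)
col_std = X_demo.std(axis=0)  # population standard deviation (ddof=0)
X_demo_scaled = (X_demo - col_mean) / col_std

print(X_demo_scaled.mean(axis=0))  # approximately 0 for every column
print(X_demo_scaled.std(axis=0))   # approximately 1 for every column
```

After scaling, a large-valued measurement such as area no longer dwarfs a ratio such as solidity.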

After scaling, we should split the data into train and test to see if the model can generalize well using the parameters.

from sklearn.model_selection import train_test_split 
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, stratify=y)
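The stratify=y argument keeps the class proportions roughly the same in the train and test sets, which matters for a dataset as small as ours. Here is a small sketch on made-up data of the same size (60 rows, 30 per class); X_demo, y_demo, and random_state=0 are assumptions added for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 30 samples of class 0 followed by 30 samples of class 1
X_demo = np.arange(60).reshape(60, 1)
y_demo = np.array([0] * 30 + [1] * 30)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, stratify=y_demo, random_state=0
)

# Count how many samples of each class landed in each split
train_counts = np.bincount(y_tr)
test_counts = np.bincount(y_te)
print('Train class counts:', train_counts)
print('Test class counts: ', test_counts)
```

With the default split, 25% of the rows go to the test set, and both classes stay close to evenly represented on each side.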

After splitting, we feed the training data to a machine learning model. For this article, we will use a Random Forest Classifier with a max depth of 15 and 500 estimators.

from sklearn.ensemble import RandomForestClassifier
RF = RandomForestClassifier(max_depth=15, n_estimators=500)
RF.fit(X_train,y_train)

After fitting the model on the train set, we compute the accuracy of our classifier on both the train and test sets.

acc_train = RF.score(X_train,y_train)
acc_test = RF.score(X_test,y_test)
print('Train Accuracy at ',acc_train)
print('Test Accuracy at ',acc_test)

The results show a test accuracy of 80%, which is already good considering that our chance criterion is only around 62.5% (1.25 times the 50% proportional chance for two balanced classes).
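The 62.5% figure can be reproduced with a short calculation, assuming it refers to 1.25 times the proportional chance criterion (the sum of the squared class proportions). The function name below is made up for illustration.

```python
from collections import Counter

def proportional_chance_criterion(labels):
    """Sum of squared class proportions: accuracy expected from proportional guessing."""
    n = len(labels)
    return sum((count / n) ** 2 for count in Counter(labels).values())

# Our dataset: 30 images of the number 2 (target 0) and 30 of the number 3 (target 1)
targets = [0] * 30 + [1] * 30

pcc = proportional_chance_criterion(targets)
benchmark = 1.25 * pcc

print('Proportional chance criterion:', pcc)
print('1.25 x PCC benchmark:', benchmark)
```

With two balanced classes, the proportional chance is 0.5, so the benchmark to beat is 0.625.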

SUMMARY

To conclude, we were able to use different image processing techniques to create a sample dataset for our machine learning model. We used segmentation and labeling functions to extract different parameters that can be used for a simple classifier model. With this, we can see that image processing techniques can be used as a way to extract different parameters from images.

Stay tuned for more articles!
