10 Fruits Classification using traditional Machine Learning for beginners

A beginner’s step into the field of computer vision with scikit-learn.

Thinhquyen
6 min read · Aug 18, 2020

Why should you read this article? I'm just an amateur, so if you are a beginner like me thinking of starting (or already struggling with) a similar project, you may find it helpful.

That said, if you happen to be an expert in this field, or you simply have better ideas on how to accomplish certain things in this project, be sure to let me know as well.

Let’s push on a little further and get right down to this project.

Source Code and Datasets

The source code and datasets used in this project can be found at the link given below.

Source code and datasets: Fruits Classification

Data gathering and Exploratory data analysis

  • Data gathering: The photos in this dataset were taken with a smartphone at a supermarket near my dorm.
  • The dataset contains 2,500 images of 10 types of fruit. It is divided into folders, each named after a fruit; for example, the folder named tomato contains images of tomatoes.
  • Each fruit type has 250 images, with the file name format <fruit name>_<index>.jpg, for example tomato_0.jpg.
  • Let’s see our 10 types of fruits.
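The folder layout described above can be walked with a small helper. This is a minimal sketch: the root path `fruits_dataset` and the function name are my own assumptions, not from the project's source code.

```python
import os

# Assumed layout: one folder per fruit, each holding JPEGs
# named <fruit name>_<index>.jpg (e.g. tomato_0.jpg).
def list_dataset(root):
    """Map each fruit label to the sorted list of its image paths."""
    dataset = {}
    for fruit in sorted(os.listdir(root)):
        folder = os.path.join(root, fruit)
        if os.path.isdir(folder):
            dataset[fruit] = sorted(
                os.path.join(folder, name)
                for name in os.listdir(folder)
                if name.lower().endswith(".jpg")
            )
    return dataset
```

With this mapping in hand, the labels come for free from the folder names, so no separate annotation file is needed.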

Traditional ML: Feature Extraction

Converting the original image to a two-color image using the K-Means algorithm

In the steps below, I work through a single image, and then apply the same method to all images.

Example original image:

Image RGB color space

After loading the image, we need to resize it and convert it to the RGB color space (because cv2.imread reads images in the BGR color space).

I decided to resize the image to 200x200. Resizing to a smaller size is not required, but doing so reduces the number of pixels, which shortens the time needed to extract colors from the image and also reduces RAM usage.

img = cv2.resize(img, (200, 200))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

Next, convert the MxNx3 image into a Kx3 matrix, where K = MxN and each row is now a vector in the 3-D RGB space.

vectorized = img.reshape((-1,3))

Convert the uint8 values to float, as this is a requirement of OpenCV's k-means method.

vectorized = np.float32(vectorized)

We are going to cluster with k=2, because we only need two colors: the fruit's color and the image's background color.

OpenCV provides the cv2.kmeans(samples, K, bestLabels, criteria, attempts, flags) function for color clustering.

1. samples: It should be of np.float32 data type, and each feature should be put in a single column.

2. K: Number of clusters required at the end.

3. criteria: The iteration termination criteria; when this criterion is satisfied, the algorithm stops iterating. It should be a tuple of 3 parameters, `(type, max_iter, epsilon)`:

Type of termination criteria. It has 3 flags as below:

  • cv2.TERM_CRITERIA_EPS — stop the algorithm iteration if the specified accuracy, epsilon, is reached.
  • cv2.TERM_CRITERIA_MAX_ITER — stop the algorithm after the specified number of iterations, max_iter.
  • cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER — stop the iteration when either of the above conditions is met.

4. attempts: Flag to specify the number of times the algorithm is executed using different initial labelings. The algorithm returns the labels that yield the best compactness. This compactness is returned as output.

5. flags: This flag is used to specify how the initial centers are chosen. Normally one of two flags is used: cv2.KMEANS_PP_CENTERS or cv2.KMEANS_RANDOM_CENTERS.

criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 2
attempts = 10
ret, label, center = cv2.kmeans(vectorized, K, None, criteria, attempts, cv2.KMEANS_PP_CENTERS)

Output parameters

  • ret: The sum of squared distances from each point to its corresponding center (the compactness).
  • label: The label array, where each element is marked '0', '1', and so on.
  • centers: The array of cluster centers.

Now convert the colors in centers back into uint8.

center = np.uint8(center)

Next, we need to access the labels to regenerate the clustered image.

res = center[label.flatten()]
result_image = res.reshape(img.shape)

result_image is the frame after k-means clustering. See the result below.

Clustered image

It's quite pretty! Now let's apply this method to the other fruits.

Remove background color

As we can see, our image now has two colors: the fruit color and the background color. To simplify these images, I'll remove the background color by setting every pixel whose RGB value matches the background color's to 0.

For this dataset, the background color is always the brighter of the two, so I'll assume the background is the brighter color and remove it. The code below finds the brighter color.

background_color = max(sum(center[0]), sum(center[1]))

If any pixel in the image has a value equal to the brighter color's, I'll set it to 0.

for x in range(img.shape[0]):
    for y in range(img.shape[1]):
        if sum(img[x][y]) == background_color:
            img[x][y] = (0, 0, 0)
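As a side note, the per-pixel loop above can be vectorized with NumPy, which is much faster on 200x200 images. This is a sketch assuming the clustered image and the two cluster colors returned by k-means; the function name is hypothetical.

```python
import numpy as np

def remove_background(img, center):
    """Zero out every pixel belonging to the brighter of the two
    cluster colors (assumed to be the background), without a loop."""
    # The brighter color is the one with the larger channel sum.
    sums = center.astype(int).sum(axis=1)
    background = center[int(np.argmax(sums))]
    mask = np.all(img == background, axis=-1)  # True where background
    out = img.copy()
    out[mask] = 0
    return out
```

Matching the full RGB triple (rather than the channel sum) also avoids accidentally zeroing a fruit pixel whose channels happen to sum to the same value as the background's.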

Let’s see our result.

It all looks fine. With these processed images, the next step is to convert each clustered image into scalars so they can be fed into a machine learning classifier. To do this, I decided to split each clustered image into its 3 color channels (R, G, B). For each channel, I calculate the mean and standard deviation of the pixel values, which results in 6 unique scalar values per image. I then tabulate these into a dataframe to be used as features for the machine learning classifier.
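The six color statistics can be computed in a few lines. A minimal sketch, assuming the input is the background-removed clustered image; the function name is my own.

```python
import numpy as np

def color_features(img):
    """Mean and standard deviation of each of the 3 color channels
    of a clustered image: 6 scalar features per image."""
    feats = []
    for c in range(3):
        channel = img[:, :, c].astype(float)
        feats.append(channel.mean())
        feats.append(channel.std())
    return feats
```

Stacking the returned lists row by row, one per image, gives the first 6 columns of the feature dataframe.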

Canny Edge

I also decided to apply a Canny edge filter to the clustered image, as illustrated below, and then break each 200x200-pixel image down into sections of 40x40 pixels. For each of the 25 resulting sections, I calculate the mean and standard deviation of the pixel values, which results in 50 unique scalar values per image.

So, by calculating the mean and standard deviation of the 3 color channels of the clustered image and of each 40x40-pixel section of the clustered + Canny edge image, our dataframe has 56 feature columns plus 1 column for labels. Now let's build our machine learning classifier.

Traditional ML: Classification Modeling

The above feature extraction process was repeated for all images in both the train and validation datasets. These features were then used in two different classifier algorithms, sklearn.ensemble.RandomForestClassifier and sklearn.svm.SVC. Both models were tuned for accuracy by passing multiple combinations of hyperparameters through sklearn.model_selection.GridSearchCV.
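The modeling step can be sketched as below. The hyperparameter grids are illustrative assumptions on my part, not the exact grids used in the project, and the function name is my own.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def fit_best(X, y):
    """Grid-search a random forest and an SVC on the 56-feature
    matrix X and fruit labels y; return each model's best CV accuracy."""
    searches = {
        "rf": GridSearchCV(
            RandomForestClassifier(random_state=0),
            {"n_estimators": [100, 300], "max_depth": [None, 10]},
            scoring="accuracy", cv=3),
        "svc": GridSearchCV(
            SVC(),
            {"C": [1, 10], "gamma": ["scale", "auto"]},
            scoring="accuracy", cv=3),
    }
    for search in searches.values():
        search.fit(X, y)
    return {name: s.best_score_ for name, s in searches.items()}
```

Comparing the two `best_score_` values (and the corresponding `best_params_`) is how the winning classifier is picked before evaluating on the validation set.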

We obtained the following accuracy scores:

As is clear from the summary scores above, our model did quite well.

Summary and limitations

This accuracy score may seem satisfying, but you should know that this model only works well on images with a plain, bright background. So it can't be used in a forest, or on a tree. And if our dataset had more than 10 types of fruit (20, 30 or more), many fruit types would share the same color or even the same edges, so extracting color and edge features as in this article would certainly stop working. We could apply more complicated techniques to extract features from the images, but a much better approach is deep learning.

Our dataset is simple; that's why I wanted to share it, as an amateur!

Finally, I want to thank my teammates NhatMinh and VoHuyKhoi for their contributions to this work; without them, this project could not have been completed!

Thanks for reading and hope you have a great day.
