Fruit Yield Assessment from Photos with Machine Learning and scikit-image

Li-Ting Liao
Published in Dev Diaries
7 min read · Jun 26, 2020
Source: Council of Agriculture, Taiwan

What I’m going to make and Why

I’ve long had an idea about helping farmers count their yields more efficiently. Wouldn’t it be awesome if they could take a quick glance at their orchards and, “ding!”, get the answer right away?

With that ambition in mind, I’ll start small: use one photo and train a model to count how many fruits are on a tree for me. A quick application like this could help farmers when they need to plan how many workers to hire for the harvest, and it could help government officials collect field pictures to assess a season’s yield of a certain fruit and monitor the potential supply in the market ahead of time.

So which fruit am I trying to use here? Let me give you a hint. Guess which dessert best represents Taiwan?

https://skyticket.com/guide/23192

Other than pearl milk tea or pineapple cakes, “xuehua ice”, known in English as “snowflake ice” or “shaved ice”, is a popular local frozen treat that has been listed as one of the top 50 desserts in the world, according to an article on CNN. Once you get a pile of crumbling ice flakes in your bowl, you can top it with your favorite fresh fruit, such as mango, along with syrup or sweetened condensed milk.

It’s also recommended on our government’s tourism website as one of the must-eat dishes if you’re here. Check it out at the link below.

https://www.travel.taipei/en/must-visit/snacks-top10

The answer seems quite obvious now. So I’ll use mango trees for this exercise!

There are two other reasons why I chose mangoes: first, they are one of the most important export fruits in Taiwan; second, they turn red when they ripen. I assume the red color will help the machine learning model I’m going to use, which works on color pixels, to easily recognize a ripe mango here and there.

Here are a few quick snapshots I made to give you an idea of the overall picture of Taiwan’s agricultural export trade. Our main exports are crops, which include vegetables, fruits, flowers, tea leaves, etc., and fruits make up the major part:

Data Source: https://agrstat.coa.gov.tw/sdweb/public/trade/TradeCoa.aspx

We can see that mango is among the top 3 export fruits in Taiwan:

Data Source: https://agrstat.coa.gov.tw/sdweb/public/trade/TradeCoa.aspx

So using mangoes for this exercise could have real market potential down the road. Umm, who knows :)

The following steps

  1. Template image selection from mouse clicks
  2. Template matching with the original picture
  3. Cluster analysis for fruit counting

So, let’s get started.

Programming Process

Import required libraries:
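The original embedded code isn’t shown here, but a plausible set of imports for the steps that follow might be (the exact list is an assumption based on the libraries named later in the article):

```python
# Assumed imports for this walkthrough: numpy/matplotlib for arrays and
# plotting, skimage for image I/O and template matching, sklearn for BIRCH.
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
from skimage.feature import match_template
from sklearn.cluster import Birch
```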

Import the image I’m going to use and convert it into arrays of color pixels. I found this picture on someone’s blog:

https://sch0916.pixnet.net/blog/post/38998587

I was really curious what this picture looks like as data, since I knew images can be 2-D or 3-D arrays. I used the following code to check it out:
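A sketch of that inspection. The real photo is replaced here by a synthetic RGB array of the same structure, and the commented filename is a placeholder, not the author’s actual file:

```python
import numpy as np

# With a real file you would load the photo like this:
#   from skimage import io
#   imagen = io.imread("mango_tree.jpg")
# A zero-filled stand-in array has the same 3-D structure:
imagen = np.zeros((480, 640, 3), dtype=np.uint8)

print(type(imagen))   # numpy.ndarray
print(imagen.shape)   # (480, 640, 3): rows (y), columns (x), RGB channels
```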

What I found is that my chosen picture has 3 dimensions: the 1st axis holds the image rows (the y direction), the 2nd holds the columns (the x direction), and the 3rd holds the color channels. For the 3rd axis I will use only the first channel, since using the other two channels made little difference versus using just the first. This structure will matter in the later part of the code.

Then we start clicking on the reddish ripe mangoes in the imported picture. When we click on the picture, the click’s x and y coordinates are stored in a list:

The click event uses matplotlib event connections to create a function that stores points clicked on the image. Once clicked, coordinates are added to the list:
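A minimal sketch of such a click handler, with the list name puntosinteres following the article and a zero array standing in for the real photo:

```python
import numpy as np
import matplotlib.pyplot as plt

imagen = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for the photo
puntosinteres = []  # collected (x, y) click coordinates

def onclick(event):
    # Ignore clicks outside the axes, where xdata/ydata are None
    if event.xdata is not None and event.ydata is not None:
        puntosinteres.append([int(event.xdata), int(event.ydata)])

fig, ax = plt.subplots()
ax.imshow(imagen)
fig.canvas.mpl_connect('button_press_event', onclick)
plt.show()
```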

The clicked points will show their details on the top left corner of the subplot:

Let’s say I have clicked on 6 mangoes. The corresponding coordinates can be printed as below, and the length of the clicked list (puntosinteres) is 6:

Then, if I want to click on more mangoes, i.e. add to what I’ve just clicked, the code snippet looks like this:
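Since the original embed isn’t shown, here is a sketch of what such a snippet might look like: redisplay the photo with the clicks so far marked as red dots, then reconnect the handler to keep collecting clicks (names and the stand-in image are assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt

imagen = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in photo
puntosinteres = [[100, 120], [300, 200]]          # clicks so far

def onclick(event):
    if event.xdata is not None and event.ydata is not None:
        puntosinteres.append([int(event.xdata), int(event.ydata)])

fig, ax = plt.subplots()
ax.imshow(imagen)
# Mark existing clicks so we don't click the same mango twice
ax.scatter([p[0] for p in puntosinteres],
           [p[1] for p in puntosinteres], c='red', s=25)
fig.canvas.mpl_connect('button_press_event', onclick)
plt.show()
```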

From the above code, the previously clicked points show up as red dots on the picture. Now I can click on mangoes other than those 6 dotted ones:

The next step is to take a closer look at my selected templates, i.e. the image patches around the clicked mangoes. This is to make sure I clicked on the right things:

Since I clicked on 3 more, I now have 9 template pictures:
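Cropping a template around each clicked point might look like the sketch below; the 30-pixel window size and the variable names besides puntosinteres are assumptions:

```python
import numpy as np

imagen = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in photo
puntosinteres = [[100, 120], [300, 200]]          # example clicks (x, y)
mitad = 15  # half the template width/height (assumed window size)

plantillas = []
for x, y in puntosinteres:
    # Note: numpy indexing is [row, column] = [y, x], and we keep only
    # the first (red) color channel, as discussed above
    plantillas.append(imagen[y - mitad:y + mitad, x - mitad:x + mitad, 0])
```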

Once I’ve double-checked all the templates, if I want to delete one of the pictures, I can use the code below. Then I check the length of the clicked list (puntosinteres) again:
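A minimal sketch of dropping one click by its index (the index and list contents here are just examples):

```python
# Remove a mistaken click by index, then re-check the list length
puntosinteres = [[100, 120], [300, 200], [50, 60]]
del puntosinteres[1]       # drop the second clicked point
print(len(puntosinteres))  # 2
```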

Here comes the most exciting part. Let’s start matching our templates against the original picture using the color pixel arrays!

Here I’m using scikit-image’s match_template function. I noticed that before passing a template into match_template, its x and y coordinates need to be swapped (similar to what I’ve seen from people using the OpenCV library):

The result variable gives us a matrix of normalized cross-correlation scores between the template and the original picture, showing how well the template matches each location in the original image. We then set a threshold of 0.8: only locations that score higher than this are appended to our result list (listaresultados). Eventually, the result list (listaresultados) holds the coordinates whose color pixels are closest to our clicks on the original picture.

  • Take special note that when we need the coordinates from the outcome, we have to swap them back again to get the right values for both axes.
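The matching step might look like the sketch below. The 0.8 threshold and the names result and listaresultados follow the article; the random image and the template cut from it are toy stand-ins:

```python
import numpy as np
from skimage.feature import match_template

np.random.seed(0)
imagen = np.random.rand(100, 120)        # one color channel of the photo
plantilla = imagen[40:60, 50:70].copy()  # 20x20 patch around a "mango"

# result[row, col] is the normalized cross-correlation score with the
# template's top-left corner placed at (row, col)
result = match_template(imagen, plantilla)

listaresultados = []
for fila, col in zip(*np.where(result > 0.8)):
    # np.where yields (row, column); swap back to (x, y) for plotting
    listaresultados.append([int(col), int(fila)])
```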

Let’s plot the mango coordinates from our matching model directly on the original picture:
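A plotting sketch, with a stand-in image and example coordinates in place of the real match output:

```python
import numpy as np
import matplotlib.pyplot as plt

imagen = np.zeros((480, 640, 3), dtype=np.uint8)    # stand-in photo
listaresultados = [[50, 40], [52, 41], [200, 100]]  # matched (x, y) points

xs = [p[0] for p in listaresultados]
ys = [p[1] for p in listaresultados]
plt.imshow(imagen)
plt.scatter(xs, ys, c='red', s=20)  # overlay matches as red dots
plt.show()
```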

Then we can see below that several points in our result list (listaresultados) actually come from the same mango, which could lead to double counting:

That is why we need a cluster analysis here to solve this issue.

What is Birch? In the sklearn.cluster library there’s a Birch method, a memory-efficient clustering algorithm for large datasets. It builds a Clustering Feature (CF) tree for a given dataset, where each CF summarizes a sub-cluster and holds only the necessary statistics of its data points, so the method does not require keeping the entire dataset in memory.

The usage process includes defining the birch method and fitting it with our input data.

Converting the result list (listaresultados) into a data array to fit the Birch model later on:
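A tiny sketch of that conversion, assuming the names listaresultados and datos:

```python
import numpy as np

listaresultados = [[50, 40], [52, 41], [200, 100]]  # example matches
datos = np.array(listaresultados)  # shape (n_points, 2) for Birch.fit
print(datos.shape)                 # (3, 2)
```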

The most important argument of the Birch model is:

1. threshold: the radius limit for merging a new sample into an existing sub-cluster; if merging would push the sub-cluster’s radius above this value, a new sub-cluster is started.

So we create a Birch model and fit the data array to it:
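A sketch of the clustering step. The settings threshold=20 (pixels) and n_clusters=None are assumptions for illustration, not necessarily the author’s values; n_clusters=None lets Birch report its sub-clusters directly instead of merging them to a fixed count:

```python
import numpy as np
from sklearn.cluster import Birch

# Example match coordinates: two tight pairs plus one lone point
listaresultados = [[50, 40], [52, 41], [200, 100], [201, 102], [350, 60]]
datos = np.array(listaresultados)

modelo = Birch(threshold=20, n_clusters=None)
modelo.fit(datos)

etiquetas = modelo.predict(datos)     # one cluster label per matched point
print(len(set(etiquetas)))            # number of distinct mangoes found
```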

Then we see that the model can now identify 15 mangoes as ripe ones.

  • Watch out for the Birch model’s threshold argument. I noticed that as it gets smaller, the model splits clusters more finely, making it more likely to produce multiple clusters in nearby areas, i.e. to redundantly mark the same mangoes.

And it seems that with only 9 templates, the model was able to recognize 6 more ripe fruits than I clicked on.

There are still some limitations to this procedure; for example, many mangoes are left unidentified. Also, I believe there’s a way to give the model several mango pictures up front, take those as templates, and then ask the model to identify how many mangoes are on the tree above.

Other helpful materials I’ve used for this article:

Also, this article is my first baby step in using machine learning in agriculture. I was fully inspired by this wonderful material:

That’s it. Thanks for reading!


Li-Ting Liao
Dev Diaries

Software developer by day, amateur writer by night. Passionate about both code and creativity, and always seeking new ways to learn and grow.