Create a Plants/Fruits Image Classifier using Fastai v1 in Colab

Learn to create a world-class image classifier in a few minutes

Classify different Plants and Fruits

fast.ai’s new 2019 edition of its Deep Learning course was just released on Jan 25th, 2019. The new course uses Fastai v1 and PyTorch v1.

Arguably this is the best course for learning deep learning right now:

  • It’s completely free and taught by the great and humble Jeremy Howard, who needs no introduction.
  • The course takes a top-down approach, which makes it easy for people from any background to learn deep learning.
  • Its greatest benefit is that people can start applying state-of-the-art deep learning models to their own domain in a short amount of time.

So, I just watched the Lesson 1 video of the course. Jeremy claims anyone can create a world-class image classification model in just 4 lines of code, so I decided to give it a try and see if that’s true. Although I know deep learning, I’m pretending that I don’t and simply following the steps taught in the course.

In the first few lessons, Jeremy uses transfer learning to create world-class image classifiers. Resnet-34 and Resnet-50 are the two architectures from which we are going to do transfer learning.

Creating your own Image dataset

I decided to create my own image dataset. Many methods for downloading images have been shared in the fastai forums, but I found duckgoose to be the simplest. It is a wrapper over the google-images-download module that does some sanity checking and automatically organizes the images into train, valid and test folders, which can then be fed directly into fastai’s ImageDataBunch.

Step 1:

Step 2:

Install duckgoose

!pip install duckgoose

Step 3:

Create your images dataset.

Decide what kind of image classifier you want to develop. Here I'm going to create a plant species classifier. I chose 5 categories of plants and 5 categories of fruits, and I’m going to use duckgoose to download 100 images from Google Images for each of these 10 categories. In the image_classes dictionary passed to duckgoose, the key is the label of the image and the value is the search term.

This will download and organize the images into train, valid and test folders. The output path given to duckgoose will contain these folders, and you can now pass that path to ImageDataBunch.from_folder().
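To make the folder layout concrete, here is a minimal sketch of how downloaded images could be split into train, valid and test folders by hand. This is not duckgoose's code: the function name `split_into_folders`, the split percentages and the seed are all hypothetical, and it assumes the raw images already sit in one sub-folder per label.

```python
import random
import shutil
from pathlib import Path


def split_into_folders(src_dir, out_dir, valid_pct=0.2, test_pct=0.1, seed=42):
    """Copy src_dir/<label>/* into out_dir/{train,valid,test}/<label>/.

    Hypothetical helper for illustration; percentages are assumptions.
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {"train": 0, "valid": 0, "test": 0}
    for class_dir in sorted(p for p in src_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_pct)
        n_test = int(len(files) * test_pct)
        parts = {
            "valid": files[:n_valid],
            "test": files[n_valid:n_valid + n_test],
            "train": files[n_valid + n_test:],
        }
        for split, split_files in parts.items():
            # One folder per split, then one sub-folder per label
            target = out_dir / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for f in split_files:
                shutil.copy2(f, target / f.name)
                counts[split] += 1
    return counts
```

The resulting layout (one folder per split, one sub-folder per label) is the shape that a from_folder-style loader expects, since the label is read from the folder name.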

Train the Image Classifier Model

Step 4:

Create the ImageDataBunch

Creating ImageDataBunch

Step 5: View the data.

As you can see below, some images are not related. For example, instead of downloading the bloodberry plant, it has downloaded some other image.

View your images
Some wrong images will be downloaded for each category. You can either remove them manually or use the ImageCleaner widget from fastai’s widgets module to clean them. Here I have manually removed a few wrong images.

Step 6:

Train the image classifier using Resnet-34 architecture and fit it for 2 epochs. Then save the model.

learn = create_cnn(data, models.resnet34, metrics=[accuracy, error_rate])
Training Stats
We are getting around 84% accuracy.
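The two metrics passed to the learner are two views of the same number: the error rate is simply one minus the accuracy. Here is a tiny plain-Python sketch of that relationship. The function names mirror the metric names, but this is my own illustration on made-up labels, not fastai's implementation.

```python
def accuracy(preds, targets):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(preds, targets))
    return correct / len(targets)


def error_rate(preds, targets):
    """Fraction of predictions that are wrong, i.e. 1 - accuracy."""
    return 1 - accuracy(preds, targets)


# Made-up example labels, for illustration only
targets = ["mango", "papaya", "lemon", "mango", "papaya"]
preds = ["mango", "mango", "lemon", "mango", "papaya"]
print(accuracy(preds, targets), error_rate(preds, targets))
```

So an accuracy of around 84% corresponds to an error rate of around 16%.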

Step 7:

Interpret the results.

  • Check which images have been misclassified. You can remove images that add noise to the dataset. Let’s check the images with the top losses.
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(9, figsize=(15,11))
Images with the highest losses
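To see what "top losses" means, here is a rough sketch that ranks samples by their cross-entropy loss, which is the negative log of the probability the model gave to the true class. The function `top_losses` and the numbers are hypothetical and only illustrate the idea behind the plot.

```python
import math


def top_losses(probs, targets, k=3):
    """Return (loss, sample index) pairs for the k highest cross-entropy losses.

    probs[i] is the list of predicted class probabilities for sample i,
    targets[i] is the index of its true class.
    """
    losses = [(-math.log(p[t]), i) for i, (p, t) in enumerate(zip(probs, targets))]
    return sorted(losses, reverse=True)[:k]


# Made-up predictions for three samples and two classes
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
targets = [0, 0, 1]
for loss, idx in top_losses(probs, targets, k=2):
    print(f"sample {idx}: loss {loss:.3f}")
```

A sample gets a high loss when the model is confident and wrong, which is why mislabeled or unrelated images tend to show up at the top.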

Confusion Matrix:

interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
Confusion Matrix

Most Confused Images:

Most misclassified Images

You can see that mango and papaya look similar in some images, so those images are the ones most often misclassified. Let’s train some more and try to improve our accuracy.
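The "most confused" list is just a count of (actual, predicted) pairs where the two differ, sorted by how often each pair occurs. Here is a minimal sketch of that idea in plain Python. The `most_confused` function here is my own illustration on made-up labels, not the library's code.

```python
from collections import Counter


def most_confused(targets, preds, min_val=1):
    """Return (actual, predicted, count) for wrong predictions, most frequent first."""
    counts = Counter((t, p) for t, p in zip(targets, preds) if t != p)
    pairs = [(t, p, n) for (t, p), n in counts.items() if n >= min_val]
    return sorted(pairs, key=lambda pair: -pair[2])


# Made-up labels, for illustration only
targets = ["mango", "mango", "mango", "papaya", "papaya", "lemon"]
preds = ["papaya", "papaya", "mango", "mango", "papaya", "lemon"]
print(most_confused(targets, preds))
```

The off-diagonal cells of the confusion matrix hold exactly these counts, so this list is a compact way to read the matrix.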

Step 8:

Unfreezing, fine-tuning, and learning rates

Let’s unfreeze and train the whole model.

Training Accuracy has increased after unfreezing the layers

Wow, our accuracy has increased to 89%. Let’s fine-tune the model further.

Learning rate finder:

Load the saved model and run the learning rate finder

Learning Rate Finder
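The idea behind a learning rate finder is to train for a few iterations while increasing the learning rate exponentially, record the loss at each step, and stop once the loss blows up. Below is a toy sketch of that idea on a one-parameter problem (minimizing w**2). The start and end rates, the stopping rule and the function names are my assumptions for illustration, and this is not fastai's implementation.

```python
def lr_schedule(start_lr, end_lr, num_it):
    """Learning rates growing exponentially from start_lr to end_lr."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]


def lr_range_test(start_lr=1e-4, end_lr=10.0, num_it=50):
    """Take one SGD step per learning rate on the toy loss w**2."""
    w = 5.0                      # single toy parameter
    best = float("inf")
    history = []                 # (learning rate, loss) pairs to plot
    for lr in lr_schedule(start_lr, end_lr, num_it):
        loss = w * w
        if loss > 4 * best:      # loss has blown up, so stop early
            break
        best = min(best, loss)
        history.append((lr, loss))
        w -= lr * 2 * w          # gradient of w**2 is 2*w
    return history


history = lr_range_test()
print(f"recorded {len(history)} steps, last lr tried: {history[-1][0]:.3f}")
```

When reading such a plot, you would pick a learning rate from the region where the loss is still falling steeply, well before the point where it starts to climb.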

Now unfreeze and apply discriminative learning rates (a different learning rate for each layer group) for 2 cycles.

Train with Discriminative Learning Rates
Our accuracy has not increased much. Using just the default settings of fastai, we got around 89% accuracy. If we remove the noise in our image dataset and do some more tuning, we can likely get higher accuracy.
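Training with a different learning rate per layer group means the early layers, which hold general features from the pretrained model, get a small learning rate, while the later layers get a larger one. One common way to fill in the rates between a low and a high value is to space them evenly on a log scale. The sketch below assumes that scheme. The function name and the example values are hypothetical, not taken from the notebook.

```python
def spread_learning_rates(lo, hi, n_groups):
    """Spread learning rates from lo (first group) to hi (last group), log-spaced."""
    if n_groups == 1:
        return [hi]
    step = (hi / lo) ** (1 / (n_groups - 1))  # constant multiplier between groups
    return [lo * step ** i for i in range(n_groups)]


# Hypothetical range for a model split into three layer groups
rates = spread_learning_rates(1e-6, 1e-4, 3)
print(rates)
```

With three groups and this range, the middle group ends up at roughly 1e-5, ten times the first group and a tenth of the last.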

Final Step: Prediction/Inference

Export the trained model, so that we can use it for inference.

learn.export()
learn = load_learner('/content/plants/')
img = open_image('/content/indian borage1.jpg')
pred_class, pred_idx, outputs = learn.predict(img)

We are going to upload an image of Indian borage and see if the model predicts it correctly.


Hurray, it has predicted the category correctly.

Here is the Jupyter notebook for the entire code:

A little trick for opening Notebooks in Colab:
If you're using the Google Chrome browser, download the Open in Colab extension. Whenever you open a Jupyter notebook on GitHub, just click the extension and the notebook will open in Colab automatically, so you can start executing the cells right away.

In the next article, we will see how to create a web app for the image classifier and productionize the web app in just a few minutes.