Image Classifier using FastAI and Google Colab

swapp19902
10 min read · Jan 28, 2019

This blog post records my progress on the free deep learning course by Jeremy Howard. In it we will see how to classify whether an image shows a rabbit or a racoon. I will show how to create a standard image dataset, use Google Drive together with Google Colab, use the fastai library to build a CNN from a pretrained ResNet model, and finally predict whether the following image is a racoon or a rabbit.

What is a CNN?

A CNN (Convolutional Neural Network) is a neural network that learns a set of filters to apply to the input image. The pattern of filter responses (activations) in each layer is what the network uses to classify the image. Earlier layers generally learn filters that detect edges and simple geometric shapes, while the later, deeper layers learn filters that respond to more detailed features such as faces and eyes.
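To make the idea of a filter concrete, here is a minimal sketch in plain Python. It is not how fastai or PyTorch implement convolutions; the function name, the toy 4x4 image, and the hand-written edge filter are all made up for illustration. In a real CNN the filter values are learned during training rather than written by hand.

```python
# A tiny 2-D convolution in plain Python (no padding, stride 1).
# Like deep learning libraries, it slides the kernel without flipping it.
def convolve2d(image, kernel):
    """Slide `kernel` over `image` and return the map of filter responses."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 "image": dark (0) on the left half, bright (1) on the right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written vertical-edge filter: responds where brightness changes left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)  # each row prints [0, 2, 0]
```

The filter only "fires" (value 2) in the middle column, exactly where the dark half meets the bright half. That is the sense in which an early layer detects edges.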

Setup

  1. Follow these setup instructions to get started with Google Colab: https://course.fast.ai/start_colab.html
  2. Create a new Python 3 notebook named “IC1” and install the fastai library:
!curl -s https://course.fast.ai/setup/colab | bash
  3. Add a new code cell and run the following:
from fastai import *
from fastai.vision import *
import numpy as np

If this runs without error, the fastai library has been installed successfully.

4. Go to Runtime in the menu and click “Change runtime type”. Set “Hardware accelerator” to “GPU” if it is not already set. This runs the computations on a GPU, which is much faster.

Why Google Colab?

While learning ML algorithms I have used various tools and platforms: running Jupyter notebooks locally on a Mac, AWS SageMaker, and Crestle. Google Colab works well if you are a beginner who just wants to start learning without spending too much time on setup. It also gives you one GPU server for free, although I haven’t checked how long I can use it. AWS has GPUs too, but they are expensive, and you have to remember to stop the instance or expect a huge bill. Running locally is an option, but you need a CUDA-capable GPU if you want good performance. From the one week I have used Google Colab, it works much like Jupyter, but a lot of Jupyter functionality is missing, which I hope they add soon.

Upload Image Dataset to Google Drive

You can upload images to your Google Drive and mount the drive in Colab. To download images I used google_images_download, a Python script that downloads images to your local machine based on the arguments you give it. To do this, open a local Python session (I use a local Jupyter notebook).

  1. Install google_images_download
!pip install google_images_download

2. Then pass the search terms “Rabbit” and “Racoon” and the number of images to download as arguments.

from google_images_download import google_images_download

response = google_images_download.googleimagesdownload()
arguments = {"keywords": "Rabbit,Racoon", "limit": 60, "print_urls": False}
paths = response.download(arguments)
print(paths)

This will download up to 60 images per keyword into two folders, “Rabbit” and “Racoon”, under the directory “jupyter_root/downloads”.

3. The standard format for fastai is two folders: “train” and “valid”. Inside these folders, create one folder per class, in our case “rabbit” and “racoon”. I created this folder structure in Google Drive under “fastai/IC1”.

fastai
|-- IC1 (Notebook name)
    |-- train
    |   |-- rabbit (47 Images)
    |   |-- racoon (47 Images)
    |-- valid
        |-- rabbit (10 Images)
        |-- racoon (10 Images)

4. Check the downloaded images locally and remove any that are broken. Then choose 10 images per class for the valid dataset and put the rest in the train dataset.

The reason for using two datasets is that the train dataset is used to adjust the weights of the CNN so that it becomes good at predicting the correct class, by minimizing the loss with Stochastic Gradient Descent and backpropagation, which are the backbone of neural network training. The valid set is then used to check that the network is not overfitting to the training data. The idea is that the valid dataset is fresh data which the network has never seen before, and the network should still predict its classes accurately.
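The manual split in step 4 could also be scripted. The following is a rough sketch using only the Python standard library; the function name split_class_folder and its parameters (n_valid, seed) are hypothetical helpers of my own, not part of fastai or google_images_download. It copies files rather than moving them, so the original downloads stay untouched.

```python
import random
import shutil
from pathlib import Path


def split_class_folder(src_dir, dest_root, class_name, n_valid=10, seed=0):
    """Copy images from src_dir into dest_root/{train,valid}/class_name.

    n_valid randomly chosen files go to "valid", the rest go to "train".
    Returns the number of files copied into each split.
    """
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    random.Random(seed).shuffle(files)  # fixed seed keeps the split reproducible
    splits = {"valid": files[:n_valid], "train": files[n_valid:]}
    for split, split_files in splits.items():
        target = Path(dest_root) / split / class_name
        target.mkdir(parents=True, exist_ok=True)
        for f in split_files:
            shutil.copy2(f, target / f.name)
    return {split: len(split_files) for split, split_files in splits.items()}


# Example usage (paths are assumptions, adjust them to your setup):
# split_class_folder("downloads/Rabbit", "fastai/IC1", "rabbit")
# split_class_folder("downloads/Racoon", "fastai/IC1", "racoon")
```

Remove broken images before running it, since the script does not check whether a file is a valid image.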

Connect Google Drive to Google Colab notebook

from google.colab import drive
drive.mount('/content/drive/')

Running this command prints a link. It opens a new page where you authorize access to your Google Drive and receive a long token string. Paste the token into the prompt in the notebook, and your Google Drive is connected: you can now access any files you upload to it.
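Once the drive is mounted, it is worth checking that the folders look the way fastai expects before training. Here is a small sketch using only the standard library; count_images and IMAGE_SUFFIXES are names I made up for this example, and the list of file extensions is an assumption about what your downloads contain.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your downloads use others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Return {split: {class_name: image_count}} for train/valid under root."""
    counts = {}
    for split in ("train", "valid"):
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            continue  # a missing split simply does not show up in the result
        counts[split] = {
            class_dir.name: sum(
                1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return counts


# Example usage in Colab after mounting the drive:
# print(count_images('/content/drive/My Drive/fastai/IC1'))
```

With the folder layout described earlier, this should report 47 images per class under train and 10 per class under valid. If a split or class is missing from the output, the upload or the folder names are the first thing to check.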

FastAI approach to Image Classifier

FastAI is a library that provides a concise interface to common machine learning tasks such as image classification, needing very few lines of code while still delivering state-of-the-art (SOTA) performance.

  1. Create data which is ready to be passed to a CNN
tfms = get_transforms(do_flip=False)
path = '/content/drive/My Drive/fastai/IC1'
data = ImageDataBunch.from_folder(path,ds_tfms=tfms, size=100)

path: the path to the image dataset. Make sure it has train and valid folders, with the images inside them in class-specific folders (rabbit and racoon).

ImageDataBunch: a helper class that makes it easy to do ML tasks on a dataset of images. It has various helper functions which create PyTorch-based objects. I will go into more detail on this in the next post. For our example, we will use:

ImageDataBunch.from_folder(path=?,ds_tfms=?,size=?)

from_folder: works if the dataset is categorized using folders. There are other ways to categorize the images and ImageDataBunch supports most of them, but I find this the most straightforward approach.

tfms: Transforms provide simple image augmentation. They apply basic random transforms to the images, which increases the variety of images the network sees during training. These transforms do things like translation, rotation, and color changes. In get_transforms you can pass additional arguments to enable specific transforms or to prevent some of them. E.g. do_flip=False prevents random flipping of the image. By default fastai only flips horizontally (mirror images), so keeping the default would also be reasonable here; I turned it off on the assumption that we won't be predicting on flipped images. More augmentation generally helps reduce overfitting.

size: size=100 resizes and crops each image to 100 pixels in width and height. A CNN needs all the images in a batch to be the same size. Cropping may remove important features, like the face of the animal, because it keeps the center of the image, so we have to be careful with the size. Make sure the downloaded images have the object you want to classify near the center. A bigger size lets the network see more detail, but takes more time and computation to train.
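To see how much of an image a center crop can throw away, here is a simplified arithmetic sketch. It assumes a "scale the shorter side to size, then crop the center" strategy; this is my simplified picture of the idea, not fastai's actual resizing code, and center_crop_box is a hypothetical helper.

```python
def center_crop_box(width, height, size):
    """Scale so the shorter side equals `size`, then find the centered crop.

    Returns (scaled_width, scaled_height, (left, top, right, bottom)).
    """
    scale = size / min(width, height)
    scaled_w = round(width * scale)
    scaled_h = round(height * scale)
    left = (scaled_w - size) // 2
    top = (scaled_h - size) // 2
    return scaled_w, scaled_h, (left, top, left + size, top + size)


# A wide 400x200 photo cropped to 100x100:
print(center_crop_box(400, 200, 100))  # (200, 100, (50, 0, 150, 100))
```

In this example the photo is scaled to 200x100 and only the middle 100 columns are kept, so an animal standing near the left or right edge would be partly or fully cut out.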

2. View some data to make sure the images are loaded/processed correctly

data.show_batch(rows=3, figsize=(10,10))

3. Train a CNN on the data

my_trained_mod = create_cnn(data, models.resnet34, metrics=error_rate)
my_trained_mod.fit_one_cycle(6)

These two lines of code are the crux of the image classifier. FastAI makes it easy to create a CNN using create_cnn (renamed cnn_learner in later fastai v1 releases). We pass in the data that we created earlier using ImageDataBunch, and a pretrained model (models.resnet34). After fit_one_cycle has run, my_trained_mod holds the trained model, which can be used to predict the class of new images.

Why Pretrained Model?

Pretrained models are models, or basically network weights, that have already been trained. E.g. ResNet34 is a 34-layer CNN that has already been trained on the ImageNet dataset. ImageNet is a dataset with millions of images; the pretrained weights come from its widely used subset of roughly 1.2 million images in 1,000 classes. Using this pretrained model saves a lot of time and computation, as training from scratch with only about 50 images per class will not give a good result. A network trained on that many images has already learnt a lot about the world, since it has seen animals and other real-world objects. Pretrained models are used through Transfer Learning: the earlier layers are kept frozen and only the last one or two layers are trained, so that the network works fairly accurately with only a few images.

FastAI supports many pretrained models; ResNet is a strong, widely used architecture that works well for a lot of classifiers.

4. Once you run the above code, the CNN runs for 6 epochs, and you can see the error_rate output

First, FastAI downloads the resnet34 model and saves it to a local cache folder. It uses the weights from this model as the starting point for training.

We can see that the error_rate, i.e. the percentage of validation images it got wrong, decreased over the epochs from 45% to 35%. That means that after training for 6 epochs, the model classifies a racoon/rabbit image correctly about 65% of the time. Here only the weights of the last layers, added on top of the original ResNet model, are modified. Training took less than 1 minute. If we were to train all the layers from scratch it might take hours, and since the training dataset is really small, the result would be really poor.
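The relationship between error rate and accuracy is easy to get backwards, so here is a tiny sketch of the arithmetic. The error_rate function below is my own plain-Python stand-in, not fastai's metric, and the predictions are invented so that 7 of the 20 validation images are wrong.

```python
def error_rate(predicted, actual):
    """Fraction of predictions that do not match the true labels."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


# Hypothetical validation results: 10 rabbit and 10 racoon images, 7 misclassified.
actual = ["rabbit"] * 10 + ["racoon"] * 10
predicted = ["racoon"] * 4 + ["rabbit"] * 6 + ["rabbit"] * 3 + ["racoon"] * 7

err = error_rate(predicted, actual)
print(f"error rate: {err:.0%}, accuracy: {1 - err:.0%}")  # error rate: 35%, accuracy: 65%
```

With only 20 validation images, each misclassified image moves the error rate by 5 percentage points, which is one reason the numbers jump around between epochs on a dataset this small.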

Predict the class of a New Image

Now let’s download a new image from the net and check whether the model predicts it accurately. To do this, search for an image of a rabbit and put it inside the “IC1/custom” folder.

filename = '/content/drive/My Drive/fastai/IC1/custom/rab1.jpg'
img = open_image(filename)
img

You should see the img output. Now pass this img to our trained model.

pred = my_trained_mod.predict(img)
print(pred)
(Category rabbit, tensor(0), tensor([0.7942, 0.2058]))

my_trained_mod.predict takes the input image and predicts which class it belongs to. It also returns the probability for every class: tensor([0.7942, 0.2058]). In our example it gives 79% to rabbit and 21% to racoon, which is not that high, but it works. If there were more classes, there would be one probability per class, and they would all add up to 1.
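Class probabilities like these typically come from a softmax over the network's raw output scores, which is why they always add up to 1. The sketch below shows the idea in plain Python; the raw scores are hypothetical numbers I picked for illustration, not values read from the model.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


classes = ["rabbit", "racoon"]
raw_scores = [1.35, 0.0]  # hypothetical network outputs, one per class

probs = softmax(raw_scores)
predicted = classes[probs.index(max(probs))]
print(predicted, [round(p, 4) for p in probs])
```

The predicted class is simply the one with the highest probability, which matches the index (tensor(0) for rabbit) in the tuple that predict returns.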

Improving the performance of the trained model

Cute Rabbit

An error_rate of 35% is not good and can be improved. In my case, it predicted this image as a “rabbit” when it clearly is a “racoon”.

There are several ways you can improve the training:

  1. Give a more diverse variety of training images and increase the count of the dataset.
  2. Train more layers instead of just the last layer.
  3. Play with the learning rate for each layer instead of using the default learning rate.

All these improvements add cost to your training, so try them as you see fit. FastAI provides a way to find a good learning rate.

my_trained_mod.lr_find()
my_trained_mod.recorder.plot()

This records the loss for a range of different learning rates and plots it for us:

As we can see, the training loss falls as the learning rate is lowered toward 0.01, and gets worse again as the learning rate drops below 0.01. So the best learning rate for our model is about 0.01. This changes based on the model and the dataset.

What is Learning Rate?

Learning rate is the step size with which gradient descent adjusts the weights (using the gradients from backpropagation) to minimize the loss. With a higher learning rate the weights change more rapidly, so training approaches a minimum faster. This speeds up training, but the final loss tends to be higher because the steps can overshoot. For our model, a learning rate of 0.06 has a training loss of 0.18 and a learning rate of 0.01 has a loss of 0.10. So a smaller learning rate clearly gives a lower loss, but it can significantly increase the training time.
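The effect of the learning rate is easiest to see on a toy problem. The sketch below minimizes a single-weight loss, loss(w) = w**2, with plain gradient descent; it is an illustration I made up, not the optimizer fastai uses, and the three learning rates are arbitrary choices for this toy loss.

```python
def gradient_descent(lr, steps=20, w=5.0):
    """Minimize loss(w) = w**2 with plain gradient descent; return the final loss."""
    for _ in range(steps):
        grad = 2 * w       # derivative of w**2 with respect to w
        w = w - lr * grad  # the learning rate scales every update
    return w ** 2


# Too small, reasonable, and too large for this particular toy loss.
for lr in (0.001, 0.1, 1.1):
    print(f"lr={lr}: final loss = {gradient_descent(lr):.4g}")
```

With the tiny rate the loss barely moves from its starting value of 25 in 20 steps, with the middle rate it drops close to zero, and with the large rate every step overshoots the minimum and the loss blows up. Real networks behave less neatly, but the same trade-off is what the learning rate plot is showing.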

We can also give different layers different learning rates: a lower rate for the earlier, pretrained layers, which already hold useful general features, and a higher rate for the last layers, which are new. We can do this easily using fastai as follows:

my_trained_mod.fit_one_cycle(3, max_lr=slice(1e-6,1e-1))

This trains our model for 3 more epochs. With the slice, the earliest layer group gets a maximum learning rate of 1e-6, the last group gets 1e-1, and the groups in between get values spread between those two. Note that the earlier layers are only updated if they have been unfrozen with my_trained_mod.unfreeze(); otherwise only the last group trains. This small step significantly improves the performance, as we see below:

It took only 18 seconds to train 3 epochs, and the error rate improved to 5%. The maximum learning rate for the last layers is now much higher than the default of 0.003, which lets them move toward a low loss in far fewer epochs.
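To get a feel for how a range like slice(1e-6, 1e-1) turns into one learning rate per layer group, here is a sketch that spaces the rates by a constant multiplier, which as far as I understand is how fastai v1 spreads them. The function name is mine, and the count of three layer groups is an assumption for illustration; check len(my_trained_mod.layer_groups) for the real number.

```python
def spread_learning_rates(start, stop, n_groups):
    """Return n_groups learning rates from start to stop, each a constant multiple of the previous."""
    if n_groups == 1:
        return [stop]
    step = (stop / start) ** (1 / (n_groups - 1))
    return [start * step ** i for i in range(n_groups)]


# Assumed: three layer groups (earliest layers first, new head last).
for group, lr in enumerate(spread_learning_rates(1e-6, 1e-1, 3)):
    print(f"layer group {group}: max_lr ~ {lr:.2e}")
```

Under that assumption the middle group lands near 3e-4, so the pretrained layers are nudged very gently while the new final layers take much larger steps.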

So, now if I predict my wrong “racoon” image with this new and improved trained model, it predicts the class correctly. Yay!

Finally, this is the prediction for my original image of Rocket Racoon:

It predicted the image as a “racoon” with 99.5% probability. I would say that is pretty good. Take that, Thor!

Next Steps:

  1. Continue the lecture series, which goes in-depth into the fastai library and also creates CNNs from scratch. Hopefully I will find time to write more blog posts to record my learnings.
  2. As I have experience as a frontend dev and with AWS, I am planning to create a website that allows anyone to create their own classifier without writing any code, using AWS SageMaker and API endpoints.
  3. Also, create a script to build your own image datasets directly from the browser.
  4. Create an image classifier with more classes and use interpretability tools to see what’s going on inside the layers of the CNN.
