Train a Choripan Classifier with Fast.ai v1 in Google Colab

This quick tutorial uses information available throughout the Fast.ai forums, docs, and GitHub to give you an overview of how to train your own classifier with a GPU, for free, in Google Colab. A follow-up will cover how to serve this PyTorch model in production.

Set Up a Google Colab Notebook

Go to the Google Colaboratory website and create a new notebook. Once inside, click Edit > Notebook settings and set Runtime type: Python 3 and Hardware accelerator: GPU.
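
To confirm Colab actually assigned you a GPU, you can query the NVIDIA driver from a notebook cell. This is a quick optional check; the exact GPU model you get varies.

# Should print a table listing one GPU; it errors out if no GPU was assigned
!nvidia-smi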

Install PyTorch/Fastai

The following commands detect the accelerator available in Colab and install the matching torch_nightly build, together with fastai and its dependencies.

# http://pytorch.org/
from os.path import exists
from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag

# Build the wheel platform tag for the current Python (e.g. cp36-cp36m)
platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())

# Detect the CUDA runtime version (e.g. cu92), falling back to CPU if no GPU
cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'

# Install the nightly PyTorch build that matches the detected accelerator
!pip install torch_nightly -f https://download.pytorch.org/whl/nightly/{accelerator}/torch_nightly.html

import torch
!pip install torchvision
!pip install fastai

# Pin Pillow (and install the image shim) to work around a Colab PIL issue at the time of writing
!pip install Pillow==4.1.1
!pip install image

Once that’s done, we check the PyTorch version to see if it was installed correctly.

print(torch.__version__)
print(torch.cuda.is_available())
print(torch.backends.cudnn.enabled)
1.0.0.dev20181107
True
True

Import Libraries

import fastai
from fastai import *
from fastai.vision import *

# Let cuDNN auto-tune convolution algorithms for a small speed boost
torch.backends.cudnn.benchmark = True

Load Data

I have prepared a dataset that contains 330 images with the label “Choripan” (Argentine hot dog) downloaded from Google Images (using this library) and another 330 images with the label “Not Choripan” taken from this Kaggle dataset. They are split roughly 60/40, with 200 images per class in the train set and 130 per class in the valid set. This is the folder structure:

data
|--- train
|    |--- Choripan
|    |    |--- *.jpg
|    |--- Not Choripan
|         |--- *.jpg
|--- valid
     |--- Choripan
     |    |--- *.jpg
     |--- Not Choripan
          |--- *.jpg

Download the dataset and unzip it into a new data directory:

import os, zipfile

filename = 'choripan-not-choripan.zip'  # Name of the dataset archive
base_url = 'https://s3.amazonaws.com/nicolas-dataset'
url = os.path.join(base_url, filename)
output_file = os.path.join(os.getcwd(), 'data', filename)

# Download the archive into data/ (skipped if it already exists)
!wget -nc $url -P 'data'

# Extract everything into data/ and remove the archive afterwards
with zipfile.ZipFile(output_file, "r") as zip_ref:
    zip_ref.extractall('data')

!rm $output_file

Let’s set a path to our dataset and take a look at what’s inside:

path = Path('data')
path.ls()
[PosixPath('data/train'), PosixPath('data/valid')]
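
To verify the 200/130 split described above, you can count the files per class. This is a small sanity check, assuming the folder layout shown earlier:

# Count images per class to confirm the 200/130 train/valid split
for ds in ['train', 'valid']:
    for cls in ['Choripan', 'Not Choripan']:
        print(ds, cls, len(list((path/ds/cls).glob('*.jpg'))))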

Fastai provides helper functions to create a DataBunch object that you can directly use for training a classifier. There are a number of ways to create an ImageDataBunch; a common one is Imagenet-style folders with ImageDataBunch.from_folder. Two details matter on Colab: we set num_workers = 0 to work around a shared-memory issue with Colab’s data-loader workers, and size = 224 to match the input size the pretrained model expects.

tfms = get_transforms(do_flip=True)
data = ImageDataBunch.from_folder(
    path,
    ds_tfms=tfms,
    size=224,
    num_workers=0
)

Here the datasets will be automatically created from the Imagenet-style folder structure. The parameters specified:

  • ds_tfms: the transforms to apply to the images
  • size: the target size of our pictures in pixels
  • num_workers: 0, to avoid Colab’s shared-memory issue
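
Since we will fine-tune a pretrained model below, it is also common in fastai v1 to normalize the inputs with ImageNet statistics. A variant of the call above (optional here, but it usually helps transfer learning):

# Same DataBunch, but normalized with ImageNet statistics,
# matching what the pretrained ResNet backbone saw during its own training
data = ImageDataBunch.from_folder(
    path,
    ds_tfms=tfms,
    size=224,
    num_workers=0
).normalize(imagenet_stats)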

We can also take a look at the images inside a batch with the following, where rows sets the number of rows (and columns) of images to display.

data.show_batch(rows=3, figsize=(8,8))

You can read about the other approaches in detail in the Fast.ai documentation, along with much else. Finally, take a look at the data classes:

print(data.classes)
len(data.classes),data.c
['Choripan', 'Not Choripan']
(2, 2)

Train the Model

Transfer learning is a technique where you take a model trained on a very large dataset (usually ImageNet in computer vision) and adapt it to your own dataset. The idea is that the model has learned to recognize many features from all of that data, and you benefit from this knowledge, especially if your dataset is small, compared to starting from a randomly initialized model. It has been shown in this article, across a wide range of tasks, that transfer learning nearly always gives better results.

In practice, you need to change the last part of the model to adapt it to your own number of classes. Most convolutional models end with a few linear layers (a part we will call the head). The last convolutional layer will have analyzed features in the image that went through the model, and the job of the head is to convert those into predictions for each of our classes. In transfer learning, we keep all the convolutional layers (called the body or the backbone of the model) with their weights pretrained on ImageNet, but define a new head initialized randomly.

Then we will train the model we obtain in two phases: first we freeze the body weights and only train the head (to convert those analyzed features into predictions for our own data), then we unfreeze the layers of the backbone (gradually if necessary) and fine-tune the whole model (possibly using differential learning rates).

The create_cnn factory method automatically builds a pretrained model from a given architecture with a custom head suitable for your data. With fit_one_cycle we train only the last layers (the new head) of our neural network.

learner = create_cnn(data, models.resnet34, metrics=[accuracy])
learner.fit_one_cycle(1, 1e-3)
Total time: 00:52
epoch  train_loss  valid_loss  accuracy
1      0.740443    0.613328    0.617761  (00:52)
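
If you are curious what create_cnn built, in fastai v1 the returned model is a Sequential of the pretrained body and the new head, so you can print the head to see the randomly initialized layers we are training in this phase:

# learner.model[0] is the frozen pretrained ResNet body;
# learner.model[1] is the new head ending in a 2-class output layer
print(learner.model[1])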

And save the model for later

learner.save('stage1')

You can read more about what each parameter does in the documentation, and have a look at Lesson 1 of the 2019 Fast.ai course, where they explain what to look at to judge whether the model is performing correctly.

Unfreezing, fine-tuning, and learning rates

At this point only the last layers have been trained, so now we should train all the layers of the network together. We’ll do that by unfreezing our model with learner.unfreeze() and continuing the training with

learner.fit_one_cycle(1)

Now we’ll load the model we saved in the last step and find a good lr, or learning rate. The LR finder runs a short mock training over a range of learning rates, so this may take a minute or two.

learner.load('stage1')
learner.lr_find()

And we plot the results with the following command

learner.recorder.plot()

You can see that once we reach lr = 1e-2 the loss starts to increase. We will choose a range between 1e-6 and 1e-3 and train for another 5 epochs.

learner.unfreeze()
learner.fit_one_cycle(5, max_lr=slice(1e-6,1e-3))

Looking at the results below, the valid_loss is much lower than the train_loss, which may indicate that we should enlarge our validation set, but overall this is a great result and the model may work well with real-world data.

Total time: 04:36
epoch  train_loss  valid_loss  accuracy
1      0.336365    0.456992    0.779923  (00:48)
2      0.295529    0.259180    0.888031  (00:58)
3      0.234426    0.144196    0.934363  (01:00)
4      0.193410    0.113248    0.957529  (00:50)
5      0.164487    0.109454    0.965251  (00:59)
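
To dig into the remaining errors, fastai v1 ships a ClassificationInterpretation helper. This isn’t shown in the workflow above, but it is the standard Lesson 1 follow-up:

# Plot the confusion matrix and the images the model got most wrong
interp = ClassificationInterpretation.from_learner(learner)
interp.plot_confusion_matrix(figsize=(6,6))
interp.plot_top_losses(9, figsize=(10,10))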

Run Inference on a Single Image

Once you have finished training your model, you can use the following to predict a single image:

img = open_image(learner.data.train_ds.x[5])
learner.predict(img)
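
In fastai v1, predict returns the predicted category, its index, and the per-class probabilities, so you can unpack the result like this (index 5 above is just an arbitrary training image):

# predict returns (Category, class index, probabilities tensor)
pred_class, pred_idx, probs = learner.predict(img)
print(pred_class, probs[pred_idx])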

(Optional) Re-train with ResNet50

bs = 64  # Batch size; halved below because ResNet50 at size 299 needs more memory
data = ImageDataBunch.from_folder(
    path,
    ds_tfms=get_transforms(),
    size=299,
    bs=bs//2,
    num_workers=0
).normalize(imagenet_stats)

learn = create_cnn(data, models.resnet50, metrics=[accuracy])
learn.fit_one_cycle(8)
learn.save('stage1-resnet50')
learn.unfreeze()
learn.fit_one_cycle(3, max_lr=slice(1e-6, 1e-3))
# If unfreezing made results worse, roll back to the saved checkpoint:
# learn.load('stage1-resnet50')
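
The intro promised a follow-up on serving this model in production. Whichever architecture you settle on, save the fine-tuned weights first; learn.save writes a .pth checkpoint to a models/ folder next to your data, which you can then download from Colab. The checkpoint name here ('stage2-resnet50') is just a suggestion:

# Persist the fine-tuned weights (written to <path>/models/stage2-resnet50.pth)
learn.save('stage2-resnet50')

# Download the checkpoint from Colab for use in your serving environment
from google.colab import files
files.download(str(path/'models'/'stage2-resnet50.pth'))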