Faster AI: Lesson 1 — TL;DR version of Fast.ai Part 1

Kshitiz Rimal
Deep Learning Journal
7 min read · Aug 24, 2017

--

If you haven’t read Lesson 0 of this blog series, please go through that first. There I covered why I am writing this series, what to expect from it, and a brief overview of the Fast.ai Part 1 course.

This lesson is divided into 3 parts:

  1. Software and Hardware Setup [Time: 10:51]
  2. Model we are going to build and Data it uses [Time: 52:57]
  3. Pretrained Model and concept of Finetuning [Time: 1:09:23]

1. Software and Hardware Setup

In order to run any deep learning application, you need two things: the programming language and frameworks the application is written in, and a supporting hardware environment.

Let's start with the software part first. In this course you will need to install the Python language on your system, along with NumPy, a Python package for matrix computations, and Keras, a deep learning library for Python. To install them, you can open the video at the time given above. Jeremy recommends installing them via the Anaconda distribution, which will automatically install libraries like NumPy as well as Jupyter Notebook, a tool this course uses extensively for teaching. It is basically an interactive notebook for Python, where you can document your code and execute the actual code in the same place, then save the results for future reference.

Besides these Python libraries and software packages, this course also uses two separate Python scripts, utils.py and vgg16.py. They are provided in the GitHub repo of the course. For files such as the pre-trained model files and weights, use this link. I suggest downloading all the files in that repo, including the Excel files, as Jeremy uses them to explain most of the ideas of deep learning.

Here is a diagram Jeremy drew while explaining how all these packages and software interact with each other.

It basically shows that the separate VGG script leverages the Keras framework underneath. Keras itself uses Theano, another deep learning framework, as its backend for computations, and Theano in turn leverages the power of Graphics Processing Units (GPUs) through CUDA, NVIDIA's GPU computing platform. At the moment, only NVIDIA cards with CUDA capability are used for deep learning purposes.
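On that note, Keras picks its backend from a small JSON config file in your home directory, and for this course it needs to point at Theano. A typical ~/.keras/keras.json for the Keras 1.x versions used in the course looks roughly like this (a sketch; your epsilon and floatx defaults may differ):

```json
{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}
```

The "th" dimension ordering tells Keras to expect Theano-style image tensors (channels first), which is what the course's vgg16.py script assumes.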

With the software installed and all the necessary files downloaded, let's move on to the hardware part.

It is recommended that you use a system with GPU access for deep learning. That said, it's not strictly a necessity if you have a small dataset, for example a small number of images. You can train the model on a CPU as well, but it will take more time depending on your CPU and RAM.

In this course Jeremy uses AWS P2 instances with GPU access to train the model. There is a separate video in the playlist that teaches you how to set up AWS instances for this course.

Personally, I tested the accuracy of the model using only the CPU of my laptop, but I did so with a small sample of the data. If you cannot get hold of AWS instances, there are many alternatives on the internet, but the one I personally find easy and effective is Floydhub's cloud GPU. Another great aspect of Floydhub is that it's free: up to 2 hours of GPU time without having to enter your credit card info. So, for the purpose of learning on a GPU and following along with this course, you can use it and get the best results from your model with full GPU power.

Start with its Quick Start Guide:
http://docs.floydhub.com/getstarted/quick_start/

Then move on to the section on setting up Jupyter Notebook on it:
http://docs.floydhub.com/getstarted/quick_start_jupyter/

Floydhub supports many environments; the one we need for this course is Keras with Theano as the backend. You can check all the available environments and how to use them here:
http://docs.floydhub.com/guides/environments/

Now that the hardware and software are covered, let's move on to the next section.

2. Model we are going to build and Data it uses

For this lesson and the lessons to come, we will be using Kaggle's Dogs vs. Cats dataset to classify images. You can simply download the data from the website, open the Lesson 1.ipynb file in Jupyter Notebook, and follow along with the course to train a model that classifies dogs and cats. Make sure the data is in the same directory as the notebook files.

If you are going to use Floydhub, you won't need to download the dataset at all, as many Floydhub users have already uploaded it; you can simply mount it remotely in your project.

Here is one with the same dataset on Floydhub:
https://www.floydhub.com/rarce/datasets/dogsvscats/1

You can follow this guide to learn how to mount a remote dataset in your project:
http://docs.floydhub.com/guides/data/mounting_data/

One thing you need to make sure of is that the train and valid directories of your dataset each contain two subfolders, cats and dogs, with the respective images in them. This is how Keras determines the categories for the model.
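To make the expected layout concrete, here is a small sketch that builds the same folder structure (the `root` path is illustrative; in practice you would point it at wherever you unpacked the Kaggle data):

```python
import os
import tempfile

# Stand-in for your data directory; use your real dataset path instead.
root = tempfile.mkdtemp()

# Keras infers one category per subfolder, so both train/ and valid/
# need a cats/ and a dogs/ folder holding the respective images.
for split in ("train", "valid"):
    for category in ("cats", "dogs"):
        os.makedirs(os.path.join(root, split, category))

print(sorted(os.listdir(os.path.join(root, "train"))))  # ['cats', 'dogs']
```

Listing the train directory shows exactly the two category names Keras will pick up.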

Now the software is ready, the hardware is ready, and the data is ready. Let's move on to the real part.

3. Pre-trained Model and concept of Fine tuning

Generally in deep learning, we have some data and a model that uses an algorithm to learn from that data and produce the result we want.

Pre-trained models are models that have already been trained on millions of examples and are much more robust as a result. When these models are trained again on a little of our own custom data, they give us the more accurate results we want. There are many pre-trained models available on the internet, but we need to select one appropriate to our needs.

In our case, we have data of cats and dogs. To improve the accuracy of our model, we use a pre-trained model called VGG16.

VGG16 is a convolutional neural network model trained on the ImageNet dataset. In general, convolutional neural networks are used when dealing with images.

This model is trained on the ImageNet dataset, which contains over a million images of not just cats and dogs but many other natural objects as well. By building on it, we improve the outcome we would normally get using just our own small dataset.

In short, with this approach our model will recognize images of cats and dogs better than it would using only our own data.

But the problem with this particular pre-trained model is that it outputs a classification over 1,000 categories, which is what ImageNet has and what the model was originally built for.

To make this model classify just two categories, cats and dogs, we use the concept of fine-tuning. Fine-tuning trains the pre-trained model on our new data and, instead of producing 1,000 categories, uses the categories of our own data and classifies between just those. In our case we have 2 categories, so the output will be a prediction of whether the given image is a cat or a dog. And this whole process can be done in seven lines of code.
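The code image from the original post isn't reproduced here, but the seven lines in the course's lesson1 notebook look roughly like this. This is a sketch, not standalone runnable code: it assumes vgg16.py and utils.py from the course repo sit next to the notebook, and that `path` points at your dogs-vs-cats data.

```
from vgg16 import Vgg16  # the course-provided script, not a pip package

path = "data/dogscats/"  # adjust to wherever you put the Kaggle data
batch_size = 64
vgg = Vgg16()  # loads the pre-trained VGG16 architecture and weights
batches = vgg.get_batches(path + 'train', batch_size=batch_size)
val_batches = vgg.get_batches(path + 'valid', batch_size=batch_size * 2)
vgg.finetune(batches)  # swap the 1,000-way ImageNet output for our 2 classes
vgg.fit(batches, val_batches, nb_epoch=1)  # one training pass over the data
```

The next few paragraphs walk through what each of these steps is doing.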

Here, what it's doing is defining the batch size, which means how many images we want to feed to the model in one go, in this case 64.

Generally in deep learning we don't feed images to the model one by one; we do so in a group, or batch. That way the GPU (Graphics Processing Unit) is fully utilized and the model trains faster.
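As a rough illustration of what batching means (plain NumPy, and the dataset size here is made up), splitting 1,000 items into batches of 64 looks like this:

```python
import numpy as np

# Stand-in for 1,000 images; we use their indices to keep the example light.
images = np.arange(1000)
batch_size = 64

# Walk over the dataset one batch at a time instead of one image at a time.
batches = [images[i:i + batch_size] for i in range(0, len(images), batch_size)]

print(len(batches))     # 16 -> 15 full batches plus one partial batch
print(len(batches[0]))  # 64 images processed per step
print(len(batches[-1])) # 40 images left over in the final batch
```

Each batch is handed to the GPU as a single block of work, which is what keeps it busy.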

Then it initializes the model as a VGG model, and we split our training set and validation set into batches.

We need some of our dataset for actually training the model, and some for testing how accurately it has trained.

Here, the training set is what the model is trained on, and the validation set is used to check how well the model is learning from that training data. Then we fine-tune the model according to our dataset and finally fit it.
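If you are splitting the Kaggle download yourself, a common approach is to shuffle the filenames and hold some out for validation. A minimal sketch (the filenames and the 80/20 split are illustrative; the Kaggle files really are named cat.0.jpg, dog.0.jpg, and so on):

```python
import random

# Hypothetical list of image filenames to divide between train/ and valid/.
filenames = [f"cat.{i}.jpg" for i in range(100)] + \
            [f"dog.{i}.jpg" for i in range(100)]

random.seed(0)        # fixed seed so the split is reproducible
random.shuffle(filenames)

# Hold out 20% for validation; the model never trains on these images.
split = int(len(filenames) * 0.8)
train, valid = filenames[:split], filenames[split:]

print(len(train), len(valid))  # 160 40
```

The shuffle matters: without it, the validation set could end up containing only one category, which would make the accuracy numbers meaningless.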

With only these 7 lines of code, our model can classify whether an image is a cat or a dog with above 90% accuracy.

You can learn a great deal by just reading the Lesson notes as well.

You can also check the video timeline and jump to any particular topic on that video.

In our next lesson we will dig deeper into these ideas and experiment with them even more as we move along. See you there.

Next Post : Lesson 2


Kshitiz Rimal
Deep Learning Journal

AI Developer, Google Developers Expert (GDE) on ML, Intel AI Student Ambassador, Co-founder @ AI for Development: ainepal.org, City AI Ambassador: Kathmandu