Introduction To Deep Learning 🤖— Chapter 2

satyabrata pal
ML and Automation
10 min read · Oct 10, 2020

Building A Deep Learning Project From Idea To Production

About The Work Related To This Course

This work is based on the draft of fastbook (Deep Learning for Coders with fastai and PyTorch) and the fastai course-v4.

👉Read chapter 1 here.

📜Unwritten Rules Of The Practice Of Deep Learning

There are some unwritten rules in the practice of deep learning. These rules are important because they will guide you in designing good systems to solve your problems, so it is worth understanding what they are.

👉Never Underestimate the Constraints and Overestimate the Capabilities of Deep Learning —
The concepts, code and techniques you learn here will work out-of-the-box for many problems, but that doesn’t mean you can get away with the same techniques everywhere.

👉Never Overestimate the Constraints and Underestimate the Capabilities of Deep Learning —
If you overestimate constraints such as data availability or the difficulty of the problem statement, you may end up never applying deep learning to your problem at all. The key is to try it out.

👉Always Keep An Open Mind —
Keeping an open mind helps you explore new areas of your problem statement where you can apply deep learning. It might be a tiny part of the problem, but if you can identify such an area, you can design a process that uses less data and a less complex model than expected to solve that part of the problem.

How To Start Your Project, What To Start & How To Get Data

We are getting close to our first example, but before jumping into the code let us spare some time to think about how to find projects to work on.

I hear these a lot.
👉“Which project should I pick so that I can practice machine learning?”
👉 “My project/product doesn’t have the scope to implement machine learning.”
👉 “I have an idea to implement machine learning in my product/work, but I don’t have much data available.”

This is what stops many of us from actually applying our knowledge to real-world problems: the problems we face at work, in life, or in our hobbies.

Many of us wait for the perfect problem statement to come up in our product, or for the perfect data to be made available to us, and we waste months before we start implementing machine learning on those problems.

The truth is that we have to find our own projects and problems to work on. We need to gather our own data, even if it’s less than optimal.

The following guidelines will help you seek out problems where you can apply machine learning →
👉 The goal is not to find the perfect dataset or problem but to just get started and iterate from there.
👉Iterate end-to-end on your project. Don’t try to build the perfect model or algorithm from the start, and don’t waste months chasing the perfect model, the perfect UI or the perfect system. Prototype, fail early and then refactor.
👉 Start a project in a domain you are already in.
👉For example, you might be a software quality analyst. You probably have lots of test cases written by you or someone from your team. Try to identify a problem there, since you already have data for that domain.
👉You might be working as a developer. Try to identify a problem statement in your domain.
👉You probably have some hobby. This can also be a good place to look for projects.

The Drivetrain 🚂 Approach

In 2012 Jeremy Howard, along with Margit Zwemer and Mike Loukides, introduced a method called the Drivetrain Approach.

The drivetrain approach can be used to design Machine Learning products.

So, what is the Drivetrain Approach? It’s better if we visualize it.

The Drivetrain Approach for machine learning models

The following are the components of this approach →
👉Objective — What outcome are we trying to achieve?
For example, I want to build a system which can put my daughter’s photographs and my photographs in separate folders.

👉What inputs we can control — In my case, that would be “differentiating my photographs from my daughter’s”.

👉What data we can collect — In my case this can be the various WhatsApp images that I have, or my Google Photos.

👉Models — This is the final step. What data we have and what data we can collect determine what models we can build and how they will perform. Finally, these models can be combined to give the final outcome which we need.

So you see, in practice machine learning is more than magic. It takes lots of creativity, perseverance and engineering.

Now, in the upcoming sections we will turn these principles into code.

Let’s start🚴

First we will do some common setup.

I use Kaggle kernels for my deep learning projects. One issue with Kaggle kernels is that whenever you install new packages you lose the ability to grab the GPU on that kernel.

Due to this, I had to do the following workaround: install the CUDA 10.1 builds of PyTorch and torchvision, and then install the additional packages.

!pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
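Along with that, the rest of the code in this chapter assumes the usual fastai imports are in place. Here is a minimal sketch of what sits at the top of my kernel (the star import is the convention fastai’s own docs use, and it brings in Path, get_image_files, DataBlock and friends):

from fastai.vision.all import *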

Data collection📓

For many of your projects, you can find data online. You just need to reach that data with the proper tools, which can help you download it.

The project that we are going to do now is a “Waffle” vs “Ice-cream” detector. This system will help us detect which food item is a waffle and which one is an ice-cream.

We will use Google Images to download our data.

I have outlined the steps to download data from Google Images in my earlier post, but if you want to get started quickly, I have these images available as a dataset on Kaggle.

If you are using a Kaggle kernel, then at the top right corner of your kernel you will find an “Add Data” button. Click on it and paste this link to add the waffles vs ice-cream data to your kernel.

Moving on🚴‍♂️

You can also see what types of files are there in your dataset by using the code below👇

In Kaggle, the input directory where your data is stored is “../input”.

path = Path("../input")
files = get_image_files(path)
files

Bam!! You get a list of the files in your directory and, as is evident, we have image files, which is what we want.

Did you notice something different in the output above? The list that we got also displays the number of items in it. This is because this “list” is not a regular Python list but a special fastai class called L, which gives the regular Python list some superpowers.

With the class L you can do something like this👇

files[1,2]

You see how you can put comma-separated indices inside an L list and get back the items at the second and third positions?
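L works the same way on any collection, not just lists of files. A tiny sketch to play with (the numbers here are made up purely for demonstration):

nums = L(range(10))  # an L wrapping the numbers 0..9
nums[2, 5, 7]        # returns the items at indices 2, 5 and 7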

Okay, let’s move on.

We need something to tell our deep learning model that a particular “name” attached to an image file means it’s an image of an ice-cream, and some other name means an image is that of a waffle.

How do I do that?

Let me grab a sample from the result of get_image_files().

files[1]

Notice that the parent folder of this image file has the name “ice-cream”. That’s our holy grail. This is what tells us which image is which, and all we have to do is grab that parent folder’s name and tell fastai to use it as the label for our model.

The parent_label function in fastai does just that.

parent_label(files[1])

See how easy that was. If everything is in place, this returns “ice-cream”, the name of the parent folder.

One more thing to remember is that we need to reserve some of our data for testing once our model is trained.

There are many functions in fastai which help us split data into training and validation sets. One of them is RandomSplitter(valid_pct=0.2, seed=42). The valid_pct here is the percentage of data we want to keep aside as the validation set; the rest is kept for training. For now, forget about seed; we will come to that in later chapters.
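If you are curious, you can call the splitter on your files directly to peek at what it produces. This is just a look under the hood; the DataBlock we build below will call it for us:

splitter = RandomSplitter(valid_pct=0.2, seed=42)
train_idxs, valid_idxs = splitter(files)  # two lists of indices: ~80% train, ~20% validation
len(train_idxs), len(valid_idxs)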

One more thing to remember is that neural networks can’t work with batches of images of different sizes. This is taken care of by Resize(128). It resizes all the images into a square of the size given by you. Here I have used 128×128.
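You can verify for yourself that the raw downloaded images come in all sorts of sizes. A quick sanity-check sketch (not a required step):

imgs = [PILImage.create(f) for f in files[:3]]
[img.size for img in imgs]  # most likely three different (width, height) pairs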

Putting It All Together

Now, we don’t have to do all the above steps by hand. We can put them in a bag and give it to fastai to work on. This “bag” is known as a DataBlock.

At this point it’s too early to deep dive into what a DataBlock does, so the “bag” analogy is better at this moment.

desserts = DataBlock(
    blocks=(ImageBlock, CategoryBlock),               # inputs are images, targets are categories
    get_items=get_image_files,                        # how to collect the items
    splitter=RandomSplitter(valid_pct=0.2, seed=42),  # how to split into train/validation
    get_y=parent_label,                               # how to label each item (parent folder name)
    item_tfms=Resize(128))                            # resize every image to 128x128
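By the way, if a DataBlock ever misbehaves, fastai ships a handy debugging aid: summary dry-runs the whole pipeline on one item and prints every step, so you can see exactly where things break. A quick sketch:

desserts.summary(path)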

How To Make Use Of The Datablock?🤔

The DataBlock which we created above is just a bag. It’s more of a template which tells fastai what steps to follow.

These steps come to life when we plug in something known as “DataLoaders”. The DataLoaders follow the template (the DataBlock) to load the data, chop it into little collections known as batches, and feed them to the GPU.

Just call dataloaders on the DataBlock and give it the path where you have stored your data.

dataloader = desserts.dataloaders(path)

Remember that the DataLoaders chops the data into batches. These batches of data can be viewed like this👇

dataloader.valid.show_batch(max_n=4, nrows=1)
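If you want to see what a batch actually looks like to the GPU, pull one out and inspect its shape. A small sketch (the batch size of 64 is fastai’s default, so that part is an assumption on my side):

xb, yb = dataloader.one_batch()
xb.shape, yb.shape  # e.g. torch.Size([64, 3, 128, 128]) and torch.Size([64])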

Did you notice how all the images are of the same size?

Do you remember the item_tfms=Resize(128) which we had earlier put into the DataBlock? That takes care of resizing all the images into 128×128 squares.

Why a square shape? Well, it works well enough.

There is one more important thing which we have not done yet.

That’s data augmentation. Neural networks need lots of example images to understand how a particular object would look from different angles, under different lighting conditions and so on.

It’s not possible for us to have a dataset with images of every possible condition, so we improvise: we generate modified variations of the images we do have. This is what data augmentation means.

Fastai provides a set of data augmentations whereby images are randomly squished, rotated, skewed etc. to create variations of the example images. All of these can be applied with aug_transforms().

Another key thing is that we want to crop a specific part of each image so that we have examples cropped from different angles and positions. This is done by RandomResizedCrop, which crops a random portion of the image each time.

desserts = desserts.new(item_tfms=RandomResizedCrop(128, min_scale=0.3), batch_tfms=aug_transforms())

Did you notice the min_scale keyword inside RandomResizedCrop? This tells fastai to select at least 30% of the image area for each crop. Another thing to note is the batch_tfms keyword. This tells fastai to apply aug_transforms to an entire batch at a time rather than one image at a time.

With these in place we define our dataloader once again.

dataloader = desserts.dataloaders(path)

dataloader.valid.show_batch(max_n=4, nrows=1)

Look how the images have changed now because of the augmentations applied to them.
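A neat trick while experimenting: pass unique=True to show_batch and fastai repeats the same image, so you can see the different random augmentations applied to it side by side:

dataloader.train.show_batch(max_n=4, nrows=1, unique=True)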

Time To Train🤩

We have everything ready now to train our model.

The following two lines help us do that. Like in the previous chapter, we will use fine_tune since we are using a pretrained model here.

learn = cnn_learner(dataloader, resnet18, metrics=error_rate)  # a pretrained resnet18, with error rate as our metric
learn.fine_tune(4)  # fine-tune the pretrained weights for 4 epochs

We have managed to reduce the error rate, but around epoch 3 the error rate actually increased. So maybe we should have trained the model for only 3 epochs. Also, our model could do much better.

We can check where our model got confused with the following code.

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()

As we can see from the confusion matrix, 6 images of waffles were predicted as ice-cream and 6 ice-cream images were predicted as waffles.

We can also see which of the images our model was least confident about.

interp.plot_top_losses(4, nrows=4)

The “probability” shown is the confidence the model assigned to its own prediction. Since these are the highest-loss images, a high probability here means the model was confidently wrong.

If you ask me about the very first image, my opinion is that it’s a “waffle ice-cream”. Even a human would not be able to classify it into either of the categories.

Such are the situations where you would have to take the advice of a domain expert to decide where the first image belongs (ice-cream or waffle).

Suppose we decide that such an image should be labelled as a waffle; then we will have to re-label it with its correct label.

There are many ways to do this but is there an easier way?

That’s exactly what we are going to explore in Chapter 3.

By the way for the prequel to this story read Chapter 1

How To Show Your Support To The Publication🤗

Creating content requires a lot of research, planning, writing and rewriting. This is important because I want to deliver practical content to you without any fluff.

If you like my content and want to support me, then the following are the ways to show your support →

  • If you like my work then click on this link to Buy me a coffee.
  • Buy my deep learning course on Udemy. Just click on the course link in the show notes and get an awesome discount on my deep learning course.
  • Subscribe to my publication and share it across so that more people can discover it.
  • Subscribe to and share my podcast “SimpleAI” on Google Podcasts or any other podcast player of your choice. Don’t forget to give it 5 stars.
  • Subscribe to my newsletter.
