Create A Neural Network in 11 minutes

A Step-By-Step Guide Using Fastai

Published in

ML and Automation

11 min readNov 11, 2019

An average Indian produces around 450gms of trash on an average per day. The real problem is not the amount of trash produced. The problem is waste segregation.

10–15mins of my daily time goes into thinking about separating out the trash into different bins. This time could otherwise be spent in a better way, like taking a quick 15mins powernap :) .

Wouldn’t it be nice if someone else could do this for us everyday and in a much more efficient way than we currently do?

Well! it seems we can teach a machine to do that. Luckily a dataset is out there at Kaggle which has lots of examples about different kinds of trash types. Shashank Sekar created this dataset and it is available here at kaggle.

Let’s get started

Don’t get overwhelmed by the code below. These lines of code make sure that external changes gets auto loaded in this notebook.

More about this command can be read here.

%reload_ext autoreload
%autoreload 2
%matplotlib inline

The next lines of code are the customary rituals to search for the “fastai” library on your machine and then bring back all the functions written in modules named as “vision” and “metric”.

The details about “importing” in python is explained here.

Oh! Did I mention that you would be needing to install the fastai V1 library which sits on top of Pytorch 1.0 .

The fastai library provides many useful functions that enables you to build neural networks and train our models

from pathlib import Path
from fastai.vision import *
from fastai.metrics import error_rate

The Need For Power

Before we proceed any further, let me tell you that need raw GPU power because this deep learning stuff is power hungry and without a GPU you can spend an eternity waiting for the algorithm training to get finished.

RAW Power Is Required For Faster Training

If you don’t have a GPU then fastai has a very well documented steps here or here to help you out with this. You can even run a version of this post which I have hosted here at Kaggle.

Ok! you got your GPU, you got the link to the dataset. Now you need to set the “batch size”. Wait, What? what’s a batch size?

Well it goes this way. To start training a deep neural network you need to feed it some images as examples of the problem that you are trying to solve.

While doing so you need to be cautious that you do not feed too many images to you deep neural network since all those images will eat up the RAM in your computer and then it will crash.

To prevent this you can feed the images in small batches in regular intervals and the size of these batches is “batch Size”.

Fastai gives you a default batch size of 64, which works in most of the scenarios.

How To See The Data

To see the data first you need to fetch it from the site where it is hosted. If you are trying out this notebook in your local machine or in a cloud server other than Kaggle then uncomment the following lines by removing the `#` in the following line.

#path = untar_data('https://www.kaggle.com/techsash/waste-classification-data', <destination>)

When you uncomment the previous code and run it, the untar_data() method will download the dataset from the url given by you and then it will unzip the file to the destination.

The variable `path` will give you the representation of the file system where your data is downloaded.

You can learn more about the untar_data() method here.

Replace the “<destination>” with the path where you want your downloaded data to be saved.

If you happen to run this notebook on kaggle then you can download the data by following the steps which listed in this article

After this you can run the code below to wrap the “../input” path in an inbuilt method in python known as Path() which gives you the representation of the file-system where your data is saved.

Curious souls can refer to the python documentation here to satisfy their craving to know more about the Path() method.

path = Path('../input/waste-classification-data/dataset/DATASET');

If you are using kaggle then the path to this dataset would be something like the path inside the Path() method. If you are using your local machine or a GPU server then replace the path inside the Path() method with the root folder of the dataset.

The following two variables pathTest & pathTrain contains the train & test subfolders which are inside this particular data set. When you do pathTest.ls(), it displays the folders and subfolders in that path.

Curious souls can refer the python documentation here to please their craving to know more about the `Path()` method.

pathTest = path/'TEST'
pathTrain = path/'TRAIN'
pathTest.ls()------
Output
------
[PosixPath('../input/waste-classification-data/dataset/DATASET/TEST/O'), PosixPath('../input/waste-classification-data/dataset/DATASET/TEST/R')]

In the output above You can see that their are subfolders named ‘O’ & ‘R’ inside the folder “TEST”. What is inside these folders? The following code can give you the answer.

numberOfFiles = (pathTest/'O').ls()
numberOfFiles[:5]------
Output
------
[PosixPath('../input/waste-classification-data/dataset/DATASET/TEST/O/O_13022.jpg'),
 PosixPath('../input/waste-classification-data/dataset/DATASET/TEST/O/O_12670.jpg'),
 PosixPath('../input/waste-classification-data/dataset/DATASET/TEST/O/O_13838.jpg'),
 PosixPath('../input/waste-classification-data/dataset/DATASET/TEST/O/O_13346.jpg'),
 PosixPath('../input/waste-classification-data/dataset/DATASET/TEST/O/O_13446.jpg')]

(pathTest/’O’).ls() means “please give me the files or folders inside the given path” and the numberOfFiles[:5] means “please give me the first 5 items in the list” .

The [:5] is known as “slicing” and more information about this concept can be read in this article at pythonforbeginners.com.

The output in the previous section shows that image files are inside these sub-folders and here “O” means”organic waste” and “R” means “Recyclable waste”.

How To Get The Data

The names of the folders “O” and “R” are important because these are what you may call as “labels”. Looking at these labels you can know that which folders contains which type of waste’s images.

A neural network operates in a similar way, by looking at these labels it can learn to recognize which image belongs to which label.

You need to extract these “label” information and then make it usable for the neural networks that you would be creating shorty.

Besides extracting the label information, you also need to do the following things →

Split the data into training and validation sets — A machine learning algorithm needs a training set on which it can learn how to categorize data and then a validation set on which you can test whether the trained algorithm is behaving as expected.
Transform the data — If you train a machine learning algorithm only on similar looking images then it will fail to recognize images which deviate from the original training examples which you had shown to it during the training cycle. I have also explained this in one of my previous article.
Convert the data into an “ImageDataBunch” — An ÏmageDataBunch is a python object which collates your data, does the above two things then normalizes them and then drops into onto the GPU. A detailed explanation is here in the fastai’s official documentation.

All these things must seem overwhelming to you but fastai provides inbuilt methods for all these things.

Look the code below which does all the things listed above in a single line.

data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(), valid_pct=0.2, size=224).normalize(imagenet_stats)

In the code above, does the following things →

ImageDataBunch .from_folder() method is used to Tell fastai the path to the data.
get_transforms() flips, rotates, crops data.
` is the percentage of training data which is reserved for validation.
size is the size to which the image files are converted before feeding to he neural network
Finally the data is normalized using normalize()

You can view your batch of images using the following code.

data.show_batch(rows=3, figsize=(7,6))

.classes let’s you to view the classes i.e. the labels in your dataset.
.c let’s you view the number of classes.

print(data.classes)
len(data.classes),data.c------
Output
------
['O', 'R']
(2, 2)

It’s Training Time

Let’s see what you have got till now →

You know where the file are.
You now have your databunch
You know how many classes are there in your data

Now, you have got everything to train your neural network.

Training a neural network involves the following steps →

Select an architecture for the neural network
Write code for the neural network
Select a metric for measuring the accuracy of the algorithm
Feed the data to the algorithm
Run the neural network through each image multiple times.

That looks like a ton of coding, but fastai let’s you do all these things using a single line of code as mentioned below.

learn = cnn_learner(data, models.resnet34, metrics=error_rate)

The cnn_learner() takes in the following as input →

data — This is the databunch created in the previous sections
models.resnet34 — This is a pretrained model. A pretrained model is a neural network which was already trained on similar data. Using pre-trained model is known as transfer learning and it greatly reduces the training time. Here is good article at towardsdatascience.com which gives an introduction to transfer learning using fastai.
metrics — You will need to assess the accuracy of the neural network after the training is done. The “metric” will help you with that.

Once the neural network is created, the fit_one_cycle() runs the network through the sample images 4 times. This helps the network to learn the details about the images and tries to distinguish between different categories of those images. Which in this case are ‘O’ and ‘R’.

learn.fit_one_cycle(4)

In the output above, you can see some numbers . As of now don’t pull your hair out while thinking what those numbers are. It’s enough to know that this is an error metric and generally these numbers appear in decreasing order with each cycle.

Finally save your model using the save() method.

learn.model_dir = "/kernel/output"
learn.save('stage-1', return_path=True)

It’s Result Time

The ClassificationInterpretation class in the fastai library fetches the predicted images, compares those with the actual images and returns the most incorrect images.

These are the images which the network was confused about.

interp = ClassificationInterpretation.from_learner(learn)
losses,idxs = interp.top_losses()
len(data.valid_ds)==len(losses)==len(idxs)
interp.plot_top_losses(9, figsize=(15,11))
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)
interp.most_confused(min_val=2)

The plot_top_losses() is used to plot the top losses i.e the top images which the network failed to classify.

The plot_confusion_matrix() plots the matrix of predictions where the network made an error.

The most_confused() gives the number of classes where the network got confused.

Train Some More

As evident from the confusion metric, the network got confused about some of the images.

For example, if you look at the second image of the first row in the previous section then it’s not clear if that “thing” is recyclable or not.

Now, try to train the model some more and see if that improves the accuracy. To do this use the `Unfreeze()` method which allows you to train the backbone of the network.

Fit one cycle and then look at the error metric to assess the training result.

learn.unfreeze()
learn.fit_one_cycle(1)

The loss has reduced greatly now. Now, load the model using .load()and then find the learning rate using lr_find().

learn.load('stage-1');
learn.lr_find()
learn.recorder.plot()

The `.recorder.plot()` plots the learning rate. The plot above shows the rate at which the network learns to reduce the loss.

Train some more layers by first “unfreezing” and then fitting two cycles of training using `fit_one_cycle()`. Here the largest learning rate between 1e-6 and 1e-4 is passed because that’s when the slope of the learning rate plot started to get steep in the previous section.

learn.unfreeze()
learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-4))

The above output shows that the losses are now reduced and you have a pretty accurate model ready to be used for predictions.

First export the weights, metadata and the model to a location on your hard-drive using `.export()`. This saves the model as a file named “export.pkl”.

learn.path = Path("/kernel")
learn.export()

Woo Hoo! It’s Prediction Time

Congratulations! you have used fastai magic to successfully create and train your very first neural network. Now it’s time to put the model to use.

The steps that you need to follow for prediction looks as below-

You don’t need your gpu for prediction. So, ask fastai not to use gpu with this code — torch.device(‘cpu’)
Use open_image() to Pick any one of the images from the “TEST” folder.
Use load_learner(path) to load the saved model, make sure that the “export.pkl” exists in the “path”.
Use predict() to do the prediction. The predict() outputs three things — predicted class, label to which the class belongs, probability of the prediction.

testImageList = ImageList.from_folder(pathTest/'O')
image = open_image(testImageList.items[0])
image

model = load_learner(learn.path)
className, label, probability = model.predict(image)className, label, probability
-----
Output
-----
Category O, tensor(0), tensor([0.9983, 0.0017])

Conclusion

The model is pretty good at this stage, but there’s still scope of improvement here. I would love to see how much more you have improved this model and the code in this notebook. So, do give it a try and host your code at github and link it back in the comment section.

I have not used any mathematical terms here as I believe that teaching the concepts of applied machine learning should be more focused on the code that materializes the mathematics behind the design of these algorithms rather than teaching the math itself.

Feel free to fork and share the notebook for this project is available at github and the kaggle kernel is available here.

Announcement

I am so excited to announce that my first deep learning course is now available on Udemy . The course is available at 95% discount till 31st May midnight. Use this link to apply the discount code.