Handwritten Digit Classification using Deep Learning on the MNIST Dataset

Aditya Singh Rathore
Nov 5

Hi,

Today I will be training a model that recognizes handwritten digits from 0 to 9 using the MNIST dataset. I will be using the fast.ai library to train a deep learning model on this dataset.

Getting the Dataset:

path = untar_data(URLs.MNIST)

The function untar_data takes a URL, downloads and extracts the dataset, and returns its path.

To check the URL for a particular dataset, use the below:

URLs.MNIST

We can check the contents of the dataset using path.ls()

Below is the output:

[PosixPath('/content/data/mnist_png/training'),
 PosixPath('/content/data/mnist_png/testing')]

As we can see there are 2 directories for training and testing datasets.

We have to keep in mind that we don't have a separate validation dataset, so we will create one with the code below. Here I am holding out 20% of the data for validation.

data = ImageDataBunch.from_folder(path=path, train='training', test='testing', valid_pct=0.2)
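For intuition, the 80/20 hold-out that valid_pct performs can be sketched in plain Python. This is purely illustrative (the function name and file list are made up) and is not fastai's implementation:

```python
import random

def train_valid_split(items, valid_pct=0.2, seed=42):
    """Randomly hold out valid_pct of the items as a validation set."""
    rng = random.Random(seed)
    items = list(items)
    rng.shuffle(items)
    n_valid = int(len(items) * valid_pct)
    return items[n_valid:], items[:n_valid]  # (train, valid)

files = [f"img_{i}.png" for i in range(100)]
train, valid = train_valid_split(files, valid_pct=0.2)
print(len(train), len(valid))  # 80 20
```

The seed makes the split reproducible, so the same images land in the validation set every run.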

Let’s visualize the data that we just created:

data.show_batch(rows=7, figsize=(8,5))

Output

To check the total number of categories: data.c

To check the category names: data.classes

Now let's train a model using transfer learning.

learn = cnn_learner(data, base_arch=models.resnet34, metrics=[error_rate, accuracy])

cnn_learner is a function in the fastai.vision.learner module.

We specify the data to use, the pretrained ResNet-34 as the base architecture, and error rate and accuracy as the metrics for judging model performance.

Now we will train with the One Cycle Policy, a learning-rate schedule that ramps the learning rate up and then back down over each cycle. The argument we pass to fit_one_cycle is the number of epochs, i.e. how many times the model will look at every training example. With each epoch the model gets better at recognising the training examples, so one should watch out for overfitting. Here I will train for 4 epochs.

learn.fit_one_cycle(4)
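The shape of the one-cycle schedule can be sketched in plain Python. This is a simplified, piecewise-linear version for intuition only; fastai's actual schedule uses a different curve (and also cycles momentum):

```python
def one_cycle_lr(step, total_steps, max_lr=1e-3, pct_warmup=0.3):
    """Simplified one-cycle sketch: ramp linearly up to max_lr,
    then anneal linearly back toward zero."""
    warmup = int(total_steps * pct_warmup)
    if step < warmup:
        return max_lr * step / warmup                      # warm-up phase
    frac = (step - warmup) / (total_steps - warmup)        # annealing phase
    return max_lr * (1 - frac)

total = 100
lrs = [one_cycle_lr(s, total) for s in range(total)]
print(min(lrs), max(lrs))  # starts at 0, peaks at max_lr
```

The low-then-high-then-low learning rate lets training explore quickly in the middle of the cycle, then settle into a good minimum at the end.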

Output

Training produces a set of weights, which we save with the code below:

learn.save('stage1')

Now I will use the ClassificationInterpretation class to generate a confusion matrix and plot the misclassified images. I pass the trained learner to it.

inter = ClassificationInterpretation.from_learner(learn)

Now I will plot the images with the highest losses:

inter.plot_top_losses(9, figsize=(15, 11))

Output

Now I will plot the confusion matrix:

inter.plot_confusion_matrix()

Output

The confusion matrix tells you, for each actual digit, how many times the model predicted it as each class: correct predictions sit on the diagonal, mistakes everywhere else.
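The underlying computation is simple enough to sketch in plain Python (illustrative only — fastai builds this for you from the validation predictions):

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """m[i][j] = number of times actual class i was predicted as class j."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

y_true = [0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 2, 2, 2, 1]
m = confusion_matrix(y_true, y_pred, 3)
print(m)  # [[1, 0, 0], [0, 1, 1], [0, 1, 2]]
```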

You can check which class pairs the model confused most often — each entry below reads (actual, predicted, count) — by using the below:

inter.most_confused(min_val=2)

Below is just a part of a long list:

[('2', '8', 30),
 ('5', '3', 22),
 ('7', '9', 21),
 ('3', '5', 15),
 ('8', '9', 15),
 ('3', '8', 14),
 ('5', '8', 14),
 ('4', '9', 13),
 ('8', '2', 13),
 ('9', '8', 13),
 ('2', '3', 12),
 ('9', '7', 12),
 ('3', '2', 11),
 ('4', '7', 9),
 ...]
Now I will run the learning rate finder with learn.lr_find(), so I can choose a good learning rate, and plot the result using the below:

learn.recorder.plot()

plt.title("Loss vs Learning Rate")

Output

The learning rate controls how fast we update the parameters of our model. This is the point where we fine-tune: we will attempt to improve the model by choosing the learning rate carefully. Here we can see that after about 1e-4 the loss keeps increasing, so we will select our range from 1e-6 to 1e-4.
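Reading the plot by eye amounts to finding where the loss is falling fastest before it blows up. A rough plain-Python sketch of that "steepest descent" heuristic (illustrative only — not fastai's suggestion logic; the sample curve is made up):

```python
def suggest_lr(lrs, losses):
    """Return the lr where the loss dropped the most between
    consecutive (log-spaced) learning-rate samples."""
    drops = [losses[i] - losses[i - 1] for i in range(1, len(losses))]
    i = min(range(len(drops)), key=lambda k: drops[k])  # most negative drop
    return lrs[i + 1]

lrs    = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2]
losses = [2.30, 2.25, 1.90, 2.40, 5.00]
print(suggest_lr(lrs, losses))  # 0.0001 — loss fell fastest reaching 1e-4
```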

Now we will use the above information to train all the layers of our model, not just the head we added on top of the ImageNet-pretrained backbone. To do that we use the code below:

learn.unfreeze(): this makes every layer of the model trainable.

Now we will fit the model again, this time with our learning rate range included:

learn.fit_one_cycle(2, max_lr=slice(1e-6, 1e-4))

slice() is a built-in Python object; fastai interprets it here as a range of learning rates to spread across the model's layer groups, meaning the earliest layers train at 1e-6, the final layers at 1e-4, and the groups in between get intermediate values.
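One plausible way to spread rates across layer groups is geometric interpolation between the two endpoints. A plain-Python sketch for intuition (fastai's exact spacing may differ):

```python
def spread_lrs(lo, hi, n_groups):
    """Geometrically interpolate learning rates from lo (earliest
    layer group) to hi (final layer group)."""
    if n_groups == 1:
        return [hi]
    ratio = (hi / lo) ** (1 / (n_groups - 1))
    return [lo * ratio ** i for i in range(n_groups)]

print(spread_lrs(1e-6, 1e-4, 3))  # approximately [1e-06, 1e-05, 1e-04]
```

Early layers learn generic features (edges, strokes) that transfer well from ImageNet, so they get tiny updates; the later, task-specific layers get the larger rate.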

And as expected, our model is more accurate:

Output

So this concludes my task. If you liked it, please give it some claps. Thank you, and have a nice day!

