PyTorch Convolutional Neural Network With MNIST Dataset

Nutan
7 min read · May 21, 2021


We are going to use PyTorch to create a CNN model step by step. Then we will train the model with the training data and evaluate it with the test data.

Import libraries

import torch

Check available device

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

Output: device(type='cpu')
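The rest of this article runs on the CPU, but the device object is normally used to move the model and each batch of data onto the GPU when one is available. A minimal sketch, using a hypothetical stand-in module (net) rather than the CNN we define later:

import torch
import torch.nn as nn

net = nn.Linear(10, 2)                   # stand-in module; the CNN defined later works the same way
net = net.to(device)                     # move the model's parameters to the chosen device
images = torch.randn(4, 10).to(device)   # move each input batch to the same device before the forward pass
print(net(images).device)                # the output is produced on that device as well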

Download MNIST dataset

What is MNIST dataset?

The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning.

The MNIST database contains 60,000 training images and 10,000 testing images.

PyTorch domain libraries provide a number of pre-loaded datasets (such as FashionMNIST and MNIST) that subclass torch.utils.data.Dataset and implement functions specific to the particular data. They can be used to prototype and benchmark your model. In this example we are using the MNIST dataset.

Download MNIST dataset in local system

from torchvision import datasets
from torchvision.transforms import ToTensor
train_data = datasets.MNIST(
    root = 'data',
    train = True,
    transform = ToTensor(),
    download = True,
)
test_data = datasets.MNIST(
    root = 'data',
    train = False,
    transform = ToTensor()
)

Print train_data and test_data size

print(train_data)

Output: a summary of the train_data Dataset object (60,000 datapoints, root location data, split Train, transform ToTensor()).

print(test_data)

Output: a summary of the test_data Dataset object (10,000 datapoints, split Test).

print(train_data.data.size())

Output: torch.Size([60000, 28, 28])

print(train_data.targets.size())

Output: torch.Size([60000])

Visualization of MNIST dataset

Plot one train_data

import matplotlib.pyplot as plt

plt.imshow(train_data.data[0], cmap='gray')
plt.title('%i' % train_data.targets[0])
plt.show()

Output: a grayscale plot of the first training image, with its label shown as the plot title.

Plot multiple train_data

figure = plt.figure(figsize=(10, 8))
cols, rows = 5, 5
for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(train_data), size=(1,)).item()
    img, label = train_data[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(label)
    plt.axis("off")
    plt.imshow(img.squeeze(), cmap="gray")
plt.show()

Output: a 5 × 5 grid of randomly sampled training digits, each titled with its label.

Preparing data for training with DataLoaders

The Dataset retrieves our dataset’s features and labels one sample at a time. While training a model, we typically want to pass samples in “minibatches”, reshuffle the data at every epoch to reduce model overfitting, and use Python’s multiprocessing to speed up data retrieval.

DataLoader is an iterable that abstracts this complexity for us in an easy API.

from torch.utils.data import DataLoader

loaders = {
    'train': DataLoader(train_data,
                        batch_size=100,
                        shuffle=True,
                        num_workers=1),

    'test': DataLoader(test_data,
                       batch_size=100,
                       shuffle=True,
                       num_workers=1),
}
loaders
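As a quick sanity check (not part of the original output), we can pull one batch from the train loader and inspect its shape. With batch_size=100 and 28 x 28 grayscale images, the images tensor should be [100, 1, 28, 28] and the labels tensor [100]:

images, labels = next(iter(loaders['train']))
print(images.shape)   # torch.Size([100, 1, 28, 28])
print(labels.shape)   # torch.Size([100])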

Define the Convolutional Neural Network model

A Convolutional Neural Network is a type of neural network that is used mainly in image processing applications.

Let us create a convolutional neural network using torch.nn.Module, the base class for all neural network modules. We will use two convolutional layers, the ReLU activation function, and max pooling, followed by a fully connected output layer.

Conv2d: Applies a 2D convolution over an input signal composed of several input planes.

Parameters

in_channels (int) — Number of channels in the input image

out_channels (int) — Number of channels produced by the convolution

kernel_size (int or tuple) — Size of the convolving kernel

stride (int or tuple, optional) — Stride of the convolution. Default: 1

padding (int or tuple, optional) — Zero-padding added to both sides of the input. Default: 0

padding_mode (string, optional) — ‘zeros’, ‘reflect’, ‘replicate’ or ‘circular’. Default: ‘zeros’

dilation (int or tuple, optional) — Spacing between kernel elements. Default: 1

groups (int, optional) — Number of blocked connections from input channels to output channels. Default: 1

bias (bool, optional) — If True, adds a learnable bias to the output. Default: True

import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(
                in_channels=1,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2,
            ),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2),
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(16, 32, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # fully connected layer, output 10 classes
        self.out = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        # flatten the output of conv2 to (batch_size, 32 * 7 * 7)
        x = x.view(x.size(0), -1)
        output = self.out(x)
        return output, x    # return x for visualization

in_channels=1: because our input is a grayscale image.

stride: the number of pixels the kernel moves at a time as it slides over the image.

padding: zero padding added around the border of the image so that the output keeps exactly the same spatial size as the input.

kernel_size: the kernel is a small 5 × 5 matrix of weights. To perform the convolution, we slide the kernel over the image horizontally and vertically and take the dot product of the kernel with the patch of the image underneath it.
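With these settings the spatial sizes work out as follows (a quick sanity check using the standard formula output = (input + 2 * padding - kernel) / stride + 1), which is exactly why the fully connected layer expects 32 * 7 * 7 features:

def conv_out(size, kernel=5, stride=1, padding=2):
    return (size + 2 * padding - kernel) // stride + 1

size = 28
size = conv_out(size) // 2     # conv1 keeps 28 x 28, MaxPool2d(2) halves it to 14 x 14
size = conv_out(size) // 2     # conv2 keeps 14 x 14, MaxPool2d(2) halves it to 7 x 7
print(size, 32 * size * size)  # 7 1568 -> matches nn.Linear(32 * 7 * 7, 10)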

The forward() pass defines the way we compute our output using the given layers and functions.

cnn = CNN()
print(cnn)

Output: the printed model architecture, showing the conv1 and conv2 Sequential blocks and the final Linear layer (32 * 7 * 7 = 1568 inputs, 10 outputs).
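Before training, it is also worth pushing a dummy batch through the untrained network to confirm the shapes it returns (a quick check, not in the original article):

dummy = torch.randn(1, 1, 28, 28)   # one fake grayscale 28 x 28 image
logits, features = cnn(dummy)
print(logits.shape)                 # torch.Size([1, 10]) - one raw score per digit class
print(features.shape)               # torch.Size([1, 1568]) - the flattened conv2 output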

Define loss function

loss_func = nn.CrossEntropyLoss()   
loss_func

Output: CrossEntropyLoss()
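nn.CrossEntropyLoss combines LogSoftmax and NLLLoss, so it expects the raw logits from the network (no softmax layer) together with integer class labels. A tiny illustration with made-up values:

logits = torch.tensor([[2.0, 0.5, -1.0, 0.0, 0.3, 0.1, -0.2, 0.0, 0.4, -0.5]])   # raw scores for the 10 classes
target = torch.tensor([0])           # the true class index
print(loss_func(logits, target))     # a scalar loss tensor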

Define an optimizer

lr (learning rate): controls how large a step the optimizer takes when updating the weights after each back-propagation pass.

from torch import optim

optimizer = optim.Adam(cnn.parameters(), lr = 0.01)
optimizer

Output: the Adam optimizer and its hyperparameters (including lr = 0.01).

Train the model

Create a function called train() that takes the number of epochs, the model, and the data loaders as input parameters.

num_epochs: the number of times the model goes through the entire training dataset.

from torch.autograd import Variable

num_epochs = 10

def train(num_epochs, cnn, loaders):

    # put the model into training mode
    cnn.train()

    # Train the model
    total_step = len(loaders['train'])

    for epoch in range(num_epochs):
        for i, (images, labels) in enumerate(loaders['train']):

            # batch data (already scaled to [0, 1] by ToTensor);
            # Variable is a no-op wrapper in modern PyTorch, the tensors could be used directly
            b_x = Variable(images)   # batch x
            b_y = Variable(labels)   # batch y

            output = cnn(b_x)[0]     # the model returns (logits, features); use the logits
            loss = loss_func(output, b_y)

            # clear gradients for this training step
            optimizer.zero_grad()

            # backpropagation, compute gradients
            loss.backward()

            # apply gradients
            optimizer.step()

            if (i + 1) % 100 == 0:
                print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
                      .format(epoch + 1, num_epochs, i + 1, total_step, loss.item()))

train(num_epochs, cnn, loaders)

Output:

Epoch [1/10], Step [100/600], Loss: 0.0725
Epoch [1/10], Step [200/600], Loss: 0.2758
Epoch [1/10], Step [300/600], Loss: 0.0742
Epoch [1/10], Step [400/600], Loss: 0.0744
Epoch [1/10], Step [500/600], Loss: 0.0035
Epoch [1/10], Step [600/600], Loss: 0.1458
Epoch [2/10], Step [100/600], Loss: 0.0281
Epoch [2/10], Step [200/600], Loss: 0.0584
Epoch [2/10], Step [300/600], Loss: 0.0605
Epoch [2/10], Step [400/600], Loss: 0.1782
Epoch [2/10], Step [500/600], Loss: 0.0324
Epoch [2/10], Step [600/600], Loss: 0.0918
Epoch [3/10], Step [100/600], Loss: 0.0430
Epoch [3/10], Step [200/600], Loss: 0.0368
Epoch [3/10], Step [300/600], Loss: 0.0009
Epoch [3/10], Step [400/600], Loss: 0.0647
Epoch [3/10], Step [500/600], Loss: 0.0370
Epoch [3/10], Step [600/600], Loss: 0.0286
Epoch [4/10], Step [100/600], Loss: 0.0905
Epoch [4/10], Step [200/600], Loss: 0.0638
Epoch [4/10], Step [300/600], Loss: 0.0238
Epoch [4/10], Step [400/600], Loss: 0.0564
Epoch [4/10], Step [500/600], Loss: 0.0117
Epoch [4/10], Step [600/600], Loss: 0.0069
Epoch [5/10], Step [100/600], Loss: 0.0014
Epoch [5/10], Step [200/600], Loss: 0.0449
Epoch [5/10], Step [300/600], Loss: 0.0050
Epoch [5/10], Step [400/600], Loss: 0.0534
Epoch [5/10], Step [500/600], Loss: 0.0100
Epoch [5/10], Step [600/600], Loss: 0.1055
Epoch [6/10], Step [100/600], Loss: 0.1483
Epoch [6/10], Step [200/600], Loss: 0.0018
Epoch [6/10], Step [300/600], Loss: 0.0101
Epoch [6/10], Step [400/600], Loss: 0.0188
Epoch [6/10], Step [500/600], Loss: 0.0239
Epoch [6/10], Step [600/600], Loss: 0.0323
Epoch [7/10], Step [100/600], Loss: 0.0085
Epoch [7/10], Step [200/600], Loss: 0.0767
Epoch [7/10], Step [300/600], Loss: 0.0313
Epoch [7/10], Step [400/600], Loss: 0.0518
Epoch [7/10], Step [500/600], Loss: 0.0098
Epoch [7/10], Step [600/600], Loss: 0.1183
Epoch [8/10], Step [100/600], Loss: 0.1086
Epoch [8/10], Step [200/600], Loss: 0.0024
Epoch [8/10], Step [300/600], Loss: 0.0949
Epoch [8/10], Step [400/600], Loss: 0.0502
Epoch [8/10], Step [500/600], Loss: 0.0689
Epoch [8/10], Step [600/600], Loss: 0.0637
Epoch [9/10], Step [100/600], Loss: 0.0540
Epoch [9/10], Step [200/600], Loss: 0.0826
Epoch [9/10], Step [300/600], Loss: 0.0013
Epoch [9/10], Step [400/600], Loss: 0.0168
Epoch [9/10], Step [500/600], Loss: 0.0046
Epoch [9/10], Step [600/600], Loss: 0.0419
Epoch [10/10], Step [100/600], Loss: 0.0141
Epoch [10/10], Step [200/600], Loss: 0.1218
Epoch [10/10], Step [300/600], Loss: 0.0629
Epoch [10/10], Step [400/600], Loss: 0.0056
Epoch [10/10], Step [500/600], Loss: 0.0661
Epoch [10/10], Step [600/600], Loss: 0.0096

Evaluate the model on test data

We must call model.eval() to set dropout and batch normalization layers to evaluation mode before running inference.

model.train() tells the model that it is being trained, so layers such as dropout and batch normalization, which behave differently during training and evaluation, know what is going on and can act accordingly.

You can call either model.eval() or model.train(mode=False) to signal that you are testing the model.
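Our CNN contains neither dropout nor batch normalization layers, so here eval() only flips the module's training flag, but it is still good practice. A quick illustration:

cnn.eval()
print(cnn.training)   # False - layers would now use their inference behaviour
cnn.train()
print(cnn.training)   # True  - back to training behaviour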

def test():
    # Test the model
    cnn.eval()
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in loaders['test']:
            test_output, last_layer = cnn(images)
            pred_y = torch.max(test_output, 1)[1].data.squeeze()
            correct += (pred_y == labels).sum().item()   # accumulate over all batches,
            total += labels.size(0)                      # not just the last one
    accuracy = correct / total
    print('Test Accuracy of the model on the 10000 test images: %.2f' % accuracy)

test()

Output: Test Accuracy of the model on the 10000 test images: 0.97

Print 10 predictions from test data

sample = next(iter(loaders['test']))
imgs, lbls = sample

actual_number = lbls[:10].numpy()
actual_number

Output: array([3, 7, 3, 2, 4, 0, 3, 9, 6, 0])

test_output, last_layer = cnn(imgs[:10])
pred_y = torch.max(test_output, 1)[1].data.numpy().squeeze()
print(f'Prediction number: {pred_y}')
print(f'Actual number: {actual_number}')

Output:

Prediction number: [3 7 3 2 4 0 3 9 6 0]
Actual number: [3 7 3 2 4 0 3 9 6 0]
