My first deep learning model using PyTorch

Pooja Mahajan
Published in Analytics Vidhya
5 min read · Jul 19, 2020

In continuation of my previous posts, Getting started with Deep Learning and Max Pooling, in this post I will build a simple convolutional neural network in PyTorch.

I will be using the Fashion MNIST (FMNIST) dataset, which is made up of images of articles from the online fashion retailer Zalando. It contains a training set of 60,000 images and a test set of 10,000 images. Each image is 28 x 28 pixels in size and is associated with a label from one of 10 classes. It is similar to the MNIST dataset, which is often called the "Hello World" of image recognition.

:D

1. Importing required libraries

  • torch: the core PyTorch package
  • torch.nn: the module PyTorch provides to create and train neural networks
  • torch.nn.functional: gives access to handy functions for direct use, for example ReLU, the "rectified linear" activation function for our neurons
  • torch.optim: a package implementing various optimisation algorithms
  • torchvision: a package of popular datasets, model architectures and common image transformations for computer vision
  • torchsummary: used to get a model summary in PyTorch
  • tqdm: a package used to display a progress bar for iterations

2. Loading dataset

DataLoaders in PyTorch make data loading easy: a DataLoader provides an iterable over the given dataset. In this exercise we use a batch size of 64 for both the training and test datasets.

Torchvision's transforms module gives access to various image transformations. In this case we convert each image to a tensor and then normalize it using a mean and standard deviation of 0.5 each, which maps pixel values from [0, 1] to [-1, 1].

Let's see how it looks!

We load one batch of the test dataset. The shape of one batch is [64, 1, 28, 28], which means we have 64 examples of 28x28 pixels in grayscale (a single channel rather than three RGB channels). We can plot some of them using matplotlib.
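Pulling one batch out of a loader could be sketched as follows. To keep the snippet independent of the download, a TensorDataset of random tensors stands in for the Fashion MNIST test set; that stand-in and the variable names are assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the Fashion MNIST test set: 256 random "images" with labels 0-9.
images = torch.randn(256, 1, 28, 28)
labels = torch.randint(0, 10, (256,))
test_loader = DataLoader(TensorDataset(images, labels), batch_size=64)

# Pull a single batch from the loader and inspect its shape.
example_data, example_targets = next(iter(test_loader))
print(example_data.shape)     # torch.Size([64, 1, 28, 28])
print(example_targets.shape)  # torch.Size([64])
```

With a real batch, a single image can be displayed with matplotlib via plt.imshow(example_data[0][0], cmap="gray").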

3. Building the network

Now let's go ahead and build our network. I have used blocks of 2-D convolutional layers with a MaxPool2d layer in between those blocks. As the activation function, I have chosen rectified linear units (ReLU). The number of convolutional layers is chosen so that the receptive field reaches at least the image size (28 in this case).

The forward() method defines how we compute the output from the input using the given sequence of layers and functions.
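The exact architecture from the original notebook is not reproduced in this post, so the network below is only an illustrative sketch of the same idea: two blocks of convolutions with a max-pooling layer between them, ReLU activations, and a log-softmax output to pair with the negative log-likelihood loss used later. The channel counts and the global average pooling at the end are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    """Illustrative CNN for 1x28x28 inputs (not the original architecture)."""

    def __init__(self):
        super().__init__()
        # Block 1: two 3x3 convolutions, 28x28 -> 26x26 -> 24x24
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3)
        # Max pooling between the blocks: 24x24 -> 12x12
        self.pool = nn.MaxPool2d(2, 2)
        # Block 2: two 3x3 convolutions, 12x12 -> 10x10 -> 8x8
        self.conv3 = nn.Conv2d(32, 32, kernel_size=3)
        self.conv4 = nn.Conv2d(32, 10, kernel_size=3)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        x = F.relu(self.conv3(x))
        x = self.conv4(x)                # one 8x8 map per class
        x = F.adaptive_avg_pool2d(x, 1)  # average each map down to 1x1
        x = x.view(-1, 10)               # flatten to (batch, 10)
        return F.log_softmax(x, dim=1)   # log-probabilities for the NLL loss

model = Net()
out = model(torch.randn(4, 1, 28, 28))
print(out.shape)  # torch.Size([4, 10])
```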

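You can calculate the output size after each layer with the formula n_out = floor((n_in + 2p - k) / s) + 1, where n_in is the input size, p the padding, k the kernel size and s the stride.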
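As a quick worked example, the standard output-size formula for convolution and pooling layers (without dilation) can be wrapped in a small helper. The function name conv_output_size is hypothetical.

```python
def conv_output_size(n_in, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation).

    Implements floor((n_in + 2 * padding - kernel_size) / stride) + 1.
    """
    return (n_in + 2 * padding - kernel_size) // stride + 1

print(conv_output_size(28, kernel_size=3))             # 26: 3x3 conv, no padding
print(conv_output_size(24, kernel_size=2, stride=2))   # 12: 2x2 max pooling
print(conv_output_size(28, kernel_size=3, padding=1))  # 28: padding keeps the size
```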

Model Summary

The model summary shows how many parameters are associated with each layer and what the output shape looks like after each layer. One point to note: torch.nn layers are modules that can hold trainable parameters, while the functions in torch.nn.functional are purely functional and hold no parameters of their own.
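To make the parameter counts concrete, here is a small sketch that counts trainable parameters by hand. The helper count_parameters is hypothetical; torchsummary reports comparable per-layer numbers in its table.

```python
import torch.nn as nn

def count_parameters(module):
    """Number of trainable parameters in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

conv = nn.Conv2d(1, 16, kernel_size=3)  # 16 filters of size 1x3x3, plus 16 biases
pool = nn.MaxPool2d(2, 2)               # pooling has nothing to learn

print(count_parameters(conv))  # 16 * 1 * 3 * 3 + 16 = 160
print(count_parameters(pool))  # 0

# With torchsummary installed, a per-layer table for a model that takes
# 1x28x28 inputs can be printed with:
#   from torchsummary import summary
#   summary(model, input_size=(1, 28, 28))
# (pass device="cpu" if the model lives on the CPU).
```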

Training a PyTorch neural network on a GPU is easy. Fortunately for us, Google Colab gives us free access to a GPU. CUDA is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on its GPUs. When using a GPU, we specify the CUDA device and send the model, all of the inputs and the targets to it.
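Selecting the device and moving tensors to it typically looks like the sketch below. A single convolutional layer stands in for the full network here, and the fallback to the CPU is an addition so the snippet also runs on machines without a GPU.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Conv2d(1, 16, kernel_size=3).to(device)  # stand-in for the full network
data = torch.randn(64, 1, 28, 28).to(device)        # a batch of inputs
target = torch.randint(0, 10, (64,)).to(device)     # the matching labels

output = model(data)  # runs on whichever device was selected
print(device, output.device)
```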

4. Training the model

The first thing we need to do is switch to training mode using model.train(). In every epoch we iterate over all of the training data, with the DataLoader handling the loading of individual batches.

In PyTorch, we need to set the gradients to zero because PyTorch accumulates gradients across subsequent backward passes. For this reason, you should zero out the gradients at the start of each training iteration so that the parameters are updated correctly.
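The accumulation behaviour is easy to see on a single scalar parameter. This is a toy illustration, not part of the original notebook; the variable names are arbitrary.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

(3 * w).backward()
first = w.grad.clone()
print(first)    # tensor(3.)

(3 * w).backward()
second = w.grad.clone()
print(second)   # tensor(6.): the second gradient was added to the first

w.grad.zero_()  # reset by hand; in a training loop optimizer.zero_grad() clears them
print(w.grad)   # tensor(0.)
```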

We then produce the output of our network (the forward pass) and compute the negative log-likelihood loss between the output and the ground-truth labels. loss.backward() computes the gradient of the loss with respect to each of the network's parameters, and optimizer.step() then uses those gradients to update the parameters. The optimizer's job is to determine how the network's weights are updated based on the gradients of the loss function.
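Put together, one training epoch might be sketched as follows. The function signature, the tiny stand-in model and the random data are assumptions made so the snippet runs on its own; the original notebook's loop may differ in its details (for example, it may wrap the loader in a tqdm progress bar).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def train(model, device, train_loader, optimizer):
    """Run one epoch of training and return the loss of the last batch."""
    model.train()                          # switch to training mode
    for data, target in train_loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()              # clear gradients from the previous step
        output = model(data)               # forward pass: log-probabilities
        loss = F.nll_loss(output, target)  # negative log-likelihood loss
        loss.backward()                    # compute gradients of the loss
        optimizer.step()                   # update the weights
    return loss.item()

# Stand-in model and random data so the sketch is self-contained.
device = torch.device("cpu")
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10), nn.LogSoftmax(dim=1))
loader = DataLoader(
    TensorDataset(torch.randn(128, 1, 28, 28), torch.randint(0, 10, (128,))),
    batch_size=64,
)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
last_loss = train(model, device, loader, optimizer)
print(last_loss)
```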

model.eval(), or equivalently model.train(mode=False), switches the model to evaluation mode for testing. In our test loop we keep track of the number of correct predictions. The torch.no_grad() context manager disables gradient calculation, which is not needed during evaluation.
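A test loop along these lines could be sketched as below. The function name evaluate, the stand-in model and the random data are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, device, test_loader):
    """Return (average loss, accuracy) over the whole test loader."""
    model.eval()                                   # switch to evaluation mode
    total_loss, correct = 0.0, 0
    with torch.no_grad():                          # no gradients needed here
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            # Sum the per-example losses so we can average over the dataset.
            total_loss += F.nll_loss(output, target, reduction="sum").item()
            pred = output.argmax(dim=1)            # most likely class per example
            correct += (pred == target).sum().item()
    n = len(test_loader.dataset)
    return total_loss / n, correct / n

# Stand-in model and random data so the sketch is self-contained.
device = torch.device("cpu")
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10), nn.LogSoftmax(dim=1))
loader = DataLoader(
    TensorDataset(torch.randn(128, 1, 28, 28), torch.randint(0, 10, (128,))),
    batch_size=64,
)
avg_loss, accuracy = evaluate(model, device, loader)
print(f"loss {avg_loss:.4f}, accuracy {accuracy:.2%}")
```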

Now, finally, let's train this model and see how it works. Here I have trained the model for 3 epochs with a learning rate of 0.01 and momentum set to 0.9.
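These settings could be wired up roughly as in the sketch below. SGD is an assumption here, since the text gives only the learning rate and momentum; the stand-in model and the single synthetic batch per epoch are likewise only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)
# Stand-in model and one synthetic batch (assumptions, not the real network or data).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10), nn.LogSoftmax(dim=1))
data = torch.randn(64, 1, 28, 28)
target = torch.randint(0, 10, (64,))

# SGD is assumed; the learning rate and momentum are the values quoted in the text.
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

EPOCHS = 3
history = []
for epoch in range(1, EPOCHS + 1):
    model.train()
    optimizer.zero_grad()
    loss = F.nll_loss(model(data), target)
    loss.backward()
    optimizer.step()
    history.append(loss.item())
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```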

So that's it: we have implemented our first neural network in PyTorch, using the simple concepts of convolution and max pooling as its building blocks.

This post was meant to give you an idea of how to write a neural network using PyTorch's functionality. I used Google Colab for this exercise. You can find the related code in this repository:

We will dig deeper into the nitty-gritty details of these in coming posts. Till then, stay safe!


Pooja Mahajan
Analytics Vidhya

Data Scientist. Passionate about problem-solving and creating impactful solutions through AI/ML. LinkedIn — https://www.linkedin.com/in/pooja-mahajan-69b38a98/.