PyTorch 1.2 Quickstart with Google Colab

In this tutorial, we will learn how to quickly train a deep learning model and, along the way, get to know some of PyTorch’s basic building blocks.

elvis
DAIR.AI
6 min read · Aug 26, 2019


Following the success of previous deep learning tutorials like “Building RNNs is Fun with PyTorch and Google Colab” and “A Simple Neural Network from Scratch with PyTorch and Google Colab”, I am excited to introduce a new series of tutorials on all things PyTorch and deep learning.

In this first code tutorial, we will learn how to quickly train a deep learning model and, in the process, understand some of PyTorch’s basic building blocks. This notebook is inspired by the “TensorFlow 2.0 Quickstart for experts” notebook.

After completing this tutorial, you should be able to import data, transform it, and efficiently feed it in batches to a convolutional neural network (CNN) model for image classification.

A new feature in this series is the introduction of exercises. Look out for the 📝 icon and try to complete these exercises, which are carefully designed to help you further understand and research important machine learning concepts. If you complete them, send me a link to your solutions and I will feature the best or most creative ones on this blog post and my other channels. Your solution can be a GitHub repo or a Google Colab notebook.

Let’s Get Started!

I am using PyTorch 1.2.0, the latest release at the time of writing. We are using Google Colab to run all our code, and I have provided a link to the notebook at the end of this post. Let’s install this version of PyTorch:
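The install cell did not survive the export; a minimal sketch of what it might look like in Colab (pinning torchvision 0.4.0, the companion release for torch 1.2.0):

```python
# In Colab, a leading "!" runs a shell command from a notebook cell.
# Pin torch 1.2.0 together with its companion torchvision release.
!pip install torch==1.2.0 torchvision==0.4.0
```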

Now let’s import the necessary libraries:
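The import cell was also lost; these are the modules the rest of the tutorial relies on, under the conventional aliases:

```python
import numpy as np
import matplotlib.pyplot as plt

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
```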

Importing the Data

The first step before training the model is to import the data. We will use the MNIST dataset, which is something like the “Hello, World!” dataset of machine learning.

Besides importing the data, we will also do a few more things:

  • We will transform the data into tensors using the transforms module.
  • We will use DataLoader to build convenient data loaders (also referred to as iterators), which make it easy to efficiently feed data in batches to deep learning models.
  • As hinted above, we will also create batches of the data by setting the batch_size parameter of the data loader. Notice we use a batch size of 32 in this tutorial, but you can change it to any size you like. I encourage you to experiment with different batch sizes such as 64 and 128 (see the sketch after this list).
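The data-loading code was lost in the export; here is a sketch that follows the steps above (the names trainset, trainloader, testset, and testloader are my choice):

```python
BATCH_SIZE = 32

# Convert PIL images into tensors with values scaled to [0, 1].
transform = transforms.Compose([transforms.ToTensor()])

# Download MNIST and wrap each split in a DataLoader that yields batches of 32.
trainset = torchvision.datasets.MNIST(root='./data', train=True,
                                      download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=BATCH_SIZE,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.MNIST(root='./data', train=False,
                                     download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=BATCH_SIZE,
                                         shuffle=False, num_workers=2)
```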

Exploring the Data

As a practitioner and researcher, I always spend some time and effort exploring and understanding my datasets. It’s fun, and it’s good practice to ensure that everything is in order.

Let’s check what the train and test datasets contain. I will use matplotlib to print out some of the images from our dataset.
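The exploration code did not survive the export; a minimal sketch that prints each dataset’s summary and plots a few training digits:

```python
# Print the summary that torchvision datasets expose.
print(trainset)
print(testset)

# Plot the first six images of a training batch with their labels.
images, labels = next(iter(trainloader))
fig, axes = plt.subplots(1, 6, figsize=(10, 2))
for i, ax in enumerate(axes):
    ax.imshow(images[i].squeeze().numpy(), cmap='gray')
    ax.set_title(int(labels[i]))
    ax.axis('off')
plt.show()
```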

The code above prints a summary of each dataset and displays a row of sample digits with their labels.

📝EXERCISE: Try to figure out what the code above is doing. This will help you to better understand and explore your dataset before moving forward.

Let’s check the dimensions of a batch:
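A sketch of that check: grab one batch from the loader and print its shapes.

```python
for images, labels in trainloader:
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break  # one batch is enough
```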

You should see something like this as output:

Image batch dimensions: torch.Size([32, 1, 28, 28]) 
Image label dimensions: torch.Size([32])

Our images are 28 x 28 pixels (height x width), and each batch contains 32 samples.

The Model

Following a canonical deep learning pipeline, let’s build a model with a single 2D convolutional layer.

Here are a few notes for PyTorch beginners:

  • The model below consists of an __init__() method, which is where you define the layers and components of the neural network. In our model, we have a convolutional layer declared with nn.Conv2d(...). We are dealing with a grayscale image dataset, so we only need one channel going in, hence in_channels=1. To learn a richer representation of the input, we use out_channels=32. The kernel size is 3, and for the rest of the parameters we use the default values, which you can find here.
  • We use two back-to-back dense layers, i.e., linear transformations of the incoming data. Notice that for d1 there is a dimension that looks like it came out of nowhere 😕. 128 is the output size we want, and 26*26*32 is the dimension of the incoming data. If you would like to see how to calculate those numbers, refer to the PyTorch documentation. In short, the convolutional layer transforms the input into a specific shape that has to be accounted for in the next dense layer. The same applies to the second dense layer (d2): the output dimension of the previous linear layer becomes in_features=128, and 10 is the output size, which also corresponds to the number of classes.
  • After each of those layers, we apply an activation function, ReLU in our case. For prediction purposes, we then apply a softmax layer to the last transformation and return its output.

Here is the code for the model:
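The code block itself was lost in the export; this sketch reconstructs it from the description above (the class name MyModel is my choice; the layer names self.conv, self.d1, and self.d2 follow the text):

```python
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()

        # Grayscale input (1 channel) -> 32 feature maps, 3x3 kernel,
        # default stride=1, padding=0, dilation=1.
        self.conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)

        # The conv output is 32 feature maps of 26 x 26 (see the formula below).
        self.d1 = nn.Linear(in_features=26 * 26 * 32, out_features=128)
        self.d2 = nn.Linear(in_features=128, out_features=10)

    def forward(self, x):
        # 32 x 1 x 28 x 28 -> 32 x 32 x 26 x 26
        x = F.relu(self.conv(x))

        # Flatten everything but the batch dimension: 32 x (26*26*32)
        x = x.flatten(start_dim=1)

        # 32 x (26*26*32) -> 32 x 128
        x = F.relu(self.d1(x))

        # 32 x 128 -> 32 x 10, then softmax for prediction purposes.
        out = F.softmax(self.d2(x), dim=1)
        return out
```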

Here is the formula to calculate the output of the self.conv layer:
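The formula image did not survive the export; from the PyTorch Conv2d documentation, the output height and width are:

$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor$$

$$W_{out} = \left\lfloor \frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1 \right\rfloor$$

In our case, padding = 0, dilation = 1, and stride = 1.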

📝EXERCISE: Try to work out the math and confirm that H_out and W_out correspond to the values we are using as input feature size to the self.d1 linear layer.

As in my previous tutorials, I always encourage you to test the model with one batch to ensure that the output dimensions are what we expect.
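A sketch of that sanity check: feed one training batch through an untrained instance of the model and print the shapes going in and out.

```python
model = MyModel()
for images, labels in trainloader:
    print('batch size:', images.shape)
    out = model(images)
    print(out.shape)
    break
```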

The output is as follows and everything looks good:

batch size: torch.Size([32, 1, 28, 28]) 
torch.Size([32, 10])

Training the Model

Now we are ready to train the model, but first we will define a loss function, an optimizer, and our own function to compute the accuracy of the model.
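That code block was lost in the export; the sketch below is consistent with the five epochs and the accuracy format in the output further down. The learning rate, the choice of Adam, and the shape of the get_accuracy helper are my assumptions:

```python
learning_rate = 0.001  # assumed hyperparameter
num_epochs = 5         # matches the five epochs in the output below

# Use the GPU if Colab has given us one.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = MyModel().to(device)

# CrossEntropyLoss applies log-softmax internally.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

def get_accuracy(output, target, batch_size):
    """Percentage of predictions in the batch that match the target."""
    corrects = (torch.max(output, 1)[1].view(target.size()) == target).sum()
    return 100.0 * corrects.item() / batch_size
```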

Now it’s time for training.
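A sketch of the training loop under the same assumptions. One aside on the numbers below: because the model already returns softmax probabilities and CrossEntropyLoss applies log-softmax again, the loss plateaus near -log(e/(e+9)) ≈ 1.46 instead of approaching zero; accuracy is unaffected.

```python
for epoch in range(num_epochs):
    train_running_loss = 0.0
    train_acc = 0.0

    model.train()
    for i, (images, labels) in enumerate(trainloader):
        images, labels = images.to(device), labels.to(device)

        # Forward pass and loss.
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass and parameter update.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        train_running_loss += loss.detach().item()
        train_acc += get_accuracy(outputs, labels, BATCH_SIZE)

    print('Epoch: %d | Loss: %.4f | Train Accuracy: %.2f'
          % (epoch, train_running_loss / (i + 1), train_acc / (i + 1)))
```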

The output after training:

Epoch: 0 | Loss: 1.4901 | Train Accuracy: 96.97 
Epoch: 1 | Loss: 1.4808 | Train Accuracy: 97.90
Epoch: 2 | Loss: 1.4767 | Train Accuracy: 98.34
Epoch: 3 | Loss: 1.4748 | Train Accuracy: 98.55
Epoch: 4 | Loss: 1.4725 | Train Accuracy: 98.81

Our model seems to be training well: its accuracy is high and the loss keeps decreasing.

We can also compute accuracy on the testing dataset to see how well the model performs on the image classification task. As you can see below, our basic CNN model is performing very well on the MNIST classification task.
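A sketch of the evaluation loop (note that the last test batch has fewer than 32 samples, so we pass labels.size(0) as the per-batch size):

```python
test_acc = 0.0
model.eval()
with torch.no_grad():  # no gradients needed for evaluation
    for i, (images, labels) in enumerate(testloader):
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        test_acc += get_accuracy(outputs, labels, labels.size(0))

print('Test Accuracy: %.2f' % (test_acc / (i + 1)))
```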

The corresponding output:

Test Accuracy: 98.04

📝EXERCISE: As practice, try to embed the test code above inside the training loop where we compute the training accuracy, so that you can keep evaluating the model on the test data as training proceeds. This is useful because sometimes you don’t want to wait until training has completed before testing the model on the test data.

Final Words

That’s it for this tutorial! Congratulations 👏! You are now able to implement a basic CNN model in PyTorch for image classification. You also learned how to use some of PyTorch’s basic building blocks. I encourage you to extend the CNN model further by adding more convolutional layers combined with max-pooling layers, though, as you saw, you don’t really need them here since the results already look good. If you are interested in implementing a similar image classification model using RNNs, see the references below.

References
