PyTorch Zero to Hero (Training a basic CNN model) ~ 4

Abhishek Selokar
Jun 22, 2024


Welcome back to the fourth installment of the series PyTorch Zero to Hero. Today we will go through PyTorch’s Tensor library and neural networks at a high level, and then train a small neural network to classify images.

If you want to take a look at previous blogs in this series, you can access them over here:

Now let’s get started and create our very own small image classifier.

The flow of the code implementation for the classification pipeline will be something like this:

1. Load and preprocess data

2. Define a CNN model

3. Define a loss function

4. Train the network on the training data

5. Test the network on the test data

1. Load and preprocess data

* Import necessary libraries
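The import cell isn’t embedded here, but a minimal set of imports covering everything used below would look like this:

import torch                                  # core tensor library
import torch.nn as nn                         # layers and loss functions
import torch.nn.functional as F               # functional ops like relu
import torch.optim as optim                   # optimizers such as SGD
import torchvision                            # datasets and vision utilities
import torchvision.transforms as transforms   # image transforms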

* Define transforms

Transforms are used for image transformations and can be chained together using Compose. All transformations accept a PIL Image, a Tensor Image of shape (C, H, W), or a batch of Tensor Images of shape (B, C, H, W), where C is the number of channels, H is the image height, W is the image width, and B is the number of images in the batch.

torchvision.transforms.ToTensor : Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].

torchvision.transforms.Normalize(mean, std, inplace=False) : Normalizes a tensor image with mean and standard deviation. Here mean is a sequence of means for each channel and std is a sequence of standard deviations for each channel.
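As a sketch, the chained transform for CIFAR-10 could be defined like this (normalizing every channel with mean 0.5 and std 0.5 is a common convention for this dataset, not the only valid choice):

# ToTensor scales pixels to [0.0, 1.0]; Normalize then shifts each
# channel to roughly [-1.0, 1.0] via (x - 0.5) / 0.5.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])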

* Download Data

Using torchvision.datasets, we download the CIFAR10 dataset. The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class: 50000 training images and 10000 test images.


We have successfully loaded the dataset into torch.utils.data.DataLoader, allowing us to iterate through it in batches as necessary. With shuffle=True, the data is reshuffled after each complete pass through all the batches, i.e., at the start of every epoch.
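A minimal version of that download-and-load step might look like the following (batch_size=4 is an assumption, but it is consistent with the ~12,500 mini-batches per epoch visible in the training log further down):

batch_size = 4

# Training split: 50000 images, reshuffled every epoch.
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

# Test split: 10000 images, order kept fixed.
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)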

* Define classes

* Set the device
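These two steps are short: a tuple with the ten CIFAR-10 class names (the shorthand labels below are the usual tutorial convention), and a device that falls back to the CPU when no GPU is present:

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

# Train on the GPU if one is available, otherwise on the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')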

2. Define a Convolutional Neural Network

torch.nn is a module in PyTorch that provides tools to create and train neural networks. It includes predefined layers, loss functions, and utilities to build complex neural network architectures easily. This module simplifies the process of defining neural network components and connecting them together.

Conv2d

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)

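A quick shape check on a CIFAR-sized input: with a 5x5 kernel, stride 1, and no padding, each spatial dimension shrinks by 4 (32 - 5 + 1 = 28):

conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
x = torch.randn(1, 3, 32, 32)   # one fake RGB image, batch size 1
print(conv(x).shape)            # torch.Size([1, 6, 28, 28])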

MaxPool2d

torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)


You can visualize it over here.
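For example, a 2x2 max pool with stride 2 halves both spatial dimensions:

pool = nn.MaxPool2d(kernel_size=2, stride=2)
x = torch.randn(1, 6, 28, 28)
print(pool(x).shape)            # torch.Size([1, 6, 14, 14])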

Linear

torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)

Applies a linear transformation to the incoming data: y = xAᵀ + b.

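For instance, mapping the 400 flattened features our network will produce (16 channels x 5 x 5) down to 120 outputs:

fc = nn.Linear(in_features=400, out_features=120)
x = torch.randn(1, 400)         # a flattened feature vector
print(fc(x).shape)              # torch.Size([1, 120])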

ReLU

torch.nn.ReLU(inplace=False)

Applies the rectified linear unit function element-wise: ReLU(x) = max(0, x).

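Putting the four building blocks together: the sketch below is consistent with the state_dict shapes printed later in this post (conv1 is 6x3x5x5, fc1 is 120x400, and so on):

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)        # 3x32x32 -> 6x28x28
        self.pool = nn.MaxPool2d(2, 2)         # halves H and W
        self.conv2 = nn.Conv2d(6, 16, 5)       # 6x14x14 -> 16x10x10
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 16x5x5 = 400 after pooling
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)           # 10 CIFAR-10 classes

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)                # flatten all dims except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                     # raw logits for CrossEntropyLoss

net = Net().to(device)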

3. Define a Loss function and Optimizer

CrossEntropyLoss

torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean', label_smoothing=0.0)

Computes the cross entropy loss between input logits and target

SGD

torch.optim.SGD(params, lr=0.001, momentum=0, dampening=0, weight_decay=0, nesterov=False, *, maximize=False, foreach=None, differentiable=False, fused=None)

Implements stochastic gradient descent (optionally with momentum).
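Wiring the two together is one line each; lr=0.001 with momentum left at its default of 0 matches the optimizer state_dict printed later in this post:

criterion = nn.CrossEntropyLoss()                  # expects raw logits + class indices
optimizer = optim.SGD(net.parameters(), lr=0.001)  # momentum defaults to 0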

4. Train the network
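A training loop shaped like the log below (2 epochs, running loss averaged and printed every 2000 mini-batches) could look like this sketch:

for epoch in range(2):                  # loop over the dataset twice
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(trainloader):
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()           # clear gradients from the last step
        outputs = net(inputs)           # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                 # backpropagate
        optimizer.step()                # update the weights

        running_loss += loss.item()
        if (i + 1) % 2000 == 0:         # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

print('Finished Training')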

[1,  2000] loss: 2.303
[1, 4000] loss: 2.300
[1, 6000] loss: 2.298
[1, 8000] loss: 2.294
[1, 10000] loss: 2.281
[1, 12000] loss: 2.226
[2, 2000] loss: 2.091
[2, 4000] loss: 2.035
[2, 6000] loss: 1.951
[2, 8000] loss: 1.894
[2, 10000] loss: 1.834
[2, 12000] loss: 1.808
Finished Training

Save the model
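The printout below can be produced, and the weights persisted, like this (the file name is a placeholder of my choosing):

PATH = './cifar_net.pth'    # hypothetical save location

print("Model's state_dict:")
for name, param in net.state_dict().items():
    print(name, param.size())

print("Optimizer's state_dict:")
for key, value in optimizer.state_dict().items():
    print(key, value)

torch.save(net.state_dict(), PATH)   # save only the learned parameters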

Model's state_dict:
conv1.weight torch.Size([6, 3, 5, 5])
conv1.bias torch.Size([6])
conv2.weight torch.Size([16, 6, 5, 5])
conv2.bias torch.Size([16])
fc1.weight torch.Size([120, 400])
fc1.bias torch.Size([120])
fc2.weight torch.Size([84, 120])
fc2.bias torch.Size([84])
fc3.weight torch.Size([10, 84])
fc3.bias torch.Size([10])

Optimizer's state_dict:
state {0: {'momentum_buffer': None}, 1: {'momentum_buffer': None}, 2: {'momentum_buffer': None}, 3: {'momentum_buffer': None}, 4: {'momentum_buffer': None}, 5: {'momentum_buffer': None}, 6: {'momentum_buffer': None}, 7: {'momentum_buffer': None}, 8: {'momentum_buffer': None}, 9: {'momentum_buffer': None}}
param_groups [{'lr': 0.001, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'maximize': False, 'foreach': None, 'differentiable': False, 'params': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}]

A state_dict is a Python dictionary that maps each layer to its parameter tensors. It includes entries only for layers with learnable parameters (such as convolutional and linear layers) and registered buffers (like the running_mean in batch normalization).

Load the model

Once you save the model, you don’t need to train it every time; just load the saved weights and proceed.
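A sketch of loading the weights back into a fresh instance of the same architecture:

net = Net()
net.load_state_dict(torch.load(PATH))  # echoes <All keys matched successfully> in a notebook
net.to(device)
net.eval()                             # inference mode for layers like dropout/batchnorm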

<All keys matched successfully>

5. Test the network

Calculate predictions for each class

no_grad

It disables gradient calculation, which reduces memory consumption for computations where backward() will never be called, making it useful for inference.

torch.max

torch.max(input) returns the maximum value of all elements in the input tensor. With a dimension argument, torch.max(input, dim) returns both the maximum values and their indices along that dimension; the indices are what we use as the predicted classes.

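Combining torch.no_grad and torch.max as described above, a per-class accuracy pass over the test set might look like this (the variable names are mine):

correct = {c: 0 for c in classes}
total = {c: 0 for c in classes}

with torch.no_grad():                         # no gradients needed for evaluation
    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)  # index of the largest logit
        for label, pred in zip(labels, predicted):
            total[classes[label]] += 1
            if label == pred:
                correct[classes[label]] += 1

for c in classes:
    print(f'Accuracy for class {c:5s}: {100 * correct[c] / total[c]:.1f} %')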

Visualize results
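One common way to eyeball a few predictions, assuming matplotlib is available and undoing the 0.5/0.5 normalization before plotting (a sketch, not necessarily what this post’s embedded code used):

import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5                               # unnormalize back to [0, 1]
    plt.imshow(np.transpose(img.numpy(), (1, 2, 0)))  # CHW -> HWC for matplotlib
    plt.show()

images, labels = next(iter(testloader))               # one test batch
imshow(torchvision.utils.make_grid(images))
print('GroundTruth:', ' '.join(classes[l] for l in labels))

outputs = net(images.to(device))
_, predicted = torch.max(outputs, 1)
print('Predicted:  ', ' '.join(classes[p] for p in predicted))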

Tadahhh…!!!! Done with the training of the basic CNN. The idea behind this blog is to get an understanding of PyTorch’s Tensor library and the end-to-end training workflow. Yeah, the results don’t look very impressive, but we didn’t even try to build a perfect model. Moving forward, we will cover more state-of-the-art architectures for classification models and explore them.

Thanks for your patience; we’ll meet again soon in another blog.
