PyTorch Zero to Hero (Training a basic CNN model) ~ 4
Welcome back to the fourth installment of the series PyTorch Zero to Hero. Today we will go through PyTorch's tensor library and neural networks at a high level, and then train a small neural network to classify images.
If you want to take a look at previous blogs in this series, you can access them over here:
Now let's get started and create our very own small image classifier.
The flow of the code for building the classification pipeline will look something like this:
1. Load and preprocess data
2. Define a CNN model
3. Define a loss function
4. Train the network on the training data
5. Test the network on the test data
1. Load and preprocess data
* Import necessary libraries
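A minimal set of imports for this pipeline (a sketch; your environment may pull in more):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
```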
* Define transforms
Transforms are used for image transformations and can be chained together using Compose. All transformations accept a PIL Image, a Tensor Image of shape (C, H, W), or a batch of Tensor Images of shape (B, C, H, W), where C is the number of channels, H is the image height, W is the image width, and B is the number of images in the batch.
torchvision.transforms.ToTensor
Converts a PIL Image or numpy.ndarray in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
torchvision.transforms.Normalize(mean, std, inplace=False)
Normalizes a tensor image with mean and standard deviation. Here mean is a sequence of means for each channel and std is a sequence of standard deviations for each channel.
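Putting the two transforms together, a typical preprocessing pipeline looks like this (the 0.5 mean/std values are a common choice for CIFAR-10 demos, not the dataset's true statistics):

```python
transform = transforms.Compose([
    transforms.ToTensor(),                 # PIL Image [0, 255] -> FloatTensor [0.0, 1.0]
    transforms.Normalize((0.5, 0.5, 0.5),  # per-channel mean
                         (0.5, 0.5, 0.5)), # per-channel std
])
```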
* Download Data
Using torchvision.datasets, we download the CIFAR-10 dataset. It consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. We then load the dataset into torch.utils.data.DataLoader, which lets us iterate through it in mini-batches as needed. With shuffle=True, the data is reshuffled at the start of every epoch.
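A sketch of the download-and-load step. The batch_size=4 here is an assumption, chosen to match the roughly 12,500 mini-batches per epoch (50,000 images / 4) seen in the training log later:

```python
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)
```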
* Define classes
* Set the device
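Both of these steps take only a few lines. The class names below follow the order of CIFAR-10's labels, and the device line is the usual pattern for picking a GPU when one is available:

```python
classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```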
2. Define a Convolutional Neural Network
torch.nn
is a module in PyTorch that provides tools to create and train neural networks. It includes predefined layers, loss functions, and utilities to build complex neural network architectures easily. This module simplifies the process of defining neural network components and connecting them together.
Conv2d
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
MaxPool2d
torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
Linear
torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
Applies a linear transformation to the incoming data: y = xA^T + b.
ReLU
torch.nn.ReLU(inplace=False)
Applies the rectified linear unit function element-wise: ReLU(x) = max(0, x).
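Putting these building blocks together, here is a network whose layer shapes line up with the state_dict printed later in this post: two convolutions, max pooling, and three fully connected layers.

```python
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)        # 3 input channels -> 6 feature maps, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)         # halves the spatial dimensions
        self.conv2 = nn.Conv2d(6, 16, 5)       # 6 -> 16 feature maps
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 16 maps of 5x5 remain after two conv+pool stages
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)           # 10 output classes

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # 3x32x32 -> 6x14x14
        x = self.pool(F.relu(self.conv2(x)))   # 6x14x14 -> 16x5x5
        x = torch.flatten(x, 1)                # flatten all dims except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)                        # raw logits; CrossEntropyLoss applies softmax
        return x

net = Net().to(device)
```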
3. Define a Loss function and Optimizer
CrossEntropyLoss
torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean', label_smoothing=0.0)
Computes the cross-entropy loss between input logits and targets.
SGD
torch.optim.SGD(params, lr=0.001, momentum=0, dampening=0, weight_decay=0, nesterov=False, *, maximize=False, foreach=None, differentiable=False, fused=None)
Implements stochastic gradient descent (optionally with momentum).
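Wiring the two together is a one-liner each; lr=0.001 with no momentum matches the optimizer's param_groups printed later in this post:

```python
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001)
```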
4. Train the network
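A minimal training loop consistent with the log below: two epochs over the training set, printing the running loss every 2,000 mini-batches.

```python
for epoch in range(2):                          # loop over the dataset twice
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(trainloader):
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()                   # reset accumulated gradients
        outputs = net(inputs)                   # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                         # backpropagate
        optimizer.step()                        # update the weights

        running_loss += loss.item()
        if (i + 1) % 2000 == 0:                 # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

print('Finished Training')
```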
[1, 2000] loss: 2.303
[1, 4000] loss: 2.300
[1, 6000] loss: 2.298
[1, 8000] loss: 2.294
[1, 10000] loss: 2.281
[1, 12000] loss: 2.226
[2, 2000] loss: 2.091
[2, 4000] loss: 2.035
[2, 6000] loss: 1.951
[2, 8000] loss: 1.894
[2, 10000] loss: 1.834
[2, 12000] loss: 1.808
Finished Training
Save the model
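A sketch of how the listings below can be produced and the weights saved; the PATH filename is our own choice here:

```python
PATH = './cifar_net.pth'

# Inspect the learnable parameters of the model
print("Model's state_dict:")
for name, tensor in net.state_dict().items():
    print(name, tensor.size())

# Inspect the optimizer's internal state and hyperparameters
print("Optimizer's state_dict:")
for key, value in optimizer.state_dict().items():
    print(key, value)

torch.save(net.state_dict(), PATH)  # save only the learned parameters
```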
Model's state_dict:
conv1.weight torch.Size([6, 3, 5, 5])
conv1.bias torch.Size([6])
conv2.weight torch.Size([16, 6, 5, 5])
conv2.bias torch.Size([16])
fc1.weight torch.Size([120, 400])
fc1.bias torch.Size([120])
fc2.weight torch.Size([84, 120])
fc2.bias torch.Size([84])
fc3.weight torch.Size([10, 84])
fc3.bias torch.Size([10])
Optimizer's state_dict:
state {0: {'momentum_buffer': None}, 1: {'momentum_buffer': None}, 2: {'momentum_buffer': None}, 3: {'momentum_buffer': None}, 4: {'momentum_buffer': None}, 5: {'momentum_buffer': None}, 6: {'momentum_buffer': None}, 7: {'momentum_buffer': None}, 8: {'momentum_buffer': None}, 9: {'momentum_buffer': None}}
param_groups [{'lr': 0.001, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'maximize': False, 'foreach': None, 'differentiable': False, 'params': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}]
A state_dict is a Python dictionary that maps each layer to its parameter tensors. It includes entries only for layers with learnable parameters (such as convolutional and linear layers) and registered buffers (like the running_mean in batch normalization).
Load the model
Once the model is saved, we don't need to train it every time; we can just load the saved weights and proceed.
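A minimal load step; load_state_dict returns a report of whether every saved key matched a parameter in the model, which is exactly the message below:

```python
net = Net().to(device)
# map_location keeps this working whether or not a GPU is present
print(net.load_state_dict(torch.load(PATH, map_location=device)))
```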
<All keys matched successfully>
5. Test the network
Calculate predictions for each class
no_grad
It disables gradient calculation, which reduces memory consumption; this is useful for inference, where we never call backward().
torch.max
Returns the maximum value of all elements in the input tensor. When called with a dim argument, it also returns the indices of the maxima, which is how we obtain the predicted class for each image.
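Putting no_grad and torch.max to work, here is a sketch of the per-class evaluation loop (it assumes the testloader, classes, and device defined earlier):

```python
correct_pred = {classname: 0 for classname in classes}
total_pred = {classname: 0 for classname in classes}

with torch.no_grad():                           # no gradients needed for inference
    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)    # index of the highest logit per image
        for label, prediction in zip(labels, predicted):
            if label == prediction:
                correct_pred[classes[label]] += 1
            total_pred[classes[label]] += 1

for classname, correct_count in correct_pred.items():
    accuracy = 100 * correct_count / total_pred[classname]
    print(f'Accuracy for class {classname:5s}: {accuracy:.1f} %')
```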
Visualize results
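One simple way to eyeball the results is to display a grid of test images alongside the network's predictions. This sketch assumes matplotlib is installed; the imshow helper just undoes our normalization before plotting:

```python
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5                          # undo Normalize((0.5,...), (0.5,...))
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))   # (C, H, W) -> (H, W, C)
    plt.show()

images, labels = next(iter(testloader))          # grab one batch of test images
imshow(torchvision.utils.make_grid(images))

outputs = net(images.to(device))
_, predicted = torch.max(outputs, 1)
print('Predicted:', ' '.join(classes[predicted[j]] for j in range(len(images))))
```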
Tadahhh…!!!! Done with training the basic CNN. The idea behind this blog is to get an understanding of PyTorch's tensor library and the basic training workflow. Yeah, the results don't look very impressive, but we didn't even try to build a perfect model. Moving forward, we will cover more state-of-the-art architectures for classification models and explore them.
Thanks for your patience; we'll meet again soon in another blog.