Pipeline for Every PyTorch Image Classification Problem: Training the Model

→ Guide to Training PyTorch Image Classification Models

siromer
5 min read · Apr 6, 2024
  • In the first part of this series (link), I discussed how to process image data and convert it into the format that PyTorch expects.
    In this part, I will train a custom image classification model.

I am not going to talk about how to obtain, process, and analyze the image data here. If you are interested in those steps, I strongly recommend reading the first part of this pipeline (link).

For every image classification task, I follow these six main steps; in this article, I am going to discuss the last three.

  1. Creating the dataset (first part)
  2. Visualizing example images (first part)
  3. Visualizing the class distribution (first part)
  4. Creating functions for training the model (this article)
  5. Creating the model (this article)
  6. Training the model (this article)

Important Note: train_set and validation_set are data loaders that wrap the iterable datasets I created in the previous article. I will use these two data loaders for training.
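For reference, here is a minimal sketch of how such loaders can be built with torchvision. This is only an illustration: the paths, transforms, and batch size below are placeholder assumptions, not the exact values from the first article.

import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# NOTE: placeholder paths and values for illustration; see the first article for the real setup
transform = transforms.Compose([
    transforms.Resize((180, 180)),  # the model in this article expects 180x180 inputs
    transforms.ToTensor(),
])

train_data = datasets.ImageFolder("data/train", transform=transform)            # hypothetical path
validation_data = datasets.ImageFolder("data/validation", transform=transform)  # hypothetical path

# DataLoaders wrap the datasets and yield (X, y) batches when iterated
train_set = DataLoader(train_data, batch_size=32, shuffle=True)
validation_set = DataLoader(validation_data, batch_size=32, shuffle=False)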

4. Create Functions for Training Model

To train a model with PyTorch, you need to write your own training loop; PyTorch does not provide a function similar to ".fit()" in TensorFlow. I am going to write two functions for training the model: one for training and one for validation.

In these functions :

  • the loss is calculated as the average loss per batch.
  • the accuracy is calculated as the percentage of correct predictions out of the total number of samples (see the toy example below).
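Here is a standalone toy example (not part of the training code) showing how both quantities come out of raw model outputs; the numbers are made up for illustration.

import torch
import torch.nn as nn

pred = torch.tensor([[2.0, 0.5, 0.1],   # argmax -> class 0
                     [0.1, 3.0, 0.2],   # argmax -> class 1
                     [1.5, 0.3, 0.9]])  # argmax -> class 0
y = torch.tensor([0, 1, 2])             # true labels; only the third prediction is wrong

loss = nn.CrossEntropyLoss()(pred, y)                           # average loss over the batch
correct = (pred.argmax(1) == y).type(torch.float).sum().item()  # 2.0
print(f"loss: {loss.item():.4f}, accuracy: {100 * correct / len(y):.2f}%")  # accuracy: 66.67%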
# I am going to append accuracies to these lists and use them outside of these functions
train_accuracies = []
validation_accuracies = []

# Function for training
def train(dataloader, model, loss_fn, optimizer, epoch):

    size = len(dataloader.dataset)  # total number of images inside the loader
    num_batches = len(dataloader)   # number of batches

    model.train()  # set the model to training mode (enables dropout, etc.)

    train_loss, correct = 0, 0

    for batch, (X, y) in enumerate(dataloader):
        # move X and y to the GPU for faster training
        X, y = X.to(device), y.to(device)

        # make predictions
        pred = model(X)
        # calculate the loss
        loss = loss_fn(pred, y)

        # Backpropagation
        loss.backward()        # compute the parameter gradients
        optimizer.step()       # update the parameters
        optimizer.zero_grad()  # reset the gradients of all parameters

        # Update the training loss
        train_loss += loss.item()  # item() extracts the loss value as a Python float

        # Count correct predictions for training accuracy
        correct += (pred.argmax(1) == y).type(torch.float).sum().item()

    # loss and accuracy
    train_loss = train_loss / num_batches
    accuracy = 100 * correct / size

    # use this accuracy list for plotting accuracy with matplotlib
    train_accuracies.append(accuracy)

    # Print training accuracy and loss at the end of the epoch
    print(f" Training Accuracy: {accuracy:.2f}%, Training Loss: {train_loss:.4f}")
# function for validation
def validation(dataloader, model, loss_fn, t):

    size = len(dataloader.dataset)  # total number of images inside the loader
    num_batches = len(dataloader)   # number of batches

    validation_loss, correct = 0, 0

    # set the model to evaluation mode; this disables the dropout layer
    model.eval()

    with torch.no_grad():  # disable gradient calculation
        for X, y in dataloader:

            # move X and y to the GPU
            X, y = X.to(device), y.to(device)
            pred = model(X)  # make predictions
            validation_loss += loss_fn(pred, y).item()

            # if a prediction is correct, add 1 to the correct variable
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()

    # loss and accuracy
    validation_loss /= num_batches
    accuracy = 100 * correct / size

    validation_accuracies.append(accuracy)

    # Print validation accuracy and loss at the end of the epoch
    print(f" Validation Accuracy: {accuracy:.2f}%, Validation Loss: {validation_loss:.4f}")

I will use these two functions when training the model.

5. Create Model

  • Below, I explain what each layer does, the output dimensions, and how the model works. I strongly recommend reading the comment blocks.
import torch

# if a GPU is available, use it for training
device = "cuda" if torch.cuda.is_available() else "cpu"
device
import torch
import torch.nn as nn


class SimpleCNN(nn.Module):
    def __init__(self, num_classes=9):
        super(SimpleCNN, self).__init__()

        # input image size is --> (3, 180, 180)

        # convolutional layer with 32 filters; the input dimension is 3 because the image has 3 channels
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        # activation function; it introduces non-linearity to the model, which helps it learn complex functions
        self.act1 = nn.ReLU()
        # max pooling halves the spatial dimensions --> (90, 90)
        self.pool1 = nn.MaxPool2d(2)

        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.act2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(2)  # --> (45, 45)

        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.act3 = nn.ReLU()
        self.pool3 = nn.MaxPool2d(2)  # --> (22, 22)

        # first flatten the channels, then feed them into the fully connected layer.
        # Given the shape (128, 22, 22), flattening results in 128 * 22 * 22 features.
        self.fc1 = nn.Linear(128 * 22 * 22, 256)
        self.act4 = nn.ReLU()

        # dropout randomly drops neurons; here 20% of neurons are dropped.
        # It helps to prevent overfitting.
        self.dropout = nn.Dropout(p=0.2)

        # The nn.Linear layer with input size 256 and output size num_classes is the output layer.
        # Since we have 9 classes, the raw outputs (logits) of this layer are converted into
        # probabilities by a softmax activation, representing the likelihood of each class.
        # (nn.CrossEntropyLoss applies softmax internally, so you don't need to add it here.)
        # These probabilities are then used to calculate the error during training.
        self.fc2 = nn.Linear(256, num_classes)


    def forward(self, x):

        # pass the input through each layer in turn and return the output at the end
        out = self.pool1(self.act1(self.conv1(x)))
        out = self.pool2(self.act2(self.conv2(out)))
        out = self.pool3(self.act3(self.conv3(out)))

        out = out.view(out.size(0), -1)  # flatten to (batch_size, 128 * 22 * 22)

        out = self.act4(self.fc1(out))
        out = self.dropout(out)
        out = self.fc2(out)

        return out

# create the model
model = SimpleCNN()

model.to(device)
(Output: the model architecture printed by model.to(device).)
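As a quick sanity check of the 128 * 22 * 22 flatten size (this check is my addition, not part of the original pipeline), you can push a dummy batch through the model:

# a fake batch of one 180x180 RGB image; shape is (batch, channels, height, width)
dummy = torch.zeros(1, 3, 180, 180).to(device)
print(model(dummy).shape)  # torch.Size([1, 9]) -> one logit per class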

6. Train Model

  • Training may take a long time depending on your GPU.
# Loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# epoch number
epochs = 32

# loop for training the model
for t in range(epochs):
    print(f"Epoch {t+1}")
    train(train_set, model, loss_fn, optimizer, t)
    validation(validation_set, model, loss_fn, t)
    print("----------------------------")
print("Done!")

Visualization of Accuracy

import matplotlib.pyplot as plt

def visualize(train_accuracies, validation_accuracies):
    epoch_number = len(train_accuracies)

    plt.plot(range(1, epoch_number + 1), train_accuracies, 'r', label='Training accuracy')
    plt.plot(range(1, epoch_number + 1), validation_accuracies, 'b', label='Validation accuracy')
    plt.legend()
    plt.xlabel("Epoch Number")
    plt.ylabel("Accuracy (%)")
    plt.grid()
    plt.show()

# Remember, these two parameters are the lists I created above and appended values to every epoch
visualize(train_accuracies, validation_accuracies)

The result looks good: 86% accuracy without overfitting.
