IMAGE CLASSIFICATION USING PYTORCH

TOLULOPE ADETULA
4 min read · Apr 4, 2019

[Image: PyTorch logo]

This is an experimental setup to build a code base for PyTorch. Its main aim is to experiment faster using transfer learning on all available pre-trained models. We will be using the Plant Seedlings Classification dataset for this blog post. It was hosted as a playground competition on Kaggle. More details here.

The following pre-trained models are available in PyTorch (via torchvision):

  • resnet18, resnet34, resnet50, resnet101, resnet152
  • squeezenet1_0, squeezenet1_1
  • alexnet
  • inception_v3
  • densenet121, densenet169, densenet201
  • vgg11, vgg13, vgg16, vgg19, vgg11_bn, vgg13_bn, vgg16_bn, vgg19_bn

The three cases in Transfer Learning and how to solve them using PyTorch

I have already discussed the intuition behind transfer learning in my previous blog post, so I will just list the three cases here.

  1. Freezing all the layers except the final one.
  2. Freezing the first few layers.
  3. Fine-tuning the entire network.

This is straightforward in PyTorch once you know how the models are structured and wrapped. The models listed above are all written differently: some wrap many layers inside Sequential containers, while others expose the layers directly. So it is important to check how each model is defined in PyTorch.

ResNet and Inception_V3

As mentioned before, there are several ResNets and we can use whichever we need. Since the ImageNet dataset has 1000 classes, we need to change the last layer as per our requirement. We can freeze whichever layers we don’t want to train and pass the remaining layers’ parameters to the optimizer (we will see this later).

import torch.nn as nn
import torchvision

## Load the chosen backbone with its ImageNet weights
if resnet:
    model_conv = torchvision.models.resnet50(pretrained=True)
if inception:
    model_conv = torchvision.models.inception_v3(pretrained=True)
## Change the last layer
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, n_class)
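
One inception_v3 caveat worth noting: torchvision’s implementation also carries an auxiliary classifier (AuxLogits) with its own fc layer, so if you keep aux_logits enabled during training, that head should be resized as well. A minimal sketch:

## inception_v3 also ships an auxiliary head; resize it too
## (alternatively, construct the model with aux_logits=False)
if inception:
    num_aux = model_conv.AuxLogits.fc.in_features
    model_conv.AuxLogits.fc = nn.Linear(num_aux, n_class)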

Let’s check what this model_conv contains. In PyTorch, a model has children (containers), and each child can itself have children (layers). Below is an example for resnet50:

for name, child in model_conv.named_children():
    for name2, params in child.named_parameters():
        print(name, name2)
## A long list of params is printed; some of them are shown below:
## conv1 weight
## bn1 weight
## bn1 bias
## ....
## fc weight
## fc bias

Now if we want to freeze a few layers before training, we can do so with the following code:

## Freezing all layers
for params in model_conv.parameters():
    params.requires_grad = False

## Freezing the first few layers. Here I am freezing the first 7 children
ct = 0
for name, child in model_conv.named_children():
    ct += 1
    if ct <= 7:
        for name2, params in child.named_parameters():
            params.requires_grad = False
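
As promised earlier, once the chosen layers are frozen, only the parameters that still have requires_grad=True should be handed to the optimizer. A minimal sketch, using SGD with momentum as in the training setup later in this post (the learning rate is a placeholder):

import torch.optim as optim

## Pass only the trainable (unfrozen) parameters to the optimizer.
## For case 3 (fine-tuning the entire network), skip the freezing step
## and pass model_conv.parameters() directly.
trainable_params = [p for p in model_conv.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable_params, lr=0.001, momentum=0.9)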

Changing the last layer to fit our new data is a bit tricky, and we need to carefully check how the underlying layers are represented. We have already seen this for ResNet and Inception_V3. Let’s check the other networks.

Squeeze-Net

There are two variants of SqueezeNet in PyTorch and we can use either of them. Unlike ResNet, which has an FC layer at the end, SqueezeNet’s final layer is wrapped inside a container (Sequential). So we first need to list all the children layers inside it, change the required layers according to our dataset, convert the list back into a container, and write it back to the class. This is explained in detail in the code below.

model_conv = torchvision.models.squeezenet1_1(pretrained=True)

for name, params in model_conv.named_children():
    print(name)
'''
features
classifier
'''
## How many in_channels are there for the conv layer
in_ftrs = model_conv.classifier[1].in_channels
## How many out_channels are there for the conv layer
out_ftrs = model_conv.classifier[1].out_channels
## Converting the sequential layer to a list of layers
features = list(model_conv.classifier.children())
## Changing the conv layer to the required dimension
## (SqueezeNet's classifier conv is 1x1)
features[1] = nn.Conv2d(in_ftrs, n_class, kernel_size=1, stride=1)
## Changing the pooling layer as per the architecture output
## (the kernel must match the final feature-map size for your input resolution)
features[3] = nn.AvgPool2d(12, stride=1)
## Making a container to list all the layers
model_conv.classifier = nn.Sequential(*features)
## Mentioning the number of output classes
model_conv.num_classes = n_class
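
A quick way to confirm the surgery worked is to push a dummy batch through the modified model and check the output shape. Here 224×224 is an assumed input size; if the pooling kernel above does not match your input resolution, the shape will come out wrong and the kernel needs adjusting:

import torch

## Sanity check: the output should be (batch_size, n_class)
dummy = torch.randn(2, 3, 224, 224)
model_conv.eval()
with torch.no_grad():
    out = model_conv(dummy)
print(out.shape)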

Dense-Net

It is very similar to ResNet, but the last layer is named classifier. Below is the code:

model_conv = torchvision.models.densenet121(pretrained=True)
num_ftrs = model_conv.classifier.in_features
model_conv.classifier = nn.Linear(num_ftrs, n_class)

VGG and Alex-Net

It is similar to SqueezeNet: the last FC layers are wrapped inside a container, so we need to read that container and change the last FC layer as per our dataset requirements.

model_conv = torchvision.models.vgg19(pretrained=True)

## Number of filters in the bottleneck layer
num_ftrs = model_conv.classifier[6].in_features
## Convert all the layers to a list and remove the last one
features = list(model_conv.classifier.children())[:-1]
## Add the last layer based on the number of classes in our dataset
features.extend([nn.Linear(num_ftrs, n_class)])
## Convert it back into a container and add it to our model class
model_conv.classifier = nn.Sequential(*features)

We have seen how to freeze the required layers and change the last layer for different networks. Now let’s train the network using one of them. I am not going to go into detail here, as the full code is already available in my Github repo.

Base code

Like any deep learning model, we first need to:

  • Define a network.
  • Load pre-trained weights if available.
  • Freeze the layers you don’t want to train (frozen layers act as a feature extractor).
  • Specify the loss function.
  • Choose the optimizer for training.
  • Train the network until your defined criteria are met.

Now let’s look at how this is done for inception_v3 in PyTorch. We will freeze the first few layers and train the network using an SGD optimizer with momentum and Cross-Entropy loss.
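
The full script is in the repo; the following is only a minimal sketch of that training loop. The data loader, learning rate, and epoch count (train_loader, num_epochs) are placeholder assumptions, and note that in training mode inception_v3 returns both the main and auxiliary outputs:

import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(
    [p for p in model_conv.parameters() if p.requires_grad],
    lr=0.001, momentum=0.9)

model_conv.train()
for epoch in range(num_epochs):          # num_epochs: placeholder
    for inputs, labels in train_loader:  # train_loader: your DataLoader (299x299 crops for inception)
        optimizer.zero_grad()
        outputs, aux_outputs = model_conv(inputs)  # two heads in train mode
        ## Weight the auxiliary loss at 0.4, as in the Inception paper
        loss = criterion(outputs, labels) + 0.4 * criterion(aux_outputs, labels)
        loss.backward()
        optimizer.step()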

Dataset Used

Metrics:

I went a bit further to check how these models perform in different settings. My idea was to train all the networks, see how the individual models perform, and later apply different ensembling methods to improve accuracy. I also wanted to see how diverse these models are. So here are the metrics on the train and validation datasets.
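
As a sketch of the simplest of those ensembling methods, averaging the predicted class probabilities of several trained models looks like this (the models list is assumed to be already trained and loaded):

import torch

## Probability-averaging ensemble across trained models (a sketch)
def ensemble_predict(models, inputs):
    probs = []
    with torch.no_grad():
        for m in models:
            m.eval()
            probs.append(torch.softmax(m(inputs), dim=1))
    ## Average the per-model probabilities, then take the argmax
    return torch.stack(probs).mean(dim=0).argmax(dim=1)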
