Just transfer it! — An intro to Transfer Learning.

Eshika Shah
Published in IET-VIT

Sep 1, 2020 · 5 min read


Schematic diagram of Transfer Learning.

Humans have a unique ability to learn as they go about their day-to-day tasks. They tend to build up knowledge from experience and reuse it when performing a different set of tasks. Well, did you know computers can do that too? Come along, I'll show you how this trick, which goes by the name of transfer learning, works.

Let’s get started with transfer learning!

To understand it better, let's ask ourselves some questions:

What is Transfer learning?

· Making use of the knowledge gained while solving one problem and applying it to a different but related problem.

· For example, knowledge gained while learning to recognize cars can be used to some extent to recognize trucks.

· Using pre-trained models and changing some layers to classify new objects.

Why do we use transfer learning?

Training models from scratch can take weeks, even on multiple GPUs, so why not save ourselves some time when we have tools like transfer learning? There is a reason we are living in the 21st century, right? In most cases it also gives better performance and needs far less data.

How to use it?

In order to answer this question, we need to understand the four realms of transfer learning.

These are:

Case 1: New data set is small, new data is similar to original training data.

Case 2: New data set is small, new data is different from original training data.

Case 3: New data set is large, new data is similar to original training data.

Case 4: New data set is large, new data is different from original training data.

Here’s what we do:

Case 1: We slice off the end layers of the neural net, add our own fully-connected layers and just train the added layers.

Case 2: We keep only the early layers of the net (slicing off the later, more task-specific ones), add a few fully connected layers at the end, and train only those.

Case 3: We follow the same steps as in Case 1, but since there is plenty of data we can also re-train (fine-tune) the entire network, starting from the pre-trained weights, for better learning.

Case 4: Usually in this case we restart training from scratch, with randomly initialized weights.
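As a rough preview of what these cases look like in PyTorch, here is a minimal sketch (using ResNet-18 purely for illustration; num_classes is a placeholder for your own number of classes):

import torch.nn as nn
from torchvision import models

num_classes = 5  # hypothetical number of target classes

# Cases 1 & 2: freeze the pre-trained weights and train only a new final layer
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False                           # frozen: no gradients here
model.fc = nn.Linear(model.fc.in_features, num_classes)   # new head is trainable by default

# Cases 3 & 4: leave everything trainable and fine-tune the whole network
# (Case 4 would typically start from randomly initialized weights instead)
model_ft = models.resnet18(pretrained=True)
model_ft.fc = nn.Linear(model_ft.fc.in_features, num_classes)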

So now we know enough to start writing the code for our first ever transfer learning model with the help of PyTorch!

Importing the basic libraries

So, let's import all the required libraries which we will be using in our journey.

import os
import numpy as np
import torch
import torchvision
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
%matplotlib inline

Next, we set up the data directories. Here I have my data stored in flower_photos/ with the train and test data in the train/ and test/ directories respectively.

data_dir = 'flower_photos/'
train_dir = os.path.join(data_dir, 'train/')
test_dir = os.path.join(data_dir, 'test/')

My dataset consists of 5 classes so let’s define them.

classes = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

Now, VGG-16 takes 224x224 input images, so we resize (crop) them all.

data_transform = transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.ToTensor()])
train_data = datasets.ImageFolder(train_dir, transform=data_transform)
test_data = datasets.ImageFolder(test_dir, transform=data_transform)
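A side note: RandomResizedCrop is really a training-time augmentation, and torchvision's pre-trained VGG-16 normally expects inputs normalized with the ImageNet mean and standard deviation. A hedged variant of the transforms (not part of the original code above, but a common refinement) could look like this:

# assumption: ImageNet normalization statistics, standard for torchvision pre-trained models
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
train_transform = transforms.Compose([transforms.RandomResizedCrop(224),
                                      transforms.ToTensor(),
                                      normalize])
test_transform = transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(224),
                                     transforms.ToTensor(),
                                     normalize])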

Now we create data loaders with a batch size of 20 and train the model batch-wise.

# how many samples per batch to load, and how many subprocesses to use for loading
batch_size = 20
num_workers = 0

train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           num_workers=num_workers, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size,
                                          num_workers=num_workers, shuffle=True)

Let’s take a look at the dataset
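The original post showed a grid of sample images at this point; here is a minimal sketch to reproduce it, using the train_loader and classes defined above:

# grab one batch of training images and display them in a grid with their labels
dataiter = iter(train_loader)
images, labels = next(dataiter)
images = images.numpy()  # convert images to numpy for display

fig = plt.figure(figsize=(25, 4))
for idx in range(min(20, len(images))):
    ax = fig.add_subplot(2, 10, idx + 1, xticks=[], yticks=[])
    plt.imshow(np.transpose(images[idx], (1, 2, 0)))  # CHW -> HWC
    ax.set_title(classes[labels[idx]])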

So, let's load our pretrained model. Here we will be using VGG-16, which has 13 convolutional layers and 3 fully connected layers.

vgg16 = models.vgg16(pretrained=True)
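It helps to print the classifier to see which layer we are about to replace; in torchvision's VGG-16, index 6 of vgg16.classifier is the final Linear layer mapping 4,096 features to the original 1,000 ImageNet classes (the exact printout may vary slightly between torchvision versions):

print(vgg16.classifier)
# Sequential(
#   (0): Linear(in_features=25088, out_features=4096, bias=True)
#   (1): ReLU(inplace=True)
#   (2): Dropout(p=0.5)
#   (3): Linear(in_features=4096, out_features=4096, bias=True)
#   (4): ReLU(inplace=True)
#   (5): Dropout(p=0.5)
#   (6): Linear(in_features=4096, out_features=1000, bias=True)
# )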

Since our dataset is small (about 4,000 pictures), we will only replace the last layer with a new fully connected layer and train just that new layer. To do this, we need to freeze all the other weights in the net.

# freeze the convolutional (feature-extraction) weights; the classifier stays trainable
for param in vgg16.features.parameters():
    param.requires_grad = False

And now let’s build a new layer in order to replace the last one.

n_inputs = vgg16.classifier[6].in_features
last_layer = nn.Linear(n_inputs, len(classes))
vgg16.classifier[6] = last_layer

Now, for the criterion (loss function) and optimizer, we will use cross-entropy loss and stochastic gradient descent with a small learning rate, respectively.

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(vgg16.classifier.parameters(), lr=0.001)

So, ready for the climax?

We are done creating the model; now it's time to train it. Here I am training it for 2 epochs, but you can train it for more.

# number of epochs to train the model
n_epochs = 2

# check if CUDA is available and move the model to the GPU if so
train_on_gpu = torch.cuda.is_available()
if train_on_gpu:
    vgg16.cuda()

for epoch in range(1, n_epochs + 1):
    # keep track of training loss
    train_loss = 0.0
    for batch_i, (data, target) in enumerate(train_loader):
        # move tensors to GPU if CUDA is available
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = vgg16(data)
        # calculate the batch loss
        loss = criterion(output, target)
        # backward pass: compute gradient of the loss with respect to model parameters
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()
        # update training loss
        train_loss += loss.item()

        if batch_i % 20 == 19:  # print training loss every 20 mini-batches
            print('Epoch %d, Batch %d loss: %.16f' %
                  (epoch, batch_i + 1, train_loss / 20))
            train_loss = 0.0

You can test the model for its accuracy; the complete code is provided in my GitHub repository: https://github.com/EshikaShah/Transfer-Learning
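As a rough sketch (assuming the test_loader and train_on_gpu defined earlier), the overall test accuracy can be computed like this:

vgg16.eval()  # switch dropout to evaluation mode
correct, total = 0, 0
with torch.no_grad():
    for data, target in test_loader:
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        output = vgg16(data)
        _, pred = torch.max(output, 1)  # class with the highest score
        correct += (pred == target).sum().item()
        total += target.size(0)
print('Test Accuracy: %.2f%%' % (100 * correct / total))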

Let’s visualize the results!
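The original article displayed a figure of sample test images with their predicted labels here; a sketch of how such a figure can be produced (predicted label first, true label in brackets, green for correct and red for wrong) is:

# display one batch of test images with predicted (true) labels
dataiter = iter(test_loader)
images, labels = next(dataiter)
if train_on_gpu:
    images = images.cuda()

vgg16.eval()
with torch.no_grad():
    output = vgg16(images)
_, preds = torch.max(output, 1)
preds = preds.cpu()
images = images.cpu().numpy()

fig = plt.figure(figsize=(25, 4))
for idx in range(min(20, len(images))):
    ax = fig.add_subplot(2, 10, idx + 1, xticks=[], yticks=[])
    plt.imshow(np.transpose(images[idx], (1, 2, 0)))
    ax.set_title("{} ({})".format(classes[preds[idx]], classes[labels[idx]]),
                 color=("green" if preds[idx] == labels[idx] else "red"))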

We can see that the results are pretty good.

Conclusion

Congratulations, I have successfully transferred my knowledge to y'all. We can improve the accuracy further by increasing the number of epochs. Now go ahead and apply this knowledge to make some amazing models!

Transfer Learning is the new frontier for ML

~ Written by Eshika Shah
