Transfer Learning with Keras
In this blog we will learn what Transfer Learning is, and when and how to use it.
Transfer Learning is a research problem in deep learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem.
Idea: Instead of coding a neural network from scratch to solve our problem, we can reuse an existing model such as VGG16.
Why Transfer Learning?
Deep learning models often require very large datasets to train; when we do not have enough data, the model over-fits.
The more layers we train, the higher the chance of overfitting.
To prevent this we would need a large dataset. A pre-trained model is a saved network that was previously trained on a large dataset, typically on a large-scale image-classification task. You can either use the pre-trained model as it is, or use transfer learning to customise it for a given task.
ImageNet dataset:
The ILSVRC is an annual computer vision competition built on a subset of a publicly available dataset called ImageNet; the tasks, and often the challenge itself, are referred to as the ImageNet Competition.
ImageNet is a very large collection of human-annotated photographs, designed by academics for developing computer vision algorithms: it contains over 14 million images across more than 20 thousand categories.
When and how to apply Transfer Learning?
There are several cases where we can apply transfer learning. We will analyse them one by one.
Case 1: Our dataset is very small but similar to the ImageNet dataset.
Freeze all the layers, add some custom dense layers on top, and train the model.
Case 2: Our dataset is large and similar to the ImageNet dataset.
Fine-tune the complete network with a small learning rate.
Case 3: Our dataset is medium-sized and similar to the ImageNet dataset.
Freeze the top layers and fine-tune the last layers.
Case 4: Our dataset is large and not similar to the ImageNet dataset.
Initialise the network's weights from the pre-trained model and retrain the whole model.
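The four cases above boil down to which layers we mark as trainable in Keras. A minimal sketch (using `weights=None` here to skip the ImageNet download; in practice you would pass `weights="imagenet"`):

```python
# Sketch of the four regimes, using a Keras VGG16 base.
# weights=None skips the ~58 MB ImageNet download for this illustration;
# in practice you would pass weights="imagenet".
from tensorflow.keras.applications import VGG16

base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))

# Case 1 (small, similar dataset): freeze the whole base; only the new
# dense layers added on top will train.
base.trainable = False

# Case 2 (large, similar dataset): fine-tune everything with a small
# learning rate, e.g. Adam(learning_rate=1e-5).
base.trainable = True

# Case 3 (medium, similar dataset): freeze the early blocks and
# fine-tune only the last convolutional block.
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

# Case 4 (large, dissimilar dataset): use the pre-trained weights only
# as an initialisation and retrain the whole network.
base.trainable = True
```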
Types of pre-trained models used for transfer learning:
Since 2012 many models have been proposed for ImageNet classification. Some notable ones:
- AlexNet: a convolutional neural network designed by Alex Krizhevsky. AlexNet has eight layers: the first five are convolutional (some followed by max-pooling layers) and the last three are fully connected. It used the non-saturating ReLU activation function, which showed improved training performance over tanh and sigmoid.
- VGG-Net (two versions, VGG16 and VGG19): simplifies AlexNet's design by stacking uniform 3×3 kernels with stride 1 and 'same' padding.
- ResNet (34, 50, 101, and 152-layer variants): residual networks add skip (shortcut) connections so that very deep networks can be trained without the degradation seen in plain deep stacks.
Applying Transfer Learning (VGG16):
We will apply transfer learning to the RVL-CDIP dataset.
Data source: https://www.cs.cmu.edu/~aharley/rvl-cdip/
The dataset contains 400,000 document images across 16 classes of document types, with 25,000 samples per class; uncompressed, it is about 100 GB. Example classes include email, resume, and invoice.
Objective:
For a given image, classify it into one of the 16 classes.
Step 1: Import the libraries
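A hypothetical import list for the steps that follow (assuming the TensorFlow 2.x Keras API; the blog's exact imports may differ):

```python
# Imports assumed by the steps below (TensorFlow 2.x Keras API).
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten
```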
Step 2: Pre-process and load the images using ImageDataGenerator
I am loading only some samples from the RVL-CDIP dataset.
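A sketch of this step: `ImageDataGenerator` reads images from one sub-folder per class. The directory name `rvl_sample` and the two stand-in classes are hypothetical (random images generated just so the sketch runs); with the real data you would point `flow_from_directory` at your RVL-CDIP sample.

```python
import os
import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in for a small RVL-CDIP sample: two hypothetical classes with a
# few random 224x224 images each, laid out one sub-folder per class.
for cls in ("email", "invoice"):
    os.makedirs(os.path.join("rvl_sample", cls), exist_ok=True)
    for i in range(4):
        img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
        Image.fromarray(img).save(os.path.join("rvl_sample", cls, f"{i}.png"))

# Rescale pixels to [0, 1] and hold out 25% of images for validation.
datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.25)
train_gen = datagen.flow_from_directory(
    "rvl_sample",
    target_size=(224, 224),    # VGG16's expected input size
    batch_size=2,
    class_mode="categorical",  # one-hot labels for multi-class
    subset="training",
)
```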
Step 3: Build the model using transfer learning (VGG16)
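This step might look like the following sketch: load VGG16 with its ImageNet weights but without the original 1000-class classifier head, so we can attach our own.

```python
# Sketch of Step 3: VGG16 pre-trained on ImageNet, without its
# 1000-class dense head (include_top=False).
from tensorflow.keras.applications import VGG16

base_model = VGG16(
    weights="imagenet",        # downloads ~58 MB of pre-trained weights
    include_top=False,         # drop the original dense classifier
    input_shape=(224, 224, 3),
)
```

With `include_top=False` the network ends at the last pooling layer (a 7×7×512 feature map), which is what the custom dense layers will consume.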
Step 4: Freeze the top layers and add some custom dense layers; since the problem is multi-class classification with 16 classes, add a 16-unit dense output layer and compile the model.
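A sketch of this step, assuming a frozen VGG16 base and a softmax head. The 128-unit hidden layer is a hypothetical choice, not necessarily the blog's, so parameter counts will differ from those quoted later.

```python
# Sketch of Step 4: freeze the convolutional base and attach a custom
# head ending in a 16-unit softmax (one unit per RVL-CDIP class).
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

base_model = VGG16(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
for layer in base_model.layers:
    layer.trainable = False              # keep pre-trained weights fixed

x = Flatten()(base_model.output)         # 7x7x512 -> 25088 features
x = Dense(128, activation="relu")(x)     # hypothetical custom dense layer
outputs = Dense(16, activation="softmax")(x)

model = Model(inputs=base_model.input, outputs=outputs)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # multi-class objective
              metrics=["accuracy"])
```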
Step 5: View the final model:
The model summary shows that, of roughly 22 million parameters, only about 10 million are trainable.
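The frozen/trainable split that `model.summary()` reports can be checked directly. A sketch, rebuilt with `weights=None` (random init is enough for counting) and the hypothetical 128-unit head from earlier, so the exact totals differ from the blog's 22M/10M figures:

```python
# Count frozen vs trainable parameters for a frozen-VGG16 model.
# weights=None because only shapes matter here; head sizes are
# hypothetical, so totals differ from the blog's 22M/10M figures.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False

x = Flatten()(base.output)
x = Dense(128, activation="relu")(x)
out = Dense(16, activation="softmax")(x)
model = Model(base.input, out)

frozen = sum(w.shape.num_elements() for w in model.non_trainable_weights)
trainable = sum(w.shape.num_elements() for w in model.trainable_weights)
print(f"frozen: {frozen:,}  trainable: {trainable:,}")
```

Only the new dense head trains; all 14.7 million convolutional parameters stay fixed.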
Step 6: Train the final model:
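An end-to-end sketch of this step on synthetic data, kept deliberately small so it runs quickly: `weights=None` and a 64×64 input stand in for the blog's pre-trained weights at 224×224, and random arrays stand in for the generators.

```python
# End-to-end training sketch on synthetic data (weights=None and a
# 64x64 input keep it light; the blog uses weights="imagenet" at 224x224).
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

num_classes = 16
x_train = np.random.rand(8, 64, 64, 3).astype("float32")
y_train = np.eye(num_classes)[np.random.randint(0, num_classes, 8)]

base = VGG16(weights=None, include_top=False, input_shape=(64, 64, 3))
base.trainable = False                  # train only the custom head

x = Flatten()(base.output)
x = Dense(64, activation="relu")(x)
out = Dense(num_classes, activation="softmax")(x)
model = Model(base.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# With real data you would pass the ImageDataGenerator iterators instead:
#   model.fit(train_gen, validation_data=val_gen, epochs=10)
history = model.fit(x_train, y_train, epochs=1, batch_size=4, verbose=0)
```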
Conclusion:
- Since the dataset was medium-sized, we froze the top layers of a pre-trained VGG16 model and trained only the added dense layers.
- Transfer learning helps prevent overfitting when data is limited.
- With transfer learning we trained far fewer parameters, which saves a lot of time.
- Transfer learning works best when our dataset is similar to ImageNet.
References:
https://en.wikipedia.org/wiki/Transfer_learning
If you are new to machine learning and want to know how multi-label classification works, go through this blog.