Transfer Learning with Keras
In this blog we will learn what Transfer Learning is, and when and how to use it.
Transfer Learning is a research problem in deep learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem.
Idea: Instead of coding a neural network from scratch to solve our problem, we can reuse an existing model such as VGG16.
Why Transfer Learning?
Deep learning models often require very large datasets to train; when we do not have enough data, the model over-fits.
The more layers we train, the higher the chance of overfitting.
To prevent this we would need a large dataset. A pre-trained model is a saved network that was previously trained on a large dataset, typically on a large-scale image-classification task. You can either use the pre-trained model as it is, or use transfer learning to customise it for a given task.
ImageNet dataset:
The ILSVRC is an annual computer vision competition built on a subset of a publicly available dataset called ImageNet; the tasks, and often the challenge itself, are referred to as the ImageNet Competition.
ImageNet is a very large collection of human-annotated photographs, designed by academics for developing computer vision algorithms: it contains over 14 million images across more than 20 thousand categories.
When and how to apply Transfer Learning?
There are several cases where we can apply transfer learning. We will analyse them one by one.
Case 1: Our dataset is very small but similar to the ImageNet dataset.
Freeze all the layers, add some custom dense layers on top, and train the model.
Case 2: Our dataset is large and similar to the ImageNet dataset.
Fine-tune the complete network with a small learning rate.
Case 3: Our dataset is medium-sized and similar to the ImageNet dataset.
Freeze the top layers and fine-tune the last layers.
Case 4: Our dataset is large and not similar to the ImageNet dataset.
Initialise the network's weights from the pre-trained model and retrain the whole model.
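The four cases above boil down to which layers we mark as trainable in Keras. A minimal sketch (using `weights=None` here to skip the ImageNet download; in practice you would pass `weights="imagenet"`):

```python
# Sketch of the four regimes, using a Keras VGG16 base.
# weights=None skips the ~58 MB ImageNet download for this illustration;
# in practice you would pass weights="imagenet".
from tensorflow.keras.applications import VGG16

base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))

# Case 1 (small, similar dataset): freeze the whole base; only the new
# dense layers added on top will train.
base.trainable = False

# Case 2 (large, similar dataset): fine-tune everything with a small
# learning rate, e.g. Adam(learning_rate=1e-5).
base.trainable = True

# Case 3 (medium, similar dataset): freeze the early blocks and
# fine-tune only the last convolutional block.
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

# Case 4 (large, dissimilar dataset): use the pre-trained weights only
# as an initialisation and retrain the whole network.
base.trainable = True
```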
Types of pre-trained models used for transfer learning:
Since 2012 many models have been proposed for ImageNet classification. Some notable ones:
- AlexNet: a convolutional neural network designed by Alex Krizhevsky. AlexNet has eight layers: the first five are convolutional (some followed by max-pooling layers) and the last three are fully connected. It used the non-saturating ReLU activation function, which showed improved training performance over tanh and sigmoid.
- VGG-Net (two versions, VGG16 and VGG19): simplifies AlexNet's design by stacking uniform 3×3 kernels with stride 1 and 'same' padding.
- ResNet (34, 50, 101, and 152-layer variants): residual networks add skip (shortcut) connections so that very deep networks can be trained without the degradation seen in plain deep stacks.
Applying Transfer Learning (VGG16):
We will apply transfer learning to the RVL-CDIP dataset.
Data source: https://www.cs.cmu.edu/~aharley/rvl-cdip/
The dataset contains 400,000 document images across 16 classes of document types, with 25,000 samples per class; uncompressed, it is about 100 GB. Example classes include email, resume, and invoice.
Objective:
For a given image, classify it into one of the 16 classes.
Step 1: Import the libraries
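A hypothetical import list for the steps that follow (assuming the TensorFlow 2.x Keras API; the blog's exact imports may differ):

```python
# Imports assumed by the steps below (TensorFlow 2.x Keras API).
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten
```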
Step 2: Pre-process and load the images using ImageDataGenerator
I am loading only some samples from the RVL-CDIP dataset.
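A sketch of this step: `ImageDataGenerator` reads images from one sub-folder per class. The directory name `rvl_sample` and the two stand-in classes are hypothetical (random images generated just so the sketch runs); with the real data you would point `flow_from_directory` at your RVL-CDIP sample.

```python
import os
import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in for a small RVL-CDIP sample: two hypothetical classes with a
# few random 224x224 images each, laid out one sub-folder per class.
for cls in ("email", "invoice"):
    os.makedirs(os.path.join("rvl_sample", cls), exist_ok=True)
    for i in range(4):
        img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
        Image.fromarray(img).save(os.path.join("rvl_sample", cls, f"{i}.png"))

# Rescale pixels to [0, 1] and hold out 25% of images for validation.
datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.25)
train_gen = datagen.flow_from_directory(
    "rvl_sample",
    target_size=(224, 224),    # VGG16's expected input size
    batch_size=2,
    class_mode="categorical",  # one-hot labels for multi-class
    subset="training",
)
```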
Step 3: Build the model using transfer learning (VGG16)
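This step might look like the following sketch: load VGG16 with its ImageNet weights but without the original 1000-class classifier head, so we can attach our own.

```python
# Sketch of Step 3: VGG16 pre-trained on ImageNet, without its
# 1000-class dense head (include_top=False).
from tensorflow.keras.applications import VGG16

base_model = VGG16(
    weights="imagenet",        # downloads ~58 MB of pre-trained weights
    include_top=False,         # drop the original dense classifier
    input_shape=(224, 224, 3),
)
```

With `include_top=False` the network ends at the last pooling layer (a 7×7×512 feature map), which is what the custom dense layers will consume.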
Step 4: Freeze the top layers and add some custom dense layers; since the problem is multi-class classification with 16 classes, add a 16-unit dense output layer and compile the model.
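A sketch of this step, assuming a frozen VGG16 base and a softmax head. The 128-unit hidden layer is a hypothetical choice, not necessarily the blog's, so parameter counts will differ from those quoted later.

```python
# Sketch of Step 4: freeze the convolutional base and attach a custom
# head ending in a 16-unit softmax (one unit per RVL-CDIP class).
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

base_model = VGG16(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
for layer in base_model.layers:
    layer.trainable = False              # keep pre-trained weights fixed

x = Flatten()(base_model.output)         # 7x7x512 -> 25088 features
x = Dense(128, activation="relu")(x)     # hypothetical custom dense layer
outputs = Dense(16, activation="softmax")(x)

model = Model(inputs=base_model.input, outputs=outputs)
model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # multi-class objective
              metrics=["accuracy"])
```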
Step 5: View the final model:
The model summary shows that, of roughly 22 million parameters, only about 10 million are trainable.
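The frozen/trainable split that `model.summary()` reports can be checked directly. A sketch, rebuilt with `weights=None` (random init is enough for counting) and the hypothetical 128-unit head from earlier, so the exact totals differ from the blog's 22M/10M figures:

```python
# Count frozen vs trainable parameters for a frozen-VGG16 model.
# weights=None because only shapes matter here; head sizes are
# hypothetical, so totals differ from the blog's 22M/10M figures.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False

x = Flatten()(base.output)
x = Dense(128, activation="relu")(x)
out = Dense(16, activation="softmax")(x)
model = Model(base.input, out)

frozen = sum(w.shape.num_elements() for w in model.non_trainable_weights)
trainable = sum(w.shape.num_elements() for w in model.trainable_weights)
print(f"frozen: {frozen:,}  trainable: {trainable:,}")
```

Only the new dense head trains; all 14.7 million convolutional parameters stay fixed.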
Step 6: Train the final model:
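An end-to-end sketch of this step on synthetic data, kept deliberately small so it runs quickly: `weights=None` and a 64×64 input stand in for the blog's pre-trained weights at 224×224, and random arrays stand in for the generators.

```python
# End-to-end training sketch on synthetic data (weights=None and a
# 64x64 input keep it light; the blog uses weights="imagenet" at 224x224).
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

num_classes = 16
x_train = np.random.rand(8, 64, 64, 3).astype("float32")
y_train = np.eye(num_classes)[np.random.randint(0, num_classes, 8)]

base = VGG16(weights=None, include_top=False, input_shape=(64, 64, 3))
base.trainable = False                  # train only the custom head

x = Flatten()(base.output)
x = Dense(64, activation="relu")(x)
out = Dense(num_classes, activation="softmax")(x)
model = Model(base.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# With real data you would pass the ImageDataGenerator iterators instead:
#   model.fit(train_gen, validation_data=val_gen, epochs=10)
history = model.fit(x_train, y_train, epochs=1, batch_size=4, verbose=0)
```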
Conclusion:
- Since the dataset was medium-sized, we froze the top layers of a pre-trained VGG16 model and trained only the added dense layers.
- Transfer learning helps prevent overfitting when data is limited.
- With transfer learning we trained far fewer parameters, which saves a lot of time.
- Transfer learning works best when our dataset is similar to ImageNet.
References:
https://en.wikipedia.org/wiki/Transfer_learning
If you are new to machine learning and want to know how multi-label classification works, go through this blog.