Transfer Learning From Scratch Using Keras

Rohit Thakur
5 min read · Aug 6, 2019


Transfer Learning (Image Credit: https://data-flair.training/)

Transfer Learning Implemented In Keras On VGG16

Transfer learning is a technique in deep learning in which we take an existing model that was trained on far more data and reuse the features it learned from that data for our own problem. Because the model has learned from a lot of data, it has become quite good at extracting general features. We can use those features and, by tweaking part of the trained model, apply it to our use case. In transfer learning, instead of training all the layers of the model, we lock some of the layers and use their trained weights to extract features from our data. We don't need to lock all the layers: we can leave some of the later layers trainable so that they specialise to our data.
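As a minimal sketch of this locking idea (the full walkthrough below does it concretely for VGG16), freezing a layer in Keras is just a matter of setting its trainable attribute before compiling:

from keras.applications.vgg16 import VGG16

base_model = VGG16(weights='imagenet', include_top=False)
for layer in base_model.layers:
    layer.trainable = False  # frozen: weights keep their pre-trained ImageNet values
# Any new layers stacked on top of base_model after this point stay trainable.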

I have done the Dogs vs Cats challenge on Kaggle using transfer learning and got very good accuracy after running very few epochs. The Kaggle kernel link can be found below.

https://www.kaggle.com/rohit1277/cat-dog-classifier-using-vgg16-transfer-learning

I am going to implement transfer learning with VGG16 from scratch in Keras. The implementation will be done on the Dogs vs Cats dataset. You can download the dataset from the link below.

https://www.kaggle.com/c/dogs-vs-cats/data

Once you have downloaded the images, you can proceed with the steps written below. To learn more about all the methods and why we are using them, you can refer to my other article, in which I implement VGG16 from scratch in Keras. The link can be found below.

https://medium.com/@1297rohit/step-by-step-vgg16-implementation-in-keras-for-beginners-a833c686ae6c

import keras
from keras.models import Model
from keras.layers import Dense
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image

Here I import Keras and all the methods and functions of Keras that we will need to build our model.

trdata = ImageDataGenerator()
traindata = trdata.flow_from_directory(directory="../train", target_size=(224,224))
tsdata = ImageDataGenerator()
testdata = tsdata.flow_from_directory(directory="../test", target_size=(224,224))

Here, using the ImageDataGenerator class in Keras, I load all the cat and dog images into the model. flow_from_directory will automatically label the data and map each label to its corresponding images.
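Note that flow_from_directory infers one class per subfolder, so the ../train and ../test directories are assumed to contain one subfolder per class (e.g. cats/ and dogs/; this layout is an assumption on my part, since the raw Kaggle download has to be arranged this way manually). You can inspect the label mapping it builds:

print(traindata.class_indices)  # e.g. {'cats': 0, 'dogs': 1}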

from keras.applications.vgg16 import VGG16
vggmodel = VGG16(weights='imagenet', include_top=True)

Here I import VGG16 from Keras with pre-trained weights that were trained on ImageNet. As you can see, the include_top parameter is set to True. This means the weights for the whole model, including the dense layers, will be downloaded. If it were set to False, the pre-trained weights would only be downloaded for the convolutional layers, and no weights would be downloaded for the dense layers.
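For comparison, this is roughly what the include_top=False variant would look like (not used in this article), where only the convolutional base is downloaded and you are free to attach your own classifier head:

conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
conv_base.summary()  # ends at block5_pool; no flatten or dense layers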

vggmodel.summary()

Now when I run vggmodel.summary(), the summary of the whole downloaded VGG16 model is printed. Its output is attached below.

Summary of downloaded VGG16 model

for layer in vggmodel.layers[:19]:
    print(layer)
    layer.trainable = False

After the model has been downloaded, I need to adapt it to my problem statement, which is to detect cats and dogs. So here I decide not to train the weights of the first 19 layers and to use them as they are. Therefore I set the trainable attribute to False for the first 19 layers.
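As an optional sanity check (my addition, not required for training), you can list each layer with its trainable flag to confirm that only the final dense layers will be updated:

for layer in vggmodel.layers:
    print(layer.name, layer.trainable)  # False for the first 19 layers, True for the rest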

X = vggmodel.layers[-2].output
predictions = Dense(2, activation="softmax")(X)
model_final = Model(inputs=vggmodel.input, outputs=predictions)

Since my problem is to detect cats and dogs, which gives two classes, the last dense layer of my model should be a 2-unit softmax layer. Here I take the second-to-last layer of the model, which is the dense layer with 4096 units, and add a 2-unit softmax dense layer on top. In this way I remove the last layer of the VGG16 model, which was made to predict 1000 classes.
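A quick check (my addition) that this surgery worked, before compiling:

print(model_final.output_shape)  # expected: (None, 2) instead of (None, 1000)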

model_final.compile(loss="categorical_crossentropy",
                    optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
                    metrics=["accuracy"])

Now I will compile my new model. I set the learning rate of the SGD (Stochastic Gradient Descent) optimiser using the lr parameter, and since the model ends in a 2-unit dense layer, I use categorical_crossentropy as the loss, because the output of the model is categorical.
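As an aside (a sketch, not what this article uses): with only two classes you could equally end the network in a single sigmoid unit and train with binary_crossentropy; the generators would then need class_mode="binary" in flow_from_directory:

predictions_alt = Dense(1, activation="sigmoid")(X)
model_alt = Model(inputs=vggmodel.input, outputs=predictions_alt)
model_alt.compile(loss="binary_crossentropy",
                  optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
                  metrics=["accuracy"])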

model_final.summary()

Now if I print the summary of the model, you can see that the last 1000-unit softmax dense layer has been replaced by a new 2-unit softmax dense layer. The output is attached below.

Output of new VGG16 model

from keras.callbacks import ModelCheckpoint, EarlyStopping

checkpoint = ModelCheckpoint("vgg16_1.h5", monitor='val_acc', verbose=1,
                             save_best_only=True, save_weights_only=False,
                             mode='auto', period=1)
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=40,
                      verbose=1, mode='auto')
model_final.fit_generator(generator=traindata, steps_per_epoch=2, epochs=100,
                          validation_data=testdata, validation_steps=1,
                          callbacks=[checkpoint, early])
model_final.save_weights("vgg16_1.h5")

Here I add EarlyStopping and ModelCheckpoint callbacks to the model and fit it using fit_generator. For more details on these, you can check out my other article, in which I implement VGG16 from scratch in Keras (link below). As you run this model, you will see that it converges much more quickly than it would if you had trained it from scratch.
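Once training finishes, the saved weights can be loaded back and used for a single prediction, roughly like this ("my_image.jpg" is a placeholder path):

import numpy as np
from keras.preprocessing import image

model_final.load_weights("vgg16_1.h5")
img = image.load_img("my_image.jpg", target_size=(224, 224))
x = np.expand_dims(image.img_to_array(img), axis=0)
probs = model_final.predict(x)  # shape (1, 2); class order follows traindata.class_indices
print(probs)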

Output of training the model

Here you can see that I was able to get 100% accuracy on my validation data after running just 5 epochs. This would have taken a lot of time to converge if I had trained the model from scratch. Transfer learning is a very good approach when we have little data for our problem statement. It lets us take full advantage of the open nature of the deep learning community, since we can solve our problem by building on work someone else has done on a dataset much larger than ours.

This is a complete implementation of transfer learning using VGG16 in Keras. If you want to learn the basics of Keras, implement VGG16 from scratch, and learn more about all the methods used here, you can read my article on step-by-step VGG16 implementation in Keras at this link.

https://medium.com/@1297rohit/step-by-step-vgg16-implementation-in-keras-for-beginners-a833c686ae6c

Kaggle link of kernel : https://www.kaggle.com/rohit1277/cat-dog-classifier-using-vgg16-transfer-learning

If you would like to learn step by step about face detection and face recognition from scratch, you can head over to my article on that topic at this link: https://medium.com/@1297rohit/step-by-step-face-recognition-code-implementation-from-scratch-in-python-cc95fa041120

Enjoy Transfer Learning!
