Transfer Learning using Keras Functional API in TensorFlow 2.0

Ayesha Shafique
6 min read · Dec 15, 2019



From the definition:

Transfer learning is the improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned.

— Chapter 11: Transfer Learning, Handbook of Research on Machine Learning Applications, 2009.

Transfer learning means sharing knowledge from one network to another, so that we don't have to train our model from scratch.

In layman's terms: if we are working on a project and there are ready-made libraries for some of its tasks, it is wise to use them instead of re-inventing the wheel.

Likewise in Computer Vision and deep learning: if we wanted to build a classifier on a kitchen dataset from scratch, we would need a lot of data to train the model, and the same goes for computational cost and time. But if we have a model that has already been trained on millions of images, we can reuse it for our case. For example, an existing object detection classifier and our planned kitchen accessories classifier both need more or less the same kind of feature extractor to pull features out of images. This is where transfer learning comes into play: sharing the knowledge of one network with another.


We have all heard the news of the launch of TensorFlow 2.0. We know that Keras is a high-level library that uses TensorFlow as a backend.

Before the launch of TensorFlow 2.0, when we had to use Keras, we would normally run pip install keras. And when we had to import Keras in our TensorFlow code, we would write import keras.

But now things are changed with TensorFlow 2.0.

Now we can pull Keras directly from TensorFlow as tensorflow.keras, TensorFlow's native package, without having to install and import Keras separately.
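
For example, a minimal sketch of what that change looks like in an import section (the layers import is just one common case):

# before TensorFlow 2.0, with standalone Keras installed:
# import keras

# with TensorFlow 2.0, Keras ships inside TensorFlow itself:
from tensorflow import keras
from tensorflow.keras import layers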

These are the steps we will cover in this article:

  1. Loading dataset
  2. Splitting dataset
  3. Normalize dataset
  4. Convert the labels from Integers to Vectors
  5. Data Augmentation
  6. Create Model Architecture
  7. Model Compilation
  8. Model Training with TensorBoard callback
  9. Model Saving
  10. Model Evaluation

So let's start with a transfer learning example, diving deep into each and every section.

1. Loading dataset

To start working on transfer learning, we first load the data from the data source. Here we use the CIFAR-10 data available in TensorFlow by simply calling cifar10.load_data(). This dataset contains 60,000 images across 10 distinct categories to classify.
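
A minimal sketch of that loading call; note that load_data() already returns the train/test split we use in the next step:

from tensorflow.keras.datasets import cifar10

# load CIFAR-10: 50,000 training and 10,000 testing images, each 32x32x3
((trainX, trainY), (testX, testY)) = cifar10.load_data()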

To know more about the dataset, see the official Keras documentation: https://keras.io/datasets/

2. Splitting dataset

After data loading, we split our dataset into training and testing datasets (as shown above, cifar10.load_data() already returns them pre-split).

The training dataset is used for model training, so that the model can learn its weights.

The testing dataset is used to evaluate the model on unseen data once it has been built on the training dataset using the transfer learning technique.

3. Normalize dataset

After splitting the data into training and testing datasets, we normalize it so that the pixel values are scaled from [0, 255] into the range [0, 1].

In TensorFlow 2.0, we can do the normalization this way:

# normalize the data into the range [0, 1]
trainX = trainX.astype("float32") / 255.0
testX = testX.astype("float32") / 255.0

4. Convert the labels from Integers to Vectors

After normalization, since our label data is in integer form, we convert it into one-hot vectors using LabelBinarizer().

# convert the labels from integers to one-hot vectors
from sklearn.preprocessing import LabelBinarizer

label_bin = LabelBinarizer()
trainY = label_bin.fit_transform(trainY)
testY = label_bin.transform(testY)

5. Data Augmentation

As we may know, deep learning theory says that the more good-quality data you have for model training, the more accurate your results will be. So if we do not have a large amount of data, we can create transformed versions of the images ourselves; in simple words, we can increase the number of data samples by doing data augmentation.

We can build the data augmenter for our dataset by calling ImageDataGenerator() with the transformations we want, as sketched below.
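
A minimal sketch of such an augmenter; the specific transformation parameters below are illustrative choices, not prescribed values:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# randomly rotate, shift, zoom, and flip the training images
data_aug = ImageDataGenerator(
    rotation_range=18,
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.15,
    horizontal_flip=True,
    fill_mode="nearest")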

6. Create Model Architecture

Now, this is the most interesting part of this blog. We will create the model structure using transfer learning. We first initialize the input shape for the model. Then we extract features with the pre-trained MobileNet model, so that we do not have to train a feature extractor from scratch.

We use the pre-trained weights to extract features and then add our own custom layers for classification, because we have our own dataset with its own number of classes, which is 10.
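
A minimal sketch of what this architecture can look like with the Functional API. The frozen base and the pooling/dropout head are reasonable defaults rather than the only valid choice, and tf.keras will warn that the ImageNet weights were tuned for larger inputs than CIFAR-10's 32x32 images:

from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D, Input
from tensorflow.keras.models import Model

# initialize the input shape for CIFAR-10 images
inputs = Input(shape=(32, 32, 3))

# load MobileNet with pre-trained ImageNet weights, minus its classifier head
base_model = MobileNet(weights="imagenet", include_top=False,
                       input_tensor=inputs)

# freeze the pre-trained feature extractor so its weights stay fixed
base_model.trainable = False

# attach a custom classification head for our own 10 classes
x = GlobalAveragePooling2D()(base_model.output)
x = Dropout(0.5)(x)
outputs = Dense(10, activation="softmax")(x)

model = Model(inputs=inputs, outputs=outputs)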

7. Model compilation

Now that we have defined the architecture of our model, it is time to train it. But before building/training the model, remember that we have to compile it first. The compile function defines the loss function, the optimizer, and the metrics.

We need a compiled model to train because training uses the loss function and the optimizer. But it’s not necessary to compile a model for predicting.

In our case, we use SGD (Stochastic Gradient Descent) as the optimizer and categorical_crossentropy as the loss measure, because this is a multi-class classification problem.

# initialize the optimizer and compile the model
from tensorflow.keras.optimizers import SGD

LR = 0.01          # initial learning rate (illustrative value)
EPOCHS = 25        # number of training epochs (illustrative value)
BATCH_SIZE = 64    # batch size, used again in training (illustrative value)

opt = SGD(lr=LR, momentum=0.9, decay=LR / EPOCHS)
print("[LOGGING] training network...")
model.compile(loss="categorical_crossentropy", optimizer=opt,
              metrics=["accuracy"])

8. Model Training with TensorBoard Callback

As we compiled our model in Step #7, we can now start training the model. For training, we call the model.fit_generator function on the training and validation datasets, with the batch size, the number of epochs, and the callbacks defined. We use the TensorBoard callback.

From the Keras documentation,

TensorBoard is a visualization tool provided with TensorFlow. This callback writes a log for TensorBoard, which allows you to visualize dynamic graphs of your training and test metrics, as well as activation histograms for the different layers in your model.

Note that here we use the fit_generator function instead of the fit function on the model: fit() is used when we have a small dataset that fits into memory, while fit_generator() is used when the data comes from a generator, for example a large dataset loaded batch by batch or, as in our case, batches produced on the fly by the augmenter. In fit_generator(), you don't pass x and y directly; instead, they come from a generator. (In later TensorFlow releases, fit() itself accepts generators and fit_generator() is deprecated.)
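
The tensorboard_callback used in the training call has to be created first. A minimal sketch (the log directory path is an assumption; any writable path works):

import datetime
from tensorflow.keras.callbacks import TensorBoard

# write TensorBoard logs to a timestamped directory
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)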

# train the network
model_history = model.fit_generator(
    data_aug.flow(trainX, trainY, batch_size=BATCH_SIZE),
    validation_data=(testX, testY),
    steps_per_epoch=trainX.shape[0] // BATCH_SIZE,
    epochs=EPOCHS,
    callbacks=[tensorboard_callback],
    verbose=1)

9. Model Saving

Now that our model is trained, we can save it by calling the save function on the model, which stores the model's architecture, weights, and training configuration in a single file.

# save the entire model to an HDF5 file
model.save("CIFAR_model.h5")
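
For completeness, the saved file can later be restored, architecture and weights included, with the standard tf.keras loader:

from tensorflow.keras.models import load_model

# reload the saved model for further training or inference
model = load_model("CIFAR_model.h5")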

10. Model Evaluation

As we are done with model training, it is now time to see how well the model performs. For this, we test it on the testing (unseen) data and then check the classification report, and optionally the confusion matrix, for the results. It can be done in Keras like this:

# evaluate the network
from sklearn.metrics import classification_report

print("[LOGGING] evaluating network...")
predictions = model.predict(testX, batch_size=BATCH_SIZE)
print(classification_report(testY.argmax(axis=1),
                            predictions.argmax(axis=1)))
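
Since the confusion matrix is mentioned above, here is a short sketch of computing it from the same predictions with scikit-learn:

from sklearn.metrics import confusion_matrix

# rows are true classes, columns are predicted classes
cm = confusion_matrix(testY.argmax(axis=1), predictions.argmax(axis=1))
print(cm)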

So this is all for Transfer Learning using Keras Functional API in TensorFlow 2.0!!!


You can find the code for this example on this Github repo.

I hope this article helps you get started with transfer learning using TensorFlow 2.0.

Hit the Like button if you enjoyed this article ❤.
