Transfer Learning with TensorFlow

Pasindu Ukwatta
Published in Geek Culture
6 min read · Jul 22, 2021
Photo by Erik Mclean from Pexels

Machine learning, deep learning, robotics and artificial intelligence are trending topics around the world. Scientists and technology enthusiasts have spent countless hours trying to simplify people's lives with these technologies and their applications, and big tech companies spend millions improving them and hiring skilled people from around the world. As a result, these technologies can now do things that were hard to imagine a few years ago, and they keep evolving every day. People can use open-source pretrained models in their own projects and adapt them to their problems. We know that the amount of data is a huge factor in the performance of a model. Big tech companies can provide huge amounts of data to train, evaluate and test models, so their models perform well. In effect, those companies have done the training and testing for us. We just need to select a suitable model and adapt it to our problem.

The IT industry evolves at a rapid pace, so time matters, and there is often no time to build a model from scratch. For some special problems you still need to create the model from scratch. Otherwise, rather than reinventing the wheel, you can directly apply a pre-built model and try to improve it. That is the main idea of transfer learning.

There are three main ways to use transfer learning. Let’s talk about each of them.

Use the model as is

TensorFlow is one of the most popular deep learning frameworks, and TensorFlow Hub offers many pre-built models for different kinds of problems. You can browse models by problem domain: text, image, audio or video. Walk through the documentation of each candidate and select the one that suits your problem. You also need to apply the correct preprocessing to your data before passing it to the model; otherwise you will get unexpected results. Some models need the data to be normalized and scaled before it reaches them. For example, EfficientNet has built-in layers for normalization and rescaling, but ResNet does not, so you have to do the scaling yourself. This is why you should read the documentation of the model and have a clear idea of its architecture. You can download the model to your computer and import it, or simply provide the URL of the model to import it.
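To make the scaling step concrete, here is a minimal sketch of rescaling raw pixel values into the [0, 1] range before they reach a model that has no built-in scaling. The random image batch and its 224x224 size are made up for illustration, and the snippet assumes a recent TensorFlow where `tf.keras.layers.Rescaling` is available. Always check the model's documentation for the input range it actually expects.

```python
import numpy as np
import tensorflow as tf

# Hypothetical batch of two 224x224 RGB images with raw pixel values in [0, 255].
raw_images = np.random.randint(0, 256, size=(2, 224, 224, 3)).astype("float32")

# Rescaling multiplies every pixel by 1/255, so the values land in [0, 1].
rescale = tf.keras.layers.Rescaling(1.0 / 255)
scaled_images = rescale(raw_images)

print(float(tf.reduce_min(scaled_images)), float(tf.reduce_max(scaled_images)))
```

The same layer can also be placed as the first layer of the model, so the scaling travels with the model instead of living in a separate preprocessing script.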

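# Install TensorFlow Hub (run in a terminal; in a notebook cell, prefix the command with "!")
# pip install --upgrade tensorflow_hub
import tensorflow_hub as hub

# Load a pretrained text-embedding model from TensorFlow Hub
model = hub.KerasLayer("https://tfhub.dev/google/nnlm-en-dim128/2")
embeddings = model(["The rain in Spain.", "falls",
                    "mainly", "In the plain!"])

print(embeddings.shape)  # (4, 128)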

I did my coding in Google Colab, so I simply used the model URL to import the model. When creating the model, you can use either the Sequential or the Functional API.

import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers

# Input image size; assumed to be 224x224 here, adjust it to your data
IMAGE_SHAPE = (224, 224)

# URLs of the two models
resnet_url = "https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/4"
efficientnet_url = "https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1"

# create_model function
def create_model(model_url, num_classes=10):
    """
    Takes a TensorFlow Hub URL and creates a Sequential model with it.
    Args:
      model_url: TensorFlow Hub model URL
      num_classes: number of output classes
    Returns:
      An uncompiled model with model_url as the feature extraction layer
    """
    # Download the pretrained model and freeze its weights
    feature_extractor_layer = hub.KerasLayer(
        model_url,
        trainable=False,
        name="feature_extraction_layer",
        input_shape=IMAGE_SHAPE + (3,))
    # Create the model
    model = tf.keras.Sequential([
        feature_extractor_layer,
        layers.Dense(num_classes,
                     activation="softmax",
                     name="output_layer")
    ])
    return model

Feature extraction transfer learning: adjust the output layer for the problem

In feature extraction, you can augment the data and try to improve performance by varying it. For an image classification model, for example, you can rescale, rotate or zoom the images as data augmentation.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# In newer TensorFlow releases these layers are also available directly under tf.keras.layers
from tensorflow.keras.layers.experimental import preprocessing

# Create data augmentation with horizontal flipping, rotation, zoom,
# and random height and width changes
data_augmentation = keras.Sequential([
    preprocessing.RandomFlip("horizontal"),
    preprocessing.RandomRotation(0.2),
    preprocessing.RandomZoom(0.2),
    preprocessing.RandomHeight(0.2),
    preprocessing.RandomWidth(0.2)
], name="data_augmentation")
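One detail worth seeing in action is that these augmentation layers only transform data during training; at inference time they pass the input through unchanged. The sketch below illustrates this with a made-up batch of random images. It assumes a recent TensorFlow where the layers are available under `tf.keras.layers`, and it leaves out the height and width layers to keep the output shape fixed.

```python
import numpy as np
import tensorflow as tf

# A smaller augmentation pipeline, built the same way as the one above.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.2),
    tf.keras.layers.RandomZoom(0.2),
], name="data_augmentation")

# Hypothetical batch of four 64x64 RGB images with values in [0, 1].
images = np.random.rand(4, 64, 64, 3).astype("float32")

augmented = augment(images, training=True)     # random transforms are applied
passthrough = augment(images, training=False)  # inputs are returned unchanged

print(np.allclose(passthrough, images))
```

Because of this behaviour, the augmentation block can stay inside the model when you evaluate or deploy it.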

At the start, train on a small portion (about 10%) of the data, because training takes time and how long depends on your hardware, so you need a clear idea of what your machine can handle. In transfer learning it is acceptable to use a small amount of data because these models have already been trained on large datasets. Here you need to wrap the base model with your own input layer and output layer. Also start by training for a small number of epochs, because too many epochs can lead to overfitting. You can identify overfitting by looking at the loss curves of the model.
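As a rough sketch of the setup described above (an input layer, the frozen base model and a new output layer), a Functional API version could look like the following. It assumes a recent TensorFlow where the augmentation layers live under `tf.keras.layers`, uses `tf.keras.applications.EfficientNetB0` rather than a TensorFlow Hub layer, and passes `weights=None` only so the snippet runs without a download; in practice you would load pretrained weights. The 224x224 input size and the 10 output classes are illustrative assumptions.

```python
import tensorflow as tf

# Data augmentation block (active only during training).
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.2),
    tf.keras.layers.RandomZoom(0.2),
], name="data_augmentation")

# Base model without its classification head.
# weights=None avoids a download here; use weights="imagenet" for real transfer learning.
base_model = tf.keras.applications.EfficientNetB0(
    include_top=False, weights=None, input_shape=(224, 224, 3))
base_model.trainable = False  # freeze the base for feature extraction

# Input layer -> augmentation -> frozen base -> pooling -> new output layer.
inputs = tf.keras.Input(shape=(224, 224, 3), name="input_layer")
x = data_augmentation(inputs)
x = base_model(x, training=False)  # keep batch-norm layers in inference mode
x = tf.keras.layers.GlobalAveragePooling2D(name="global_average_pooling")(x)
outputs = tf.keras.layers.Dense(10, activation="softmax", name="output_layer")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(loss="categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["accuracy"])
```

With this layout the base model sits at `model.layers[2]`, and only the new output layer is trained.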

Loss curves for the Model
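If you prefer a quick numeric check alongside the plot, the idea of reading overfitting off the loss curves can be sketched in plain Python. The helper name, the `patience` parameter and the loss values below are hypothetical and not part of the original experiment. The rule it applies is a simple heuristic: validation loss rising for several consecutive epochs while training loss keeps falling.

```python
def first_overfit_epoch(train_loss, val_loss, patience=2):
    """Return the first epoch (0-based) of a run of `patience` consecutive
    epochs in which validation loss rises while training loss falls,
    or None if no such run exists."""
    streak = 0
    for epoch in range(1, len(val_loss)):
        val_up = val_loss[epoch] > val_loss[epoch - 1]
        train_down = train_loss[epoch] < train_loss[epoch - 1]
        streak = streak + 1 if (val_up and train_down) else 0
        if streak >= patience:
            return epoch - patience + 1
    return None


# Made-up loss values, shaped like the lists in history.history after model.fit().
train_loss = [1.20, 0.90, 0.70, 0.55, 0.45, 0.38]
val_loss = [1.25, 1.00, 0.85, 0.88, 0.93, 0.99]

print(first_overfit_epoch(train_loss, val_loss))  # 3
```

Here the validation loss bottoms out at epoch 2 and climbs from epoch 3 onward, so training beyond that point mostly memorizes the training data.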

Fine Tune the Transfer Learning Model

In fine-tuning, you unfreeze some layers of the base model (the pre-built model) and train them on your training data. After setting the trainable attribute of the selected layers to True, you need to recompile the model. Normally you unfreeze only a few layers (5–10 layers).

# Check how many trainable variables the base model has
print(len(model.layers[2].trainable_variables))
# List every layer of the base model with its trainable flag
for i, layer in enumerate(model.layers[2].layers):
    print(i, layer.name, layer.trainable)

Here I set the trainable parameter to True for the last 10 layers.
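The unfreezing step itself can be sketched as follows. To keep the snippet small and runnable offline, it uses a hypothetical stand-in for the pretrained base model (a stack of 12 small Dense layers); with a real base model the same loop applies to its layers.

```python
import tensorflow as tf

# Hypothetical stand-in for a pretrained base model: 12 small Dense layers.
base_model = tf.keras.Sequential(
    [tf.keras.Input(shape=(8,))]
    + [tf.keras.layers.Dense(8, activation="relu", name=f"block_{i}")
       for i in range(12)],
    name="base_model",
)

# Make the whole base trainable, then freeze everything except the last 10 layers.
base_model.trainable = True
for layer in base_model.layers[:-10]:
    layer.trainable = False

trainable_names = [layer.name for layer in base_model.layers if layer.trainable]
print(len(trainable_names))  # 10
```

Remember that a change to `trainable` only takes effect in training after the model is compiled again.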

Then you can train and evaluate the model. Sometimes fine-tuning makes performance worse than before, so create a checkpoint before fine-tuning. When you recompile the model, use a learning rate lower than the default (for example learning_rate=0.0001, where Adam's default is 0.001), because with a large learning rate the unfrozen layers can quickly drift towards overfitting. Adam and SGD are the commonly used optimizers, so choose according to your problem. With checkpoints you can start fitting from the weights that performed well. Always keep logs for every model you create.
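A minimal sketch of recompiling with a lower learning rate and setting up a checkpoint is shown below. The tiny model is a hypothetical stand-in for the model you just unfroze, and the checkpoint path is an assumption; the `.weights.h5` suffix is chosen because newer Keras versions require it for weights-only checkpoints.

```python
import tensorflow as tf

# Tiny stand-in model (hypothetical); in practice this is your fine-tuning model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Recompile with a learning rate 10x lower than Adam's default of 0.001.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=optimizer,
    metrics=["accuracy"],
)

# Save only the best weights so a bad fine-tuning run can be rolled back.
checkpoint_path = "checkpoints/fine_tune.weights.h5"  # hypothetical path
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    save_weights_only=True,
    save_best_only=True,
    monitor="val_loss",
)

# Later: model.fit(train_data, validation_data=val_data, callbacks=[checkpoint_cb])
# To roll back: model.load_weights(checkpoint_path)
```

Saving only the best weights keeps the checkpoint small and means the file always holds the version with the lowest validation loss seen so far.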

After finalizing the model, you can import the logged data into TensorBoard to visualize and analyze it. But be aware that if you upload the data to the hosted TensorBoard.dev service, it becomes publicly available. So think about whether the data is suitable for sharing before you upload it.

Conclusion

In this article, I tried to give you a simple idea of transfer learning with TensorFlow. I did not include much code. TensorFlow has good documentation with examples that you can run and experiment with. But I think you first need a clear understanding of machine learning, and you should try building a few small neural networks from scratch before trying transfer learning. Otherwise you will run into problems that you cannot resolve because of gaps in the fundamental concepts. So try transfer learning after you have built a few models. I did not go deep into data preprocessing, callbacks, checkpoint creation or visualization in this article. I hope to write a detailed article about those concepts in the future.

In transfer learning, start with a small amount of data, because model training is time-consuming and demanding on hardware. After feature extraction and fine-tuning, you can train the model on the whole dataset and evaluate it. If you need to roll back, you can use checkpoints.

I think that learning deep learning is not easy at the beginning. But from my personal experience, having started as a beginner, I can say that with time, practice and effort you can steadily improve your knowledge. So you need to study on your own and put some effort and commitment into it. Nowadays you can turn to the internet for further study. I also learned these things by reading articles, watching videos on YouTube, and following courses on Udemy (Daniel Bourke/Andrei Neagoie).

This article has explored ways to work with transfer learning in TensorFlow, and I hope it helps you complete your work more accurately. Thank you for reading my article. I hope to write more articles on new trending topics in the future, so keep an eye on my account if you liked what you read today!

Pasindu Ukwatta
SE @DirectFnSL, Graduate of the University of Moratuwa 👨‍🎓, Faculty of IT 👨‍💻, likes Python 🐍 | React 🔯 | JavaScript 🎇 | Java ☕ | ML 😎 | DL 🤖