Your TensorFlow Pocket Reference for Image Data Tasks
How can you build deep learning models with TensorFlow for image data tasks?
Before reading this article, I would like you to watch this 100-second video about TensorFlow. I found it intriguing. Enjoy reading the article.
TensorFlow has been one of the most popular deep learning frameworks. According to the latest Stack Overflow survey, 18% of respondents are likely to use TensorFlow, compared with 8% for PyTorch, in the same deep learning framework comparison. Hence, spending time learning TensorFlow to build better deep learning models is extremely beneficial. I will explain as many as possible of the techniques that TensorFlow developers use when building deep learning models, from beginner to advanced levels.
Before diving into the many methods in the TensorFlow library that you can find in the documentation, it is better to first understand the basic concept it is built on: the tensor.
What is a Tensor?
A tensor is an n-dimensional array that can be computed on a GPU. Most deep learning models are based on tensors. Hence, converting the data into tensors is a must before fitting it into deep learning models. There are a few types of tensors, namely the scalar (magnitude only), vector (magnitude and direction), matrix (table of numbers), 3-tensor (cube of numbers), and n-tensor (any number of dimensions greater than 3), as shown in the following notes.
scalar = 60
vector = [1.5, 2.6, 3.9]
matrix = [[1, 4, 6], [7, 6, 8], [2, 5, 10]]
3-tensor = [[, , ], [, , ], [, , ]]
If you have ever done matrix computation using NumPy, you will be able to learn tensors faster because they are quite similar. The main difference is that NumPy arrays are computed on the CPU, while tensors can also be computed on a GPU, which speeds up training and running deep learning models. Hence, tensors are essential for running deep neural networks to solve problems ranging from regression and classification to self-driving car problems such as image segmentation, instance segmentation, and object detection.
What is TensorFlow?
TensorFlow is an end-to-end open source platform for building deep learning models. We use TensorFlow for computing with tensors and building deep learning models. We can create any of the tensor types mentioned above using TensorFlow.
import tensorflow as tf
vector = tf.Variable([1.5, 2.6, 3.9])
matrix = tf.Variable([[1, 4, 6], [7, 6, 8], [2, 5, 10]])
These are a few examples of creating tensors in TensorFlow. There are other methods available, such as tf.constant. You can find more in the documentation. We will not go deeper into these topics because the rest of the article covers the most common techniques for preprocessing data, building models, tuning hyperparameters, and visualizing training with TensorBoard.
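As a minimal sketch of the difference between tf.constant and tf.Variable, and of how tensors relate to NumPy arrays, consider the following. It assumes TensorFlow 2.x and NumPy are installed; the variable names and values are illustrative only.

```python
import numpy as np
import tensorflow as tf

# tf.constant creates an immutable tensor; tf.Variable creates a mutable one.
scalar = tf.constant(60)
vector = tf.constant([1.5, 2.6, 3.9])
matrix = tf.Variable([[1, 4, 6], [7, 6, 8], [2, 5, 10]])

# ndim is the rank (number of dimensions) of a tensor.
print(scalar.ndim, vector.ndim, matrix.shape)

# A Variable can be updated in place; a constant cannot.
matrix[0, 0].assign(100)

# NumPy arrays convert to tensors and back.
array = np.arange(27).reshape(3, 3, 3)  # a 3-tensor ("cube of numbers")
cube = tf.convert_to_tensor(array)
back_to_numpy = cube.numpy()
```

Variables are typically used for values that change during training, such as model weights, while constants suit fixed inputs.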
Data comes in various forms, such as text, images, and audio. None of these can be fed into a neural network unless it is represented as numbers; text, for example, is a string type. We have to convert the data into numerical form in order to fit it into deep learning models. In this article, we will focus on image data. An image is already in the form of numbers: it is an array of pixels with values from 0 to 255, whether it is a grayscale image (black and white) or an RGB image.
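To make the idea of an image as an array of numbers concrete, here is a small sketch. The images are hypothetical random pixels rather than real photos, and the 28x28 size is an arbitrary assumption.

```python
import tensorflow as tf

# A hypothetical RGB image: random integer pixel values from 0 to 255.
random_pixels = tf.random.uniform((28, 28, 3), minval=0, maxval=256, dtype=tf.int32)
rgb_image = tf.cast(random_pixels, tf.uint8)

# Collapse the three color channels into a single grayscale channel.
gray_image = tf.image.rgb_to_grayscale(rgb_image)

print(rgb_image.shape)   # (height, width, 3 channels)
print(gray_image.shape)  # (height, width, 1 channel)
print(int(tf.reduce_min(rgb_image)), int(tf.reduce_max(rgb_image)))
```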
You can see from the grayscale image that it consists of numbers between 0 and 255. The lower the value, the darker the pixel, and vice versa. Images can be divided into grayscale and RGB. Most of the images that we see in the real world are RGB images. RGB is the abbreviation of Red, Green, and Blue. An RGB image has a value from 0 to 255 for each of the R, G, and B parts of every pixel. Most deep learning practitioners call these parts channels. There are various techniques that we have to apply before fitting our data into deep learning models, such as rescaling the data to the range 0 to 1 by dividing each pixel by 255 (normalization). This helps neural networks converge faster than they would on data without rescaling/normalization. In TensorFlow, we can rescale the data by using ImageDataGenerator or the preprocessing layers available in the framework. Furthermore, ImageDataGenerator can be used for data augmentation (note that its rotation_range argument is measured in degrees), as shown in the following code.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_datagen_augmented = ImageDataGenerator(rescale=1/255.,rotation_range=0.2,
For data augmentation, TensorFlow also provides one more way to augment the data, a set of preprocessing layers, as shown in the following code. You will understand them better when we apply this code to the Mosquitos on human skin dataset in a separate article. So, make sure you understand how TensorFlow helps augment the image data before feeding it into deep learning models. You can see that we flip the image horizontally, rotate the image by 20%, etc.
from tensorflow import keras
from tensorflow.keras.layers.experimental import preprocessing
data_augmentation = keras.Sequential([
#preprocessing.Rescaling(1./255) # keep for ResNet50V2, remove for EfficientNetB0
], name ="data_augmentation")
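The following is a minimal, self-contained sketch of an augmentation pipeline of this kind. It assumes TensorFlow 2.6 or newer, where the preprocessing layers live directly under tf.keras.layers; the specific layers, factors, and the random input batch are my own assumptions for illustration, not the exact configuration used in this article.

```python
import tensorflow as tf

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),  # mirror images left-right at random
    tf.keras.layers.RandomRotation(0.2),       # rotate by up to 0.2 * 2*pi radians
    tf.keras.layers.RandomZoom(0.2),           # zoom in or out by up to 20%
    tf.keras.layers.Rescaling(1.0 / 255),      # map pixels from [0, 255] to [0, 1]
], name="data_augmentation")

# A hypothetical batch of 4 random RGB "images".
images = tf.random.uniform((4, 64, 64, 3), minval=0, maxval=255)

# The random layers are only active when training=True.
augmented = data_augmentation(images, training=True)
print(augmented.shape)
```

Unlike the random layers, Rescaling is deterministic and is applied at both training and inference time.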
Before preprocessing the images (rescaling, augmenting, etc.), it is better to visualize the data in order to know what kind of images we are going to model. There are 2 ways to do this: using matplotlib or using TensorFlow.
from pylab import imread,subplot,imshow,show
import matplotlib.pyplot as plt
image = imread(target_image_url)  # choose target folder
import tensorflow as tf
import matplotlib.pyplot as plt
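As a rough sketch of the TensorFlow route, the example below reads an image file into a tensor and displays it with matplotlib. To stay self-contained it first writes a hypothetical random image to a temporary folder; the file names and the headless Agg backend are assumptions, and in a notebook you would point image_path at your own file and call plt.show() instead of saving.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import tensorflow as tf

# Create a hypothetical sample image on disk.
pixels = tf.cast(tf.random.uniform((64, 64, 3), maxval=256, dtype=tf.int32), tf.uint8)
output_dir = tempfile.mkdtemp()
image_path = os.path.join(output_dir, "sample.png")
tf.io.write_file(image_path, tf.io.encode_png(pixels))

# Read the file and decode it into a uint8 tensor of shape (height, width, channels).
image = tf.io.decode_image(tf.io.read_file(image_path), channels=3)

plt.imshow(image.numpy())
plt.title(f"Image shape: {tuple(image.shape)}")
plt.axis("off")
plt.savefig(os.path.join(output_dir, "preview.png"))
```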
Modeling in most deep learning frameworks can be done in 2 ways: implementing a model from scratch or carrying out transfer learning. Implementing from scratch means building a deep learning model by defining the layers needed and training it on however many images are available. Transfer learning takes a different approach. We reuse layers from a specific model architecture that has already been trained on a much larger dataset and apply them to our own dataset. This is helpful because training a neural network from scratch needs far more compute resources, which transfer learning can reduce.
The implementation of a model architecture from scratch is shown below.
model = tf.keras.models.Sequential([
You can see that this simple model architecture takes an input image of 224*224 with 3 RGB channels and produces 10 outputs. One thing to remember is to flatten the image before passing it to the fully connected hidden layers that follow. In practice, there are various ways to create a model architecture, namely the Sequential API, the Functional API, and model subclassing. The code above uses the Sequential API. We will go deeper into this in another article.
Transfer learning can be implemented in 2 ways, as shown in the following.
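For reference, a minimal Sequential model with the same input and output shapes might look like the sketch below. The choice and size of the convolutional and dense layers are my own assumptions for illustration; only the 224*224*3 input, the flatten step, and the 10 outputs come from the description above.

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),       # 224x224 RGB images
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),                        # flatten before the dense layers
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 output classes
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

The sparse categorical cross-entropy loss assumes integer class labels; one-hot labels would call for categorical_crossentropy instead.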
IMG_SHAPE = (224,224) + (3,)
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
Using TensorFlow Hub
IMAGE_SHAPE = (224, 224)
classifier = tf.keras.Sequential([
You can see that we use MobileNet version 2 to classify images. We set include_top=False to leave out the original classification layers, because our own dataset has different outputs. You can experiment with different model architectures by looking at the documentation, either in Transfer Learning Model Architecture or TensorFlow Hub.
When building deep learning models, there are a few hyperparameters that we have to set in order to get a better score on the metrics we define. They are the learning rate, the batch size, the activation functions (which introduce non-linearity), and the number of epochs. A learning rate that is too big makes the update steps overshoot, so training can diverge instead of converging to a minimum, while a learning rate that is too small makes training very slow and can leave it stuck before reaching a good minimum. To mitigate this, you can create the following TensorFlow callback and pass it in when fitting the data.
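A minimal sketch of the first approach, using the Keras Functional API, could look like this. NUM_CLASSES is a hypothetical value, and weights=None is used here only so the sketch runs without downloading anything; for actual transfer learning you would pass weights="imagenet".

```python
import tensorflow as tf

IMG_SHAPE = (224, 224, 3)
NUM_CLASSES = 10  # hypothetical number of classes in our own dataset

# include_top=False drops the original ImageNet classification head.
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights=None)
base_model.trainable = False  # freeze the feature extractor

inputs = tf.keras.Input(shape=IMG_SHAPE)
# MobileNetV2 expects pixel values scaled to [-1, 1].
x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1)(inputs)
x = base_model(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
```

With the base model frozen, only the new Dense layer is trained, which is why transfer learning needs far less compute than training the whole network.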
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss",
factor=0.2, #multiply the learning rate by 0.2 (reduce by 5x)
verbose=1, # print
Batch size also plays an important part in reaching convergence. A batch size of 32 is a commonly advised starting point for image data. For meaningful insights on batch size, including whether to use powers of 2, check this tweet from one of my favorite influencers in machine learning (the author of Machine Learning with PyTorch and Scikit-Learn).
The number of epochs is the number of times the whole training set is passed through the network. You can see that we use 5 epochs, so the model went over the training data 5 times, using forward and backward propagation to estimate the best weights based on the loss. The right number of epochs depends on many factors, including whether we are using transfer learning or training from scratch. We can use fewer epochs with transfer learning, because the model has already learned patterns from a larger dataset like ImageNet, and more epochs when training a model architecture from scratch.
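To show how the learning rate callback, batch size, and number of epochs fit together, here is a small sketch that trains a tiny model on hypothetical random data. The data, the model, and the patience and min_lr values are assumptions for illustration; only factor=0.2, monitor="val_loss", the batch size of 32, and the 5 epochs follow the text above.

```python
import numpy as np
import tensorflow as tf

# Hypothetical random data standing in for a real image dataset.
x_train = np.random.rand(64, 32, 32, 3).astype("float32")
y_train = np.random.randint(0, 10, size=(64,))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss",
    factor=0.2,    # multiply the learning rate by 0.2 when progress stalls
    patience=2,    # assumed value: wait 2 epochs without improvement
    min_lr=1e-7,   # assumed lower bound for the learning rate
    verbose=1,
)

history = model.fit(x_train, y_train,
                    batch_size=32,
                    epochs=5,
                    validation_split=0.25,
                    callbacks=[reduce_lr],
                    verbose=0)
```

The callback needs validation data to monitor val_loss, which is why validation_split is set here.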
TensorBoard is one of my favorite callbacks. It lets us visualize the training process recorded while fitting the data. Moreover, TensorFlow provides TensorBoard.dev so that we can share our visualizations in the cloud at tensorboard.dev. You can visualize many things: the accuracy and loss on training and validation data, a summary of the model architecture, the patterns learned in each layer, etc. I found this tool quite good. One thing to note is that you have to make sure the data is not private, because anything uploaded can be accessed by everyone. You can implement the TensorBoard callback and upload the visualization as shown in the following code.
import tensorflow as tf
import datetime
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
Upload to TensorBoard.dev
tensorboard dev upload --logdir logs \
--name "(optional) My latest experiment" \
--description "(optional) Simple comparison of several hyperparameters"
Do not forget to install TensorBoard with pip install -U tensorboard.
That is all for this pocket reference on building deep learning models for image data problems. Next, I will show real-world examples of building model architectures, both from scratch and using transfer learning, on the mosquito on human skin dataset. You can find them in the second article, as follows
Thank you for reading!
I really appreciate it! 🤗 If you liked the post and would like to see more, consider following me. I post topics related to machine learning and deep learning. I try to keep my posts simple but precise, always providing visualizations and simulations.
Josua Naiborhu is a business development analyst who turned into a self-taught Machine Learning Engineer. His interests include statistical learning, predictive modeling, and interpretable machine learning. He loves running, which has taught him not to give up on anything, even when implementing the Machine Learning Lifecycle (MLOps). Apart from pursuing his passion for machine learning, he is keen on investing in the Indonesian Stock Exchange and cryptocurrency. He ran full marathons at the Jakarta Marathon in 2015 and the Osaka Marathon in 2019. His next dreams are to run the Boston Marathon, the TCS New York City Marathon, and the Virgin Money London Marathon.
You can connect with him on LinkedIn, Twitter, Github, Kaggle, or reach out to him directly on his personal website.