Image Classification with TensorFlow

Düzgün İlaslan
10 min read · Jul 23, 2023


The aim of this article is to reinforce the subject by building a working application.

Before moving on to the implementation, let’s go over a few definitions. If you are already familiar with them, you can skip ahead; if not, they will be helpful background.

What is TensorFlow?

Whether you work in machine learning or are simply an AI enthusiast, you have probably heard of TensorFlow. It is among the most popular solutions for machine learning and deep learning work.

TensorFlow is an open source library for deep learning. The Google Brain Team originally created TensorFlow to perform large numerical computations, so it wasn’t built specifically for deep learning. However, it soon became clear that TensorFlow was well suited to deep learning applications, and it has since been released as open source.


TensorFlow is basically a low-level toolkit for doing complex math operations. It is aimed at researchers who know what they’re doing and want to build, play with, and translate experimental learning architectures into working software. With TensorFlow you can easily train and run deep neural networks for various ML applications. These include word embeddings, handwritten digit classification, recurrent neural networks, image recognition, natural language processing, and partial differential equation simulations.

Of course, TensorFlow cannot be fully explained so briefly, but this is enough for our article.

What is Image Classification?

Image classification is the task of assigning an image, or each pixel of an image, to a class. In remote sensing, the class of each pixel is found by applying a mathematical operation to that pixel’s values in different channels (bands), generally relying on the fact that objects reflect light differently across the spectrum. These categorized data can then be used to produce thematic maps of the land cover contained in an image. In this article, we classify whole images: each flower photo gets a single label.


NOTE: I will talk about CNNs in more detail in a future article.

A Convolutional Neural Network (CNN) is the most commonly used method for image classification. We will also use a CNN in our application.

Now let’s look at TensorFlow’s image classification example.

Application Part

Our aim here will be the classification of flowers. We will use a previously prepared data set.

The workflow will be as follows:

1- First examine and understand the dataset
2- Build an input pipeline
3- Create the model
4- Train the model
5- Test the model
6- Refine the model and repeat the process

Import TensorFlow and other necessary libraries:

import matplotlib.pyplot as plt
import numpy as np
import PIL
#Tensorflow libs
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

1- First examine and understand the dataset

The dataset is available at: https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz

import pathlib
# dataset URL
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
# download and extract the archive with Keras utils
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
# wrap the dataset path in a pathlib.Path
data_dir = pathlib.Path(data_dir)

The extracted “flower_photos” directory contains five subdirectories, one per class:

flower_photos/
  daisy/
  dandelion/
  roses/
  sunflowers/
  tulips/

To check the number of images downloaded:

#get image count
image_count = len(list(data_dir.glob('*/*.jpg')))
print("Total Images:",image_count)
>> Total Images: 3670
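
To see how those images are spread across the classes, a per-class count can help. Below is a minimal sketch that uses only the standard library. The helper name count_images_per_class is my own and is not part of TensorFlow, and the demo builds a small throwaway directory with empty placeholder files so that it runs without downloading anything. The counts it prints are placeholders, not the real dataset sizes.

```python
import pathlib
import tempfile


def count_images_per_class(root):
    """Return {class_name: number of .jpg files} for each subdirectory of root."""
    root = pathlib.Path(root)
    return {
        class_dir.name: len(list(class_dir.glob("*.jpg")))
        for class_dir in sorted(root.iterdir())
        if class_dir.is_dir()
    }


# Demo on a throwaway directory that mimics the flower_photos layout.
# The file counts here are placeholders, not the real dataset sizes.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = pathlib.Path(tmp)
    for name, n_files in [("daisy", 2), ("roses", 3)]:
        (demo_root / name).mkdir()
        for i in range(n_files):
            (demo_root / name / f"img_{i}.jpg").touch()
    demo_counts = count_images_per_class(demo_root)

print(demo_counts)
```

With the real dataset you would call count_images_per_class(data_dir) instead; the values should add up to the total printed above.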

Now show a sample rose image:

# get all images in the roses directory
roses = list(data_dir.glob('roses/*'))
# show the fourth image (index 3)
PIL.Image.open(str(roses[3]))

Now show a sample tulip image:

# get all images in the tulips directory
tulips = list(data_dir.glob('tulips/*'))
# show the fourth image (index 3)
PIL.Image.open(str(tulips[3]))

Load and prepare dataset using Keras

We will use the tf.keras.utils.image_dataset_from_directory utility to load the dataset from a directory. As you know, we have just downloaded the dataset to its own directory.

Okay, let’s define some parameters for the loader:

# Batch size
batch_size = 32
# image height
img_height = 180
# image width
img_width = 180

Use 80% of the images for training and 20% for validation.

We will create two separate sets for Train and Validation.

For train:

# split dataset for train
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)


>>Found 3670 files belonging to 5 classes.
>>Using 2936 files for training.

For Validation:

# split dataset for validation
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

>> Found 3670 files belonging to 5 classes.
>> Using 734 files for validation.

As can be seen, 2936 images were reserved for training and 734 images for validation.
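
The numbers in the log output can be checked with a little arithmetic. The sketch below is plain Python and assumes that the validation count is the split fraction of the total, rounded down; that assumption reproduces the logged values. It also works out how many batches each dataset yields per epoch.

```python
total_images = 3670      # reported by image_dataset_from_directory above
validation_split = 0.2
batch_size = 32

# Assumption: the validation count is the split fraction rounded down.
val_count = int(validation_split * total_images)
train_count = total_images - val_count

# Ceiling division: the last batch may be smaller than batch_size.
train_batches = -(-train_count // batch_size)
val_batches = -(-val_count // batch_size)

print(train_count, val_count)      # 2936 734
print(train_batches, val_batches)  # 92 23
```

So one training epoch goes through 92 batches, and the last of them holds fewer than 32 images.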

We can also look at the class names, which are listed in alphabetical order.

#get class names
class_names = train_ds.class_names
print(class_names)

>> ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']
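
The position of a name in this list is also the integer label that the dataset assigns to images of that class. A small sketch of that mapping in plain Python follows; the variable names label_to_name and name_to_label are my own, chosen only for illustration.

```python
# Class folders as they appear in the dataset, deliberately out of order.
folder_names = ["tulips", "daisy", "sunflowers", "roses", "dandelion"]

# Sorting reproduces the order printed by train_ds.class_names above.
class_names = sorted(folder_names)

# The integer label of each image is the position of its class in that list.
label_to_name = dict(enumerate(class_names))
name_to_label = {name: label for label, name in label_to_name.items()}

print(label_to_name[3])        # sunflowers
print(name_to_label["roses"])  # 2
```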

After splitting the dataset, we can now visualize a few samples from it.

# set the figure size to 10 by 10 inches
plt.figure(figsize=(10, 10))
# take one batch from train_ds and plot its first nine images
for images, labels in train_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        # plot image using imshow
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")

Now check the shapes of the batches:

for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break

>> (32, 180, 180, 3)
>> (32,)

The image_batch is a tensor of the shape (32, 180, 180, 3). This is a batch of 32 images of shape 180x180x3 (the last dimension refers to the RGB color channels). The labels_batch is a tensor of the shape (32,); these are the labels corresponding to the 32 images.
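
If the four dimensions feel abstract, it may help to slice a stand-in batch. The sketch below uses NumPy arrays filled with zeros rather than real TensorFlow tensors, so it only illustrates the layout (batch, height, width, channels).

```python
import numpy as np

batch_size, img_height, img_width = 32, 180, 180

# Stand-in batch filled with zeros, with the same layout Keras produces:
# (batch, height, width, channels).
image_batch = np.zeros((batch_size, img_height, img_width, 3), dtype=np.float32)
labels_batch = np.zeros((batch_size,), dtype=np.int32)

first_image = image_batch[0]       # one image: (180, 180, 3)
red_channel = image_batch[..., 0]  # one channel of every image: (32, 180, 180)

print(image_batch.shape, labels_batch.shape)
print(first_image.shape, red_channel.shape)
```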

This part is important: use buffered prefetching so that you can yield data from disk without having I/O become blocking. These are two important methods you should use when loading data:

  • Dataset.cache keeps the images in memory after they're loaded off disk during the first epoch. This will ensure the dataset does not become a bottleneck while training your model. If your dataset is too large to fit into memory, you can also use this method to create a performant on-disk cache.
  • Dataset.prefetch overlaps data preprocessing and model execution while training.

# AUTOTUNE lets tf.data choose the prefetch buffer size dynamically
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
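
To build some intuition for what caching buys us, here is a conceptual stand-in written in plain Python. It is not how tf.data is implemented, and the names CachedLoader and fake_load are hypothetical. It only shows the idea behind Dataset.cache: the expensive load happens during the first pass, and later epochs reuse the results from memory. Prefetching, which overlaps loading with training, is not covered by this sketch.

```python
class CachedLoader:
    """Conceptual stand-in for Dataset.cache: load once, then reuse."""

    def __init__(self, paths, load_fn):
        self._paths = list(paths)
        self._load_fn = load_fn
        self._cache = None  # filled during the first full pass

    def __iter__(self):
        if self._cache is not None:
            # Later epochs: serve items straight from memory.
            yield from self._cache
            return
        loaded = []
        for path in self._paths:
            item = self._load_fn(path)  # the expensive "disk read"
            loaded.append(item)
            yield item
        self._cache = loaded  # only set after a complete pass


disk_reads = []


def fake_load(path):
    """Pretend to read a file; records each call so we can count reads."""
    disk_reads.append(path)
    return f"pixels of {path}"


loader = CachedLoader(["a.jpg", "b.jpg", "c.jpg"], fake_load)
for epoch in range(3):
    items = list(loader)

print(len(disk_reads))  # 3: each file was read once, not once per epoch
```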

Standardize the images

RGB channel values are in the range [0, 255]. This is not ideal for a neural network; in general, you should keep your input values small so that training is easier and more stable.
Based on this, we will standardize the values to the range [0, 1] using tf.keras.layers.Rescaling:

# Create a normalization layer
normalization_layer = layers.Rescaling(1./255)
# normalize the dataset with the normalization layer
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
# separate one batch into images and labels
image_batch, labels_batch = next(iter(normalized_ds))
# check the min and max values of the first image
first_image = image_batch[0]
# Notice the pixel values are now in `[0,1]`.
print("minimum value:", np.min(first_image), "maximum value:", np.max(first_image))

>> minimum value: 0.0 maximum value: 0.99669564
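
The arithmetic behind the rescaling is just a multiplication by 1/255. A minimal NumPy sketch on a tiny made-up array (not a real image from the dataset) shows the effect:

```python
import numpy as np

# A tiny made-up "image" covering the full 8-bit range.
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Same arithmetic as Rescaling(1./255): multiply every value by 1/255.
scaled = pixels.astype(np.float32) * (1.0 / 255)

print(scaled.min(), scaled.max())
```

The smallest value stays at 0 and the largest lands at (approximately) 1, so every pixel ends up in the range [0, 1].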

Create a Model:

Now we will build the model, which is a simple CNN.


The Keras Sequential model consists of three convolution blocks (tf.keras.layers.Conv2D) with a max pooling layer (tf.keras.layers.MaxPooling2D) in each of them. There's a fully-connected layer (tf.keras.layers.Dense) with 128 units on top of it that is activated by a ReLU activation function ('relu'). This model has not been tuned for high accuracy; the goal of this tutorial is to show a standard approach.

# get the number of classes
num_classes = len(class_names)

# build model layers
model = Sequential([
    layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)
])

Compile the model

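For this tutorial, we will use tf.keras.optimizers.Adam as the optimizer and tf.keras.losses.SparseCategoricalCrossentropy as the loss function. Because the last Dense layer has no activation, the model outputs raw scores (logits), which is why from_logits=True is set in the code below. To view training and validation accuracy for each training epoch, pass the metrics argument to Model.compile.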

# compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
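
To make the loss a little less of a black box, here is a rough NumPy sketch of what sparse categorical cross-entropy computes when it is given logits: apply softmax, pick the probability of the true class, and average the negative logarithm. This is my own illustration with made-up numbers, not the Keras implementation.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores (logits) into probabilities, row by row."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def sparse_categorical_crossentropy(labels, logits):
    """Mean negative log-probability of the true class (labels are integers)."""
    probs = softmax(logits)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.log(true_class_probs).mean())


# Two made-up examples with 5 classes each.
logits = np.array([[0.0, 0.0, 0.0, 0.0, 0.0],    # model has no preference
                   [0.0, 0.0, 10.0, 0.0, 0.0]])  # model strongly prefers class 2
labels = np.array([3, 2])

print(sparse_categorical_crossentropy(labels, logits))
```

A model with no preference among five classes gets a loss of ln(5), about 1.61, for that example, while a confident and correct prediction contributes a loss close to zero.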

Check the model summary

model.summary()

The input shape of the model is 180x180 with 3 channels. The total number of parameters in the model is 3,989,285.
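
That total can be reproduced by hand. The sketch below is plain Python with helper names of my own choosing. It assumes the layer settings used above: 3x3 kernels, 'same' padding (so convolutions keep the spatial size), and max pooling that halves the size while rounding down.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights (k * k * in * out) plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weights (in * out) plus one bias per output unit."""
    return in_units * out_units + out_units


size = 180                      # input height and width
conv_total = 0
in_channels = 3                 # RGB
for filters in (16, 32, 64):
    conv_total += conv2d_params(3, in_channels, filters)
    in_channels = filters
    size //= 2                  # each MaxPooling2D halves the size (rounding down)

flattened = size * size * in_channels          # 22 * 22 * 64
total = conv_total + dense_params(flattened, 128) + dense_params(128, 5)

print(size, flattened)  # 22 30976
print(total)            # 3989285
```

Almost all of the parameters sit in the first Dense layer, which connects the 30,976 flattened values to 128 units.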

Okay, now we can train the model

Train the model for 10 epochs with the Keras Model.fit method:

epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

Now visualize the result of the model

Create plots of the loss and accuracy on the training and validation sets:

#get train accuracy in history
acc = history.history['accuracy']
#get validation accuracy
val_acc = history.history['val_accuracy']
# get train loss
loss = history.history['loss']
#get validation loss
val_loss = history.history['val_loss']

epochs_range = range(epochs)
# plot accuracy
plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
# plot loss
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

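While the training accuracy is around 91%, the validation accuracy is only around 51%. A gap this large is a sign of overfitting: the model has memorized the training images rather than learned features that generalize.

What we should take away from this is that building a model and feeding it data is not, by itself, enough to get the best model.

Let’s increase the model accuracy

How?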

Data Augmentation

Data augmentation is a technique for artificially enlarging the training set by creating modified copies of existing data. It involves making minor changes to the existing samples, or using deep learning to generate new data points. This exposes the model to more aspects of the data and helps it generalize better.

You will implement data augmentation using the following Keras preprocessing layers: tf.keras.layers.RandomFlip, tf.keras.layers.RandomRotation, and tf.keras.layers.RandomZoom. These can be included inside your model like other layers, and run on the GPU.

# create the augmentation pipeline
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal",
                          input_shape=(img_height,
                                       img_width,
                                       3)),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ]
)

Let’s check the augmentation pipeline and see how it produces new images from a single one:

plt.figure(figsize=(10, 10))
# take one batch from the training dataset
for images, _ in train_ds.take(1):
    for i in range(9):
        # apply the augmentation; each call draws new random transforms
        augmented_images = data_augmentation(images)
        ax = plt.subplot(3, 3, i + 1)
        # show the first image of the augmented batch
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")

As you can see, new images were created by passing the same picture through the augmentation layers in the order we defined.
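
To see what the simplest of these transforms does to the pixel values, here is a small NumPy sketch of a random horizontal flip on a tiny made-up array. The function random_horizontal_flip is my own illustration and is not the Keras layer. Rotation and zoom are left out because they need interpolation.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# A tiny made-up 2x3 single-channel "image" so the effect is easy to read.
image = np.array([[1, 2, 3],
                  [4, 5, 6]])


def random_horizontal_flip(img, rng):
    """Mirror the image left-to-right with probability 0.5."""
    return img[:, ::-1] if rng.random() < 0.5 else img


flipped = image[:, ::-1]
print(flipped)
# [[3 2 1]
#  [6 5 4]]

# Each call may or may not flip, so repeated calls yield a mix of both versions.
variants = [random_horizontal_flip(image, rng) for _ in range(8)]
```

The label of the image does not change: a mirrored rose is still a rose, which is why such copies are useful extra training examples.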

Now, let’s get acquainted with Dropout

Dropout was described in the 2014 paper Dropout: A Simple Way to Prevent Neural Networks from Overfitting. In summary, it is the technique of removing certain nodes from the hidden or input layer according to a certain rule (using a threshold value or randomly). For example, when dropout is applied with a threshold of 0.5, roughly half of the nodes in the hidden layer where dropout is applied are removed, so only about half of them pass their output to the next layer.


When you apply dropout to a layer, it randomly drops out (by setting the activation to zero) a number of output units from the layer during the training process. Dropout takes a fractional number as its input value, in the form such as 0.1, 0.2, 0.4, etc. This means dropping out 10%, 20% or 40% of the output units randomly from the applied layer.
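
The mechanics can be sketched in a few lines of NumPy. This is my own illustration, not the Keras source: each unit is zeroed with probability rate, and the surviving units are scaled up by 1 / (1 - rate) so that the expected sum of the activations stays the same, which is also how the Keras Dropout layer is documented to behave during training.

```python
import numpy as np


def dropout(activations, rate, rng):
    """Zero each unit with probability `rate`; scale survivors by 1 / (1 - rate)."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)


rng = np.random.default_rng(seed=42)
activations = np.ones(10_000)

dropped = dropout(activations, rate=0.2, rng=rng)

print(np.unique(dropped))     # two distinct values: 0 and about 1.25
print((dropped == 0).mean())  # close to 0.2
print(dropped.mean())         # close to 1.0, the original mean
```

At prediction time no units are dropped and no scaling is applied, so the layer simply passes its input through.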

Create a new neural network with tf.keras.layers.Dropout before training it using the augmented images:

model = Sequential([
    data_augmentation,
    layers.Rescaling(1./255),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes, name="outputs")
])

Compile and train the model again

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
epochs = 15
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

Model results:

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

After applying data augmentation and Dropout, our model reaches a more acceptable validation accuracy. Methods like these help improve your model.

Now for the important part: let’s test the model we developed by making a prediction on a new image.

# image url
sunflower_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg"
# download image
sunflower_path = tf.keras.utils.get_file('Red_sunflower', origin=sunflower_url)
# load image and resize it to the model's input size
img = tf.keras.utils.load_img(
    sunflower_path, target_size=(img_height, img_width)
)
# image to array
img_array = tf.keras.utils.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch

# make prediction (the model outputs logits)
predictions = model.predict(img_array)
# convert logits to probabilities
score = tf.nn.softmax(predictions[0])

print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)
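
The last two steps, turning the model output into scores and picking a class, can be tried without a trained model. The sketch below uses NumPy and hypothetical logits that I made up for illustration; a real call to model.predict will return different numbers. It also ranks all five classes instead of reporting only the best one.

```python
import numpy as np

class_names = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

# Hypothetical logits for one image; a real run of model.predict will differ.
logits = np.array([0.5, 1.0, -0.5, 4.0, 0.0])

# Softmax: exponentiate and normalize so the scores sum to 1.
exp = np.exp(logits - logits.max())
score = exp / exp.sum()

# Index of the highest score, and all classes ordered from most to least likely.
best = int(np.argmax(score))
ranking = [class_names[i] for i in np.argsort(score)[::-1]]

print(class_names[best])  # sunflowers
print(ranking)            # ['sunflowers', 'dandelion', 'daisy', 'tulips', 'roses']
```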

As you can see, it is that simple.
We built a model in a very short time, applied methods to improve it, and finally tested it by making predictions.

Sources

https://www.tensorflow.org/tutorials/images/classification
