“A Beginner’s Guide to Image Classification with Keras and TensorFlow”


Computer Vision Project: 001

Image classification is a fundamental task in computer vision that involves assigning a label or category to an image based on its content. In this article, we will explore how to perform image classification using Keras and TensorFlow, two popular deep learning libraries. We will walk through the process step by step, from data preparation to model building, training, and evaluation. We will use a pretrained model from Keras Applications, since the goal here is a practical walkthrough rather than research.

Setting up Colab

We will be using Google Colab, connected to a GPU runtime instead of the CPU. I used a T4 GPU.

Since Keras can be accessed directly from TensorFlow, let’s first import both in Colab and check the TensorFlow version.

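!nvidia-smi  # check which GPU is attached to the runtime
import tensorflow as tf
import keras
print(tf.__version__)  # print() so the version shows wherever the line sits in the cell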

This code checks whether a GPU is available and then imports necessary libraries like TensorFlow and Keras.
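
If you would rather confirm the GPU from Python than read the nvidia-smi table, a minimal sketch along these lines may help. The helper name describe_gpus is made up for this article; the TensorFlow call it relies on is tf.config.list_physical_devices.

```python
import importlib.util


def describe_gpus():
    """Return a short message about the GPUs TensorFlow can see.

    describe_gpus is a hypothetical helper for this article, not a TensorFlow API.
    """
    # Skip the import cleanly if TensorFlow is missing (for example outside Colab).
    if importlib.util.find_spec("tensorflow") is None:
        return "TensorFlow is not installed in this environment."

    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        return "No GPU visible; check the runtime type in Colab."
    return f"{len(gpus)} GPU(s) visible: " + ", ".join(gpu.name for gpu in gpus)


print(describe_gpus())
```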

1. Data Preparation:

Before diving into model building, we need to prepare our dataset. This typically involves collecting and preprocessing images. In our case, we’ll be working with a dataset containing images of cats and dogs. We’ll organize the data into training and validation sets and resize the images to a standard size.

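Here we are using the Kaggle dataset Cats and Dogs Image Classification Dataset. The dataset should be arranged as a Data folder containing a train folder and a validation folder, each with one subfolder per class (for example train/cats and train/dogs).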

After arranging the dataset in this structure, upload the Data folder to Google Drive and mount the drive in Colab.
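
Mounting the drive can be sketched as below. The try/except guard is an addition for this example so the cell also runs outside Colab; inside Colab, drive.mount asks you to authorize access to your Google Drive.

```python
import os

try:
    # google.colab is only importable inside a Colab runtime.
    from google.colab import drive

    drive.mount("/content/drive")  # prompts for authorization on first use
    in_colab = True
except ImportError:
    in_colab = False

print("Running in Colab:", in_colab)
print("Drive folder present:", os.path.isdir("/content/drive/MyDrive"))
```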

If we now check the present working directory, it should be /content, so we change the directory to our dataset folder.

# Comments sit on their own lines because a line magic such as %cd may read trailing text as part of its argument
%pwd
# /content
%cd /content/drive/MyDrive/Data
%pwd
# /content/drive/MyDrive/Data
%ls

# Unzip the dataset archive
!unzip Data.zip
%ls
# train/ validation/
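
Before going further, it can be worth confirming that every class folder actually contains images. The sketch below is one way to do that with the standard library. count_images and IMAGE_EXTENSIONS are hypothetical names introduced here, and the layout root/split/class/image files is assumed from the folder structure described above.

```python
from pathlib import Path

# File extensions treated as images in this sketch (an assumption, adjust as needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(root="."):
    """Count image files per split and class under root.

    Assumes the layout root/<split>/<class>/<image files>.
    """
    counts = {}
    for split in ("train", "validation"):
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            continue  # a missing split is skipped rather than treated as an error
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            n_images = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
            counts[f"{split}/{class_dir.name}"] = n_images
    return counts


print(count_images("."))
```

A class with a count of zero, or a very uneven split between classes, is easier to fix now than after a long training run.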

Importing Libraries and Modules

#importing libraries and packages
from tensorflow.keras.layers import Input, Lambda, Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img
from tensorflow.keras.models import Sequential
import numpy as np
from glob import glob

Defining Image Size and Paths:

image from https://keras.io/api/applications/

We will use the pretrained VGG16 model in this exercise. The image size is defined as [224, 224], which is the default input size for VGG16 and a common choice for many pretrained models. The paths to the training and validation data are also specified.

IMAGE_SIZE = [224, 224]

train_path = "train"
valid_path = "validation"
vgg16 = VGG16(input_shape=IMAGE_SIZE + [3], weights="imagenet", include_top=False)

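VGG16 is loaded with pretrained ImageNet weights. Setting include_top=False leaves out the fully connected classifier layers at the top of the original network, which were built for the 1000 ImageNet classes, because we are going to add our own dense layers for cats and dogs.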

2. Model Building:

Freezing Layers of VGG16:

All layers of the VGG16 model are set to non-trainable to prevent their weights from being updated during training.

# Freeze every layer of the pretrained base
for layer in vgg16.layers:
    layer.trainable = False

# Confirm that each layer is now non-trainable
for layer in vgg16.layers:
    print(layer.name, layer.trainable)

Now let’s check the summary of the model:

vgg16.summary()

Adding Custom Dense Layers:

Custom dense layers are added on top of the VGG16 base.

folder = glob("train/*")
folder       # ['train/cats', 'train/dogs']
len(folder)  # 2, the number of classes
model = Sequential()
model.add(vgg16)
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.summary()  # Trainable params: 6423298 (24.50 MB)
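
The trainable parameter count reported by the summary can be reproduced by hand, which is a useful sanity check that only the new dense layers are being trained. The sketch below assumes the standard VGG16 layout (five 2x2 max-pooling stages and 512 channels in the last convolutional block) and 4 bytes per float32 parameter.

```python
# Each of VGG16's five max-pooling stages halves the height and width.
side = 224 // 2**5               # 7
flattened = side * side * 512    # 7 * 7 * 512 features after Flatten

# A Dense layer has (inputs * units) weights plus one bias per unit.
dense_256 = flattened * 256 + 256
dense_2 = 256 * 2 + 2

trainable = dense_256 + dense_2
size_mb = trainable * 4 / 1024**2  # assuming float32 parameters

print(flattened)          # 25088
print(trainable)          # 6423298
print(round(size_mb, 2))  # 24.5
```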

Model Compilation:

The model is compiled with a loss function, optimizer, and evaluation metrics.

model.compile(
    loss="categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)
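
To get a feel for what the categorical cross-entropy loss measures, here is a small NumPy sketch. It is an illustration of the formula, not the Keras implementation, and the labels and predicted probabilities are made-up values.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)


# Two hypothetical samples: the first is a cat (index 0), the second a dog (index 1).
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.9, 0.1], [0.4, 0.6]])

losses = categorical_crossentropy(y_true, y_pred)
print(losses)  # roughly [0.105, 0.511]
```

The confident, correct prediction (0.9) is penalized far less than the hesitant one (0.6), which is the behavior the optimizer exploits during training.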

Image Data Augmentation:

ImageDataGenerator is used to perform real-time data augmentation on the training images.

# Augment only the training images
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)

# Validation images are only rescaled, never augmented
test_datagen = ImageDataGenerator(rescale=1./255)
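
Two of these settings are easy to picture on a toy array. The sketch below uses plain NumPy, not ImageDataGenerator itself, to show what rescaling by 1/255 and a horizontal flip do to pixel data. The tiny 2x3 image is invented for the example.

```python
import numpy as np

# A tiny 2x3 RGB "image" with pixel values between 0 and 255.
img = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3) * 14

rescaled = img * (1.0 / 255)    # rescale=1./255 maps pixels into the 0..1 range
flipped = rescaled[:, ::-1, :]  # a horizontal flip mirrors the width axis

print(rescaled.min(), rescaled.max())
# The first column of the flipped image is the last column of the original.
print(np.array_equal(flipped[:, 0, :], rescaled[:, -1, :]))  # True
```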

Loading Training and Validation Data:

Training and validation data are loaded with the flow_from_directory method, which resizes every image to the target size and yields batches of the given batch size.

training_set = train_datagen.flow_from_directory(
    "train",
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical"
)
test_set = test_datagen.flow_from_directory(
    "validation",
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical"
)
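
flow_from_directory infers the classes from the subfolder names and assigns label indices in alphanumeric order. The sketch below imitates that rule with the standard library so you can see where the indices come from. infer_class_indices is a hypothetical helper, and the temporary folders stand in for the real dataset. On the real generator, the mapping is available as training_set.class_indices.

```python
import os
import tempfile


def infer_class_indices(directory):
    """Map each class subfolder to an index, in sorted (alphanumeric) order."""
    class_names = sorted(
        name
        for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(class_names)}


# Build a throwaway folder layout; creation order does not affect the result.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("dogs", "cats"):
        os.makedirs(os.path.join(tmp, name))
    mapping = infer_class_indices(tmp)

print(mapping)  # {'cats': 0, 'dogs': 1}
```

This ordering is why index 1 is read as "dog" when interpreting predictions later on.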

For image classification, we are using a pretrained VGG16 model as the base and adding custom dense layers on top. VGG16 is a deep convolutional neural network with 16 weight layers, and the features it learned on ImageNet transfer well to many other image classification tasks.


Model Training:

With the compiled model, we can start the training process. We’ll train the model using the training dataset and evaluate its performance on the validation dataset.

history = model.fit(
    training_set,
    validation_data=test_set,
    epochs=50,
    steps_per_epoch=len(training_set),
    validation_steps=len(test_set)
)
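
The values passed as steps_per_epoch and validation_steps are simply the number of batches needed to see every image once. A quick sketch of that arithmetic follows; the image counts are hypothetical placeholders, not the sizes of the dataset used here.

```python
import math


def steps_for(n_images, batch_size):
    """Number of batches needed to cover n_images once (the last batch may be smaller)."""
    return math.ceil(n_images / batch_size)


# Hypothetical image counts, used only to illustrate the calculation.
print(steps_for(557, 32))  # 18
print(steps_for(140, 32))  # 5
```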

Evaluation and Visualization:

After training, we can visualize the training history to analyze the model’s performance in terms of loss and accuracy over epochs.

import matplotlib.pyplot as plt

# Loss curves; save before plt.show(), otherwise the saved file can come out blank
plt.plot(history.history["loss"], label="train loss")
plt.plot(history.history["val_loss"], label="val loss")
plt.legend()
plt.savefig("loss_graph.png")
plt.show()

# Accuracy curves
plt.plot(history.history["accuracy"], label="train accuracy")
plt.plot(history.history["val_accuracy"], label="val accuracy")
plt.legend()
plt.savefig("accuracy_graph.png")
plt.show()

(Figures: loss graph and accuracy graph)
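
Besides plotting, the history dictionary can be inspected directly, for example to find the epoch where validation loss was lowest. A minimal sketch follows. best_epoch is a hypothetical helper, and the numbers are invented to show the shape of history.history, not results from this training run.

```python
# Invented values shaped like history.history; not results from this article.
example_history = {
    "loss": [0.62, 0.41, 0.30, 0.22, 0.15],
    "val_loss": [0.58, 0.45, 0.43, 0.47, 0.52],
}


def best_epoch(history_dict, key="val_loss"):
    """Return (1-based epoch, value) where the chosen metric is lowest."""
    values = history_dict[key]
    best_index = min(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


epoch, value = best_epoch(example_history)
print(epoch, value)  # 3 0.43
```

When training loss keeps falling while validation loss rises after its best epoch, as in these made-up numbers, the model is likely starting to overfit.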

Saving and Loading the Model:

After training the model, it’s crucial to save it so that it can be reused later without having to retrain from scratch. We can save the model using the save method and load it using the load_model function.

# Save the trained model
model.save("model_vgg16.h5")

# Load the saved model
from tensorflow.keras.models import load_model
model = load_model("model_vgg16.h5")

Making Predictions on New Images:

Once the model is loaded, we can use it to make predictions on new, unseen images. Here’s how you can load a test image, pre-process it, and make a prediction.

from tensorflow.keras.preprocessing import image
import numpy as np

# Load and preprocess the test image
image_name = "cat_399.jpg"
img = image.load_img(image_name, target_size=(224, 224))
x = image.img_to_array(img)
x = x / 255  # Normalize the image, matching the rescale used in training
x = np.expand_dims(x, axis=0)  # Add batch dimension

# Make prediction
prediction = model.predict(x)
predicted_class = np.argmax(prediction)

# Display the prediction (class folders are indexed alphabetically: cats = 0, dogs = 1)
if predicted_class == 1:
    print("Predicted: Dog")
else:
    print("Predicted: Cat")
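
Rather than hard-coding which index means which animal, the prediction can be decoded through the class mapping. The sketch below assumes the mapping {'cats': 0, 'dogs': 1} and uses a made-up softmax output in place of a real model.predict result.

```python
import numpy as np

# Assumed mapping; on the real generator, read it from training_set.class_indices.
class_indices = {"cats": 0, "dogs": 1}
index_to_label = {index: name for name, index in class_indices.items()}

# Hypothetical softmax output for one image (batch of size 1).
prediction = np.array([[0.08, 0.92]])

predicted_index = int(np.argmax(prediction, axis=1)[0])
confidence = float(prediction[0, predicted_index])

print(f"Predicted: {index_to_label[predicted_index]} ({confidence:.2f})")
# Predicted: dogs (0.92)
```

Reporting the probability alongside the label makes borderline predictions, those close to 0.5, easy to spot.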

Conclusion: In this article, we have covered the basics of image classification using Keras and TensorFlow. We started with data preparation, followed by model building, training, and evaluation. By leveraging pre-trained models like VGG16 and the powerful APIs provided by Keras and TensorFlow, even beginners can build and train sophisticated image classification models. With further exploration and experimentation, you can extend this knowledge to tackle more complex image recognition tasks. Happy coding!


DIY Coding (Do It Yourself) by Arsha
