CNN, John’s Life Saver! (Kitchenware Image Classification Project)

20 min read · Jan 25, 2023

Kitchenware Image Classification Project Using Convolutional Neural Networks

Imagine a world where nature has decreed that no creature shall live by water and food! In this world, the essence of water and food has no significance. No more screaming “Judy” for food. Perhaps we wouldn’t bother stressing those astute professors to add the word “hunger” to the Oxford dictionary. Every being lives life chilling outside the confines of restaurants and bars. We save a finger for Chef Brown when there’s no need to cut onions. The anarchy that comes with the life of water and food is buried. This saves our ears from the noise of “I’m destined to be a Cook”. Everyone lives peacefully outside the pandemonium of aroma. These can only live in our imagination. And imagination birthed the world of art.

Back to the life the Creator has bestowed upon us, outside the seconds spent in the spiritual realm: a life where activities surrounding hunger and chilling rake in trillions of dollars. One such activity evolved into sub-activities, each of which became its own project. Follow me as I walk you through the part I executed.

Project Introduction (Here, I depicted John’s situation that spurs innovation)

You might be wondering how John strolls elegantly into the picture of a Data Science project. Rightly so, it’s a thing to wonder! Permit me to disabuse your mind and put things in the right perspective. John is a fictional character. I envisioned a situation that led to innovation, and John is that situation married to the Data Science project I intend to explain to you. Let’s proceed to get a grasp of John’s brief history. A metaverse world at work!

The man, John.

John is a young man who, while growing up, never lifted a finger to do any house chores. Everything was taken care of by his family's domestic workers. This made him “NOT YOUR GUY” when it comes to house chores. It’s so bad that John struggles to identify kitchen items. Is this how John wants to live for the rest of his life? Will there be any technology available to fill the gap when John decides to start living life ALONE? Those questions beg for answers.

While he was growing up, people had a “NOT SO GOOD” perception of John. They concluded that nothing would be added to his name as a personal achievement aside from his family inheritance. Even his parents had written him off. As he lived his carefree life, these unflattering perceptions reached his ears. At first, he thought they amounted to nothing. But as time went on, the actions of the people around him recalibrated his life. He vowed to disassociate himself from the things handed to him by his parents, thus changing the course of his life.

Many years later, after much hassle, John is an achiever. “One thing you cannot take away from us as human beings is our resolve to show the affluence of being successful." John is no exception. He decided to buy a spacious house and show his achievements to the people who had written him off. The first minds to conquer would be his parents.

John resolved to buy his house in a community of affluents. “You know buying a House can be classified as a fulfillment for the hardest worker, depending on the person’s hierarchy of needs. You play around the house from a custom size bedroom to a nicely furnished living room, to a nice well-tiled bathroom, and all add-ons that come with the house. You take it upon yourself to enjoy the interior design of the house. You are pleased with the serene environment, enjoying the exterior beautification surrounding the house.” These, cumulatively, were the first experience of John.

After relieving himself of the amusement of the house, what next?

It clicked that he needed to fill his round empty ball (stomach) with a good meal. Ideally, he would have ordered food, but he couldn’t because of his environment. He entered the kitchen to fulfill the promise to his stomach. Then the reality of his upbringing set in. He couldn’t understand what each item in the kitchen was meant for. His hunger had entered another phase. Who’ll come to John’s rescue? What’s the next action he has to take?

Follow me, as I fill you in on his benefactor (a problem solver).

John and DataTalks

Where there is no problem, there is no need for a solution (which is rare). In this case, there is a problem to be solved. John, on his part, had to surf the internet to find a solution to his predicament. Then he stumbled upon DataTalks.

DataTalks is a data community for Data Scientists and Machine Learning Experts. They are problem solvers. They see pregnant data, and birth solutions from the data foetus.

Prior to John’s situation, DataTalks had set up a competition among its experts to design a product whose mandate would be to identify kitchenware. The outcome of this competition would mean a solution to John’s predicament. All John need do is take a picture of any kitchenware and upload it; the algorithm designed by the experts then identifies the item and comes up with a name.

I happen to be one of the contributors to this project by DataTalks. The following captures the steps I took in building the algorithm that will serve as the backbone of the product intended by DataTalks.

Read more about the competition here….

Project Statement

This project seeks to build a model that will classify images of different kitchenware items into six classes:
`cups`, `glasses`, `plates`, `spoons`, `forks`, and `knives`

In this project, I intend to build an algorithm that identifies any of the above classes of kitchenware with high accuracy.

Project Data Source

The data (sample images) were sourced using Toloka. Read more about Toloka here. Project data can be accessed from here.

Project Dataset Overview

This dataset contains images of different kitchenware:

train.csv: the training set (image IDs and classes)
test.csv: the performance of the model will be evaluated on this test file. It contains only image IDs. The model predicts the classes of the images using the provided image IDs and image JPEGs. The resulting output will be written to a CSV submission file, which will be submitted for evaluation and ranked against the rest of the competitors.
images: the sample images, in JPEG format, that the model will be trained and tested on.
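To make the submission step concrete, here is a minimal sketch of assembling such a file. It assumes the submission uses the same `Id` and `label` column names that train.csv uses; the helper name `build_submission` and the sample IDs and predictions are hypothetical.

```python
import csv
import io


def build_submission(image_ids, predicted_labels):
    """Return CSV text with one (Id, label) row per test image."""
    if len(image_ids) != len(predicted_labels):
        raise ValueError("each image ID needs exactly one predicted label")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Id", "label"])  # assumed column names
    for image_id, label in zip(image_ids, predicted_labels):
        writer.writerow([image_id, label])
    return buffer.getvalue()


# Hypothetical IDs and predictions, for illustration only.
submission_text = build_submission(["0001", "0002"], ["cups", "forks"])
print(submission_text)
```

Writing to an in-memory buffer keeps the sketch free of file paths; in practice the same text would be saved to the submission CSV.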

Model Training materials

Since the project problem focuses on deep learning, I’ll be using the following libraries:

- TensorFlow
- Keras: built on top of TensorFlow, it helps us train and use neural networks. This is important for this project.

Project Workflow

Since this project's use case is image classification, and following recommendations from past projects, a Convolutional Neural Network will be trained. If, after investigating the performance of a pre-trained model on this use case, we find that its knowledge carries over well to the project, the focus will shift to Transfer Learning.

Brief Background on CNN

Deep Learning offers several families of models, each a go-to for a particular kind of use case, and CNNs are one of them. Convolutional Neural Networks are a type of neural network used mostly for image tasks such as classification. In the form used in this project, a CNN consists of two types of layers: Convolutional layers, which extract features from the image, and Dense layers, which turn those features into a prediction.

In this project, I’ll focus on the Dense layers, since I intend to build on the Convolutional layers of a pre-built application (model) in Keras. The output of those Convolutional layers will be transformed into a Vector Representation, which will serve as the input to the Dense layers I intend to train.
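As a rough illustration of what "transforming to a Vector Representation" means, the sketch below averages each channel of a tiny feature map over its spatial positions, which is the idea behind global average pooling. The function name and the toy numbers are made up for this example; Keras provides the real operation as a layer.

```python
def global_average_pooling(feature_map):
    """Average each channel over all spatial positions.

    feature_map is a nested list shaped (height, width, channels).
    """
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [total / (height * width) for total in totals]


# A toy 2x2 feature map with 3 channels (made-up numbers).
toy_map = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 4.0, 2.0], [7.0, 0.0, 2.0]],
]
vector = global_average_pooling(toy_map)
print(vector)  # [4.0, 1.0, 2.0]
```

However large the feature map is, the result has one number per channel, so the Dense layers always receive a fixed-length vector.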

The following summarizes the steps I intend to take:

Step 1 ==> Investigating a sample image on a pre-built model
Step 2 ==> Training and evaluating a base model
Step 3 ==> Hyperparameter tuning of the base model
Step 4 ==> Data Augmentation
Step 5 ==> Training a model on a larger image size
Step 6 ==> Using the best-performing model

Now that I’ve whetted your appetite with a broad overview of my action plan for this project, let me proceed to the project execution proper. Follow me!

Step 1: Investigating a sample image on pre-built models

The following processes will be followed:

Load a sample image for initial investigation.
Check if the pre-built application in Keras serves the purpose of this project, by doing the following:
> — Supply loaded sample image to the application
> — Make a prediction on the image with the application
> — Check the output of its prediction, most especially the class prediction
> — Conclude if it meets the project's peculiar case
> — And if it doesn’t, proceed to the next step

For the initial investigation and the entire lifecycle of the project, the Xception model will be adopted. Among the pre-built models in Keras, it offers a good balance: high accuracy, a relatively small size, and reasonable inference time on both CPU and GPU.

Note: Xception is a pre-built application in Keras. It was trained on the ImageNet dataset, which has 1000 image classes covering many kinds of objects, so it can be used for general-purpose image classification. This informed my decision to choose Xception for this project.

Step 1 Framework

#importing the libraries
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt

import tensorflow as tf 
from tensorflow import keras 

from tensorflow.keras.preprocessing.image import load_img 
from tensorflow.keras.applications.xception import Xception
from tensorflow.keras.applications.xception import preprocess_input
from tensorflow.keras.applications.xception import decode_predictions
from tensorflow.keras.preprocessing.image import ImageDataGenerator

Testing the below image on Xception application

# getting the file path 
path = 'data/images'
name = '0560.jpg'
fullname = f'{path}/{name}'

# load and resize
img = load_img(fullname, target_size=(299, 299))

# convert the image to a numpy array 
x = np.array(img)
# creating an array for the sample image 
X = np.array([x])
# preprocessing the input sample image
X = preprocess_input(X)

# initializing the Xception model
# (setting its weights to 'imagenet', since it was trained on the ImageNet dataset)
model = Xception(weights='imagenet', input_shape=(299, 299, 3))

# making prediction on the sample preprocessed image 
pred = model.predict(X)

# decoding the model prediction 
decode_predictions(pred)

Looking at the predictions, we can see that the model's top prediction is quite close to the sample image's class. However, the model was trained on ImageNet's 1000 classes, and its output labels do not match the 6 classes this project needs.

This means we can’t use the ImageNet model as-is for this project.
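The mismatch can be pictured with a small sketch. It ranks a handful of class scores the way a top-k decoder would, then checks whether any of the returned labels is one of the six project classes. The labels and scores here are hypothetical and are not the actual output of the model.

```python
def top_k(probabilities, class_names, k=3):
    """Return the k (class_name, probability) pairs with the highest probability."""
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


PROJECT_CLASSES = {"cups", "glasses", "plates", "spoons", "forks", "knives"}

# Hypothetical ImageNet-style labels and scores, for illustration only.
labels = ["ladle", "wooden_spoon", "spatula", "teapot"]
scores = [0.10, 0.70, 0.15, 0.05]

best = top_k(scores, labels, k=2)
print(best)

# None of the returned labels is one of the six project classes.
usable = [name for name, _ in best if name in PROJECT_CLASSES]
print(usable)
```

A label can be semantically close to a project class and still not be one of the six outputs the project requires, which is why a new output layer has to be trained.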

Going forward, I’ll train a different model on the classes we need for this project. To make this process seamless, transfer learning will be adopted.

Step 2: Training a base model

Here I train a different model (the base model) on the six classes the pre-built model was not trained on. The plan is as follows:

- Load the project datasets
- Split the dataset into training and validation sets
- Reuse the Xception model
- Extract the Convolutional layers from the Xception model
- Convert the output of the extracted Convolutional layers to a Vector Representation
- Supply the `Vector Representation` as input to the new Dense layers I intend to train
- Train the Dense layers using the Vector Representation
- Train the model using the accompanying training parameters.

# load the training csv
# convert id column datatype to string 
df_train_full = pd.read_csv('data/train.csv', dtype={'Id': str})
# making another column (named filename)
## this contains the images file path using id's in the train csv 
df_train_full['filename'] = 'data/images/' + df_train_full['Id'] + '.jpg'

# splitting the dataset (80% train and 20% validation)
val_cutoff = int(len(df_train_full) * 0.8)
df_train = df_train_full[:val_cutoff] 
df_val = df_train_full[val_cutoff:]

# read and generate the train and validation data
## Note: the preprocessing_function parameter is set to the preprocess_input function
## used for the Xception model
############## Training Data ####################
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_gen = train_datagen.flow_from_dataframe(
    df_train,
    x_col='filename',
    y_col='label',
    target_size=(150, 150),
    batch_size=32,
)
############ Validation Data ####################
val_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
val_gen = val_datagen.flow_from_dataframe(
    df_val,
    x_col='filename',
    y_col='label',
    target_size=(150, 150),
    batch_size=32,
)
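The split above takes the first 80% of the rows in file order. If the rows of train.csv happened to be ordered in some meaningful way (an assumption, not something the dataset description states), shuffling before cutting would be one way to keep the two sets comparable. A minimal standard-library sketch, with a hypothetical helper name and made-up IDs:

```python
import random


def shuffled_split(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of items reproducibly, then cut it into train and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cutoff = int(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


image_ids = [f"{i:04d}" for i in range(10)]  # hypothetical IDs
train_ids, val_ids = shuffled_split(image_ids)
print(len(train_ids), len(val_ids))  # 8 2
```

With pandas, a similar effect is available by calling `DataFrame.sample(frac=1, random_state=...)` before slicing.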

Next, I initialize and train the base model with its accompanying parameters.

Things to Note:
- Xception is the Keras application that we intend to adopt
- Since Xception was trained on the ImageNet dataset, the `weights` parameter will be set to `'imagenet'`
- The project focuses on transfer learning, so Xception's Dense layers will be removed by setting the `include_top` parameter to `False`
- The `input_shape` of the images will be set to `(150, 150, 3)`, which helps the model train faster
- So that the convolutional layers are not trained all over again, the base model's `trainable` attribute will be set to `False`
- The output of the convolutional layers of the base model will be transformed to a Vector Representation using pooling (`GlobalAveragePooling2D`)
- Then the output of the Dense layer (which we intend to train) will be set to `6` units (since the project has 6 classes), using the Vector Representation as input

# base model
base_model = Xception(
    weights='imagenet',
    include_top=False,
    input_shape=(150, 150, 3)
)
base_model.trainable = False
# input dependency
inputs = keras.Input(shape=(150, 150, 3))
base = base_model(inputs, training=False)
# converting the CL's to Vector representation
vectors = keras.layers.GlobalAveragePooling2D()(base) 
outputs = keras.layers.Dense(6)(vectors)
model = keras.Model(inputs, outputs)

######### Model Optimizer ############
# Using Adam optimizer with learning rate at 0.01 
optimizer = keras.optimizers.Adam(learning_rate=0.01)

######## Model Loss #########
# Using CategoricalCrossentropy
# Setting from_logits to True, since the output Dense layer has no softmax
# activation and therefore returns raw scores (logits), not probabilities
loss = keras.losses.CategoricalCrossentropy(from_logits=True)

######## Model Compiler #########
# Compiling the model with optimizer, loss, and metrics evaluation 
model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])

# fitting (training) the model for 10 epochs
history = model.fit(
    train_gen,
    epochs=10,
    validation_data=val_gen
)
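Because the output layer has no activation, the model returns six raw scores (logits) per image. The sketch below shows, in plain Python, how such logits relate to probabilities, to a predicted class, and to the cross-entropy loss. The logits and the class order are made up for illustration; the real class order comes from the data generator.

```python
import math


def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]  # shift for stability
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy_from_logits(logits, true_index):
    """Categorical cross-entropy for one example with a one-hot target."""
    return -math.log(softmax(logits)[true_index])


# Made-up logits for one image; the class order here is hypothetical.
class_names = ["cups", "forks", "glasses", "knives", "plates", "spoons"]
logits = [2.0, 0.5, 0.1, -1.0, 0.0, 0.3]

probabilities = softmax(logits)
predicted = class_names[probabilities.index(max(probabilities))]
print(predicted)
print(round(cross_entropy_from_logits(logits, true_index=0), 4))
```

The largest logit always yields the largest probability, so the predicted class can be read from either; the probabilities matter when a confidence value is needed.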

Result Visualization

# plotting the performance for each epoch 
plt.plot(history.history['accuracy'], label='train')
plt.plot(history.history['val_accuracy'], label='validation')
plt.legend()

From the above visuals, there’s a wide gap between the training and validation accuracy. You can see that the model performed best from epoch 6 to 7, though with a corresponding increase in val_loss. The model had its lowest at epoch 5. Model performance can still be improved to reach the project's accuracy milestone, keeping in mind the contribution of a lower val_loss. Going with this information gained from the base model's performance, we’ll tune some of the model's hyperparameters.
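Reading the best epoch and the train/validation gap off a plot can also be done programmatically. The sketch below assumes a dictionary shaped like `history.history`; the helper name is hypothetical and the numbers are made up, not the results of this project.

```python
def summarize_history(history):
    """Return the best validation epoch (1-based), its accuracy, and the train/validation gap."""
    val_acc = history["val_accuracy"]
    train_acc = history["accuracy"]
    best_index = max(range(len(val_acc)), key=lambda i: val_acc[i])
    gap = train_acc[best_index] - val_acc[best_index]
    return best_index + 1, val_acc[best_index], gap


# Made-up numbers shaped like `history.history`.
toy_history = {
    "accuracy": [0.80, 0.90, 0.95, 0.99],
    "val_accuracy": [0.78, 0.84, 0.83, 0.82],
}
best_epoch, best_val, gap = summarize_history(toy_history)
print(best_epoch, best_val, round(gap, 2))
```

A gap that keeps growing while validation accuracy stalls is the usual sign of overfitting that the later tuning steps try to reduce.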

Step 3: Hyperparameters Tuning of the base model

- Experiment with different learning rate values (the best learning rate will be chosen)
- Experiment by adding inner dense layers to the network
- Experiment with regularization: dropout randomly switches off (drops) part of the network during training to improve generalization. Different dropout values will be tried.
- Checkpoint the process to save the best model
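To show what dropout does during training, here is a minimal sketch of "inverted" dropout in plain Python: each value is zeroed with probability `rate`, and the surviving values are scaled up so the expected sum stays the same. The function name and the inputs are illustrative only; Keras implements this as a layer.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [value / keep if rng.random() >= rate else 0.0 for value in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10
dropped = dropout(activations, rate=0.2, rng=rng)
print(dropped)
```

At prediction time dropout is switched off, so every unit contributes; a rate of 0.0 behaves the same way during training.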

We should know that hyperparameter tuning isn’t a gateway for a bad model to become good. What it does is improve the model's performance compared to when no tuning was done.

Experimenting with different learning rate values and comparing the model performance for each (the best learning rate will be chosen)

# defining a function 

def make_model(learning_rate=0.01):
  base_model = Xception(
      weights='imagenet', 
      include_top=False, 
      input_shape= (150, 150, 3)
  )

  base_model.trainable = False

  ################################################

  inputs = keras.Input(shape=(150, 150, 3))
  base = base_model(inputs, training=False)
  vectors = keras.layers.GlobalAveragePooling2D()(base)
  outputs = keras.layers.Dense(6)(vectors)
  model = keras.Model(inputs, outputs)

  ################################################

  optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
  loss = keras.losses.CategoricalCrossentropy(from_logits=True)

  model.compile(optimizer=optimizer, 
                loss=loss, 
                metrics=['accuracy'])

  
  return model

# iterating over different values of learning rate 

scores = {}

for lr in [0.0001, 0.001, 0.01, 0.1]:
  print(lr)

  model = make_model(learning_rate=lr)
  history = model.fit(train_gen, epochs=10, validation_data=val_gen)
  scores[lr] = history.history

  print()
  print()

Performance visual

From the visual, we can see that the best-performing learning rate of all those tried was `lr ==> 0.001`. However, `lr ==> 0.0001` is worth noting for training more complex models. The validation accuracy plot shows that lr 0.001 reached its best performance at `epoch 4`. It also performed better judging from its `Validation Loss scores`. The model made a leap from the performance of the earlier trained base model.

With this empirical evidence, going forward, the learning rate will be set at `0.001`. Is it safe to end our experimentation here? No, the model performance can still be improved upon. Let’s forge ahead!
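The comparison across learning rates can also be summarized in code rather than read off a plot. The sketch below assumes a dictionary shaped like the `scores` dict built in the loop, keyed by learning rate; the helper name is hypothetical and the accuracy values are made up.

```python
def best_learning_rate(scores):
    """Pick the learning rate whose run reached the highest validation accuracy."""
    peaks = {lr: max(history["val_accuracy"]) for lr, history in scores.items()}
    best_lr = max(peaks, key=peaks.get)
    return best_lr, peaks


# Made-up histories keyed by learning rate, shaped like the `scores` dict.
toy_scores = {
    0.0001: {"val_accuracy": [0.70, 0.78, 0.82]},
    0.001: {"val_accuracy": [0.85, 0.88, 0.87]},
    0.01: {"val_accuracy": [0.83, 0.84, 0.80]},
    0.1: {"val_accuracy": [0.75, 0.72, 0.74]},
}
best_lr, peaks = best_learning_rate(toy_scores)
print(best_lr, peaks[best_lr])
```

Peak validation accuracy is only one possible criterion; validation loss or stability across epochs could be compared the same way.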

Experiment by adding inner dense layers to the networks

The idea behind this is to add another layer (an inner dense layer) between the input vector representation and the output dense layer. This layer does some intermediate processing of the input vector representation, and its output is passed on to the output dense layer.

We want the model to be more powerful and learn some internal representation from its supplied input (VR).

To-do:
- Using different sizes of inner dense layers (best combination will be chosen)
- Setting the learning rate to `0.001`
- Add an activation function (in this case `ReLU`, the Rectified Linear Unit) to the inner layer. This will process the output of the inner dense layer
- Experiment by regularization and dropout
- Visualize the performance
- Checkpoint the best combination
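As an illustration of what an inner dense layer with ReLU computes, the sketch below runs a tiny forward pass in plain Python: a vector goes through one inner layer, ReLU clips the negative results to zero, and the output layer combines what is left. All weights and sizes are made up and are far smaller than the layers used in this project.

```python
def relu(values):
    """Rectified Linear Unit: negative values become 0, positive values pass through."""
    return [max(0.0, value) for value in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum of the inputs per output unit."""
    return [
        sum(x * w for x, w in zip(inputs, unit_weights)) + bias
        for unit_weights, bias in zip(weights, biases)
    ]


# Made-up vector and weights: 3 inputs -> 2 inner units -> 1 output.
vector = [1.0, -2.0, 0.5]
inner = relu(dense(vector, weights=[[1.0, 1.0, 0.0], [0.5, 0.0, 2.0]], biases=[0.0, 0.5]))
output = dense(inner, weights=[[2.0, 1.0]], biases=[0.0])
print(inner, output)  # [0.0, 2.0] [2.0]
```

Without the activation, two stacked dense layers would collapse into a single linear transformation, so the inner layer would add no expressive power.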

During the investigation, different techniques were used to determine the best approach to improving the model performance. To keep this article from becoming long and time-consuming, those processes will not be detailed here. For a comprehensive view of the project's hyperparameter tuning, click here to go to the project notebook.

The code below summarizes the result of experimenting with inner dense layers, together with their accompanying parameters, after much investigation.

# defining a function 

def make_model(learning_rate=0.0001, droprate=0.2):
  base_model = Xception(weights='imagenet', 
                        include_top=False, 
                        input_shape= (150, 150, 3))

  base_model.trainable = False

  ############## Vector Representation ###########################

  inputs = keras.Input(shape=(150, 150, 3))
  base = base_model(inputs, training=False)
  vectors = keras.layers.GlobalAveragePooling2D()(base)

  ############## Inner Dense Layer #################

  first_inner = keras.layers.Dense(units=256, activation='relu')(vectors)
  first_drop = keras.layers.Dropout(droprate)(first_inner)
  second_inner = keras.layers.Dense(units=128, activation='relu')(first_drop)
  second_drop = keras.layers.Dropout(droprate)(second_inner)

  ############## Output ##################

  outputs = keras.layers.Dense(6)(second_drop)
  model = keras.Model(inputs, outputs)

  ################################################

  optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
  loss = keras.losses.CategoricalCrossentropy(from_logits=True)

  model.compile(optimizer=optimizer, 
                loss=loss, 
                metrics=['accuracy'])

  
  return model
# training the model
model = make_model()
history = model.fit(
    train_gen, 
    epochs=20, 
    validation_data=val_gen)

Performance visual

# plotting the performance for each epoch 
plt.plot(history.history['accuracy'], label='train')
plt.plot(history.history['val_accuracy'], label='validation')
plt.legend()

Taken together, the repeated experiments with inner dense layers show a slight improvement in the model performance. However, there’s still a wide gap between the training and validation accuracy. The question now is: can this performance still be improved, with training and validation accuracy brought closer together? This question begs for an answer. One technique we can deploy is Data Augmentation. Can more data improve the performance of the model? Let’s investigate this!

Step 4: Data Augmentation

The idea behind Data Augmentation is to create more data variation from the existing dataset. This experiment will determine whether creating more data from the existing dataset can help improve the performance of the model.

When you check the project notebook, you will see that we used dropout to regularise the model and avoid overfitting. Another technique that can be used in place of dropout is Data Augmentation. It prevents the model from seeing exactly the same images repeatedly while training, which makes it harder for the model to memorize them.

To do this, I’ll follow the image transformation guidelines below:
- Flip ==> horizontal, vertical flipping, or both.
- Rotation
- Shifting ==> up, down, right, and left
- Shear
- Zoom (in/out)

The augmentation techniques used will depend on the image use case for this project.
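To give a feel for what these transformations do to pixel data, here is a minimal sketch of flips and a shift on a tiny image stored as nested lists. The function names are made up for this example, and the shift pads with a constant value for simplicity, whereas Keras's `ImageDataGenerator` fills with the nearest pixel by default.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Reverse the order of the rows (top becomes bottom)."""
    return [list(row) for row in reversed(image)]


def shift_right(image, pixels, fill=0):
    """Shift every row right by a non-negative number of pixels, padding the left edge."""
    width = len(image[0])
    return [([fill] * pixels + list(row))[:width] for row in image]


toy_image = [
    [1, 2, 3],
    [4, 5, 6],
]
print(flip_horizontal(toy_image))  # [[3, 2, 1], [6, 5, 4]]
print(flip_vertical(toy_image))    # [[4, 5, 6], [1, 2, 3]]
print(shift_right(toy_image, 1))   # [[0, 1, 2], [0, 4, 5]]
```

Each transformed copy shows the same object with the same label, which is why augmentation can add variation without collecting new images.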

After much investigation of each augmentation technique, I found that the parameters below, with their accompanying values, improved the model performance:

rotation_range = 30
shear_range = 20
width_shift_range = 10
height_shift_range = 10
zoom_range = 0.1

Check the notebook for a comprehensive understanding of the process.

# defining a function 

def make_model(learning_rate=0.0001, droprate=0.2):
  base_model = Xception(weights='imagenet', 
                        include_top=False, 
                        input_shape= (150, 150, 3))

  base_model.trainable = False

  ############## Vector Representation ###########################

  inputs = keras.Input(shape=(150, 150, 3))
  base = base_model(inputs, training=False)
  vectors = keras.layers.GlobalAveragePooling2D()(base)

  ############## Inner/Hidden Dense Layer #################

  first_inner = keras.layers.Dense(units=256, activation='relu')(vectors)
  first_drop = keras.layers.Dropout(droprate)(first_inner)
  second_inner = keras.layers.Dense(units=128, activation='relu')(first_drop)
  second_drop = keras.layers.Dropout(droprate)(second_inner)

  ############## Output ##################

  outputs = keras.layers.Dense(6)(second_drop)
  model = keras.Model(inputs, outputs)

  ################################################

  optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
  loss = keras.losses.CategoricalCrossentropy(from_logits=True)

  model.compile(optimizer=optimizer, 
                loss=loss, 
                metrics=['accuracy'])

  
  return model

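# investigating with different droprates
# assumption: learning_rate matches the make_model default (the notebook sets the actual value)
learning_rate = 0.0001
# assumption: all_train_gen is the 150x150 training generator with the augmentations listed above
all_train_gen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=30, shear_range=20,
    width_shift_range=10, height_shift_range=10, zoom_range=0.1
).flow_from_dataframe(
    df_train, x_col='filename', y_col='label',
    target_size=(150, 150), batch_size=32,
)

scores = {}
for droprate in [0.0, 0.2]:
  print(droprate)
  model = make_model(
      learning_rate=learning_rate, 
      droprate=droprate)
  history = model.fit(
      all_train_gen, 
      epochs=20, 
      validation_data=val_gen)
  scores[droprate] = history.history
  print()
  print()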

Performance visual

plt.figure(figsize = [9, 7])

# droprate 0.0
plt.subplot(2, 1, 1)
hist = scores[0.00]
droprate = 0.0
plt.plot(hist['val_accuracy'], label=('val=%s' % droprate))
plt.plot(hist['accuracy'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Accuracy for Droprate 0.0')
plt.legend();


# droprate 0.2
plt.subplot(2, 1, 2)
hist = scores[0.20]
droprate = 0.2
plt.plot(hist['val_accuracy'], label=('val=%s' % droprate))
plt.plot(hist['accuracy'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Accuracy for Droprate 0.2')
plt.legend();

The above case shows that the model isn’t stable, judging from its performance at droprates of both 0.0 and 0.2. Can we still make the performance better? Let’s train a larger model with a larger image size.

Step 5: Training a model on large image sizes

When we started training the model, we trained it on an input image size of 150x150. Our main aim in choosing this image size was to train the model faster. After training and hyperparameter tuning, we still haven’t achieved our pre-set milestone. What can we do to achieve this? I decided to train a much slower model, with almost double the training time of the earlier model.

This model will be trained on image size 299x299.
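A quick piece of arithmetic shows why the larger size is more expensive: the number of pixels per image grows with the square of the side length. The sketch below only compares pixel counts; actual training time also depends on hardware and data loading, so the observed slowdown can differ from this ratio.

```python
def pixel_ratio(small_side, large_side):
    """How many times more pixels a square image has when its side grows."""
    return (large_side * large_side) / (small_side * small_side)


ratio = pixel_ratio(150, 299)
print(round(ratio, 2))  # 3.97
```

So each 299x299 image carries roughly four times as many pixels through the convolutional layers as a 150x150 one.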

TO-DO:
- Training without data augmentation
- Training with data augmentation
- Checkpoint the best performed model

Training without data augmentation

input_size = 299
############## Training Data (no augmentation) ####################
large_train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input
    )
larger_train_gen = large_train_datagen.flow_from_dataframe(
    df_train,
    x_col='filename',
    y_col='label',
    target_size=(input_size, input_size),
    batch_size=32,
)
############ Validation Data ####################
val_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
val_gen = val_datagen.flow_from_dataframe(
    df_val,
    x_col='filename',
    y_col='label',
    target_size=(input_size, input_size),
    batch_size=32,
)
# defining a function 
def make_model(input_size = 299, learning_rate=0.0001, droprate=0.2):
  base_model = Xception(weights='imagenet', 
                        include_top=False, 
                        input_shape= (input_size, input_size, 3))

  base_model.trainable = False
  ############## Vector Representation ###########################
  inputs = keras.Input(shape=(input_size, input_size, 3))
  base = base_model(inputs, training=False)
  vectors = keras.layers.GlobalAveragePooling2D()(base)
  ############## Inner/Hidden Dense Layer #################
  first_inner = keras.layers.Dense(units=256, activation='relu')(vectors)
  first_drop = keras.layers.Dropout(droprate)(first_inner)
  second_inner = keras.layers.Dense(units=128, activation='relu')(first_drop)
  second_drop = keras.layers.Dropout(droprate)(second_inner)
  ############## Output ##################
  outputs = keras.layers.Dense(6)(second_drop)
  model = keras.Model(inputs, outputs)
  ################################################
  optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
  loss = keras.losses.CategoricalCrossentropy(from_logits=True)

  model.compile(optimizer=optimizer, 
                loss=loss, 
                metrics=['accuracy']) 
  return model

# investigating with different droprates 
scores = {}

for droprate in [0.0, 0.2]:
  print(droprate)

  model = make_model(
      input_size = input_size, 
      droprate=droprate)
  history = model.fit(
      larger_train_gen, 
      epochs=20, 
      validation_data=val_gen)
  scores[droprate] = history.history
  print()
  print()

Performance visuals (Accuracy and Loss)

plt.figure(figsize = [9, 7])
# droprate 0.0
plt.subplot(2, 1, 1)
hist = scores[0.00]
droprate = 0.0
plt.plot(hist['val_accuracy'], label=('val=%s' % droprate))
plt.plot(hist['accuracy'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Accuracy for Droprate 0.0')
plt.legend();
# droprate 0.2
plt.subplot(2, 1, 2)
hist = scores[0.20]
droprate = 0.2
plt.plot(hist['val_accuracy'], label=('val=%s' % droprate))
plt.plot(hist['accuracy'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Accuracy for Droprate 0.2')
plt.legend();

# Loss Score 
plt.figure(figsize = [12, 10])
# droprate 0.0
plt.subplot(2, 1, 1)
hist = scores[0.00]
droprate = 0.0
plt.plot(hist['val_loss'], label=('val=%s' % droprate))
plt.plot(hist['loss'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Loss for Droprate 0.0')
plt.legend();
# droprate 0.2
plt.subplot(2, 1, 2)
hist = scores[0.20]
droprate = 0.2
plt.plot(hist['val_loss'], label=('val=%s' % droprate))
plt.plot(hist['loss'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Loss for Droprate 0.2')
plt.legend();

Unlike when the model was trained on the 150x150 input size, the model learns and performs at its best when trained on the larger image size. Comparing its performance on training and validation, we can see a smaller gap between training and validation accuracy at droprate 0.2. This also gives us a stable, low validation loss. With these results, the milestone for this project has been achieved.

Let’s see if we can see an improvement in the model performance by adding data augmentation to its training.

Training with data augmentation

input_size = 299
############## Data Augmentation Training Data ####################
large_train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input, 
    rotation_range = 30,
    shear_range = 20,
    width_shift_range = 10,
    height_shift_range = 10,
    zoom_range = 0.1
    )
larger_train_gen = large_train_datagen.flow_from_dataframe(
    df_train,
    x_col='filename',
    y_col='label',
    target_size=(input_size, input_size),
    batch_size=32,
)
############ Validation Data ####################
val_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
val_gen = val_datagen.flow_from_dataframe(
    df_val,
    x_col='filename',
    y_col='label',
    target_size=(input_size, input_size),
    batch_size=32,
)
# defining a function 
def make_model(input_size = 299, learning_rate=0.0001, droprate=0.2):
  base_model = Xception(weights='imagenet', 
                        include_top=False, 
                        input_shape= (input_size, input_size, 3))

  base_model.trainable = False
  ############## Vector Representation ###########################
  inputs = keras.Input(shape=(input_size, input_size, 3))
  base = base_model(inputs, training=False)
  vectors = keras.layers.GlobalAveragePooling2D()(base)
  ############## Inner/Hidden Dense Layer #################
  first_inner = keras.layers.Dense(units=256, activation='relu')(vectors)
  first_drop = keras.layers.Dropout(droprate)(first_inner)
  second_inner = keras.layers.Dense(units=128, activation='relu')(first_drop)
  second_drop = keras.layers.Dropout(droprate)(second_inner)
  ############## Output ##################
  outputs = keras.layers.Dense(6)(second_drop)
  model = keras.Model(inputs, outputs)
  ################################################
  optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
  loss = keras.losses.CategoricalCrossentropy(from_logits=True)

  model.compile(optimizer=optimizer, 
                loss=loss, 
                metrics=['accuracy']) 
  return model

# investigating with different droprates 
scores = {}
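# checkpoint callback for version 2 (299x299 input); it is created once and shared by
# both runs below, so a file is written only when val_accuracy beats the best seen so far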
checkpoint = keras.callbacks.ModelCheckpoint(
    'xception_v2_299_{epoch:02d}_{val_accuracy:.3f}.h5',
    save_best_only=True,
    monitor = 'val_accuracy',
    mode = 'max'
    )
for droprate in [0.0, 0.2]:
  print(droprate)

  model = make_model(
      input_size = input_size, 
      droprate=droprate)
  history = model.fit(
      larger_train_gen, 
      epochs=20, 
      validation_data=val_gen,
      callbacks = [checkpoint])
  scores[droprate] = history.history

  print()
  print()

plt.figure(figsize = [9, 7])
# droprate 0.0
plt.subplot(2, 1, 1)
hist = scores[0.00]
droprate = 0.0
plt.plot(hist['val_accuracy'], label=('val=%s' % droprate))
plt.plot(hist['accuracy'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Accuracy for Droprate 0.0')
plt.legend();
# droprate 0.2
plt.subplot(2, 1, 2)
hist = scores[0.20]
droprate = 0.2
plt.plot(hist['val_accuracy'], label=('val=%s' % droprate))
plt.plot(hist['accuracy'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Accuracy for Droprate 0.2')
plt.legend();

# Loss Score 
plt.figure(figsize = [12, 10])
# droprate 0.0
plt.subplot(2, 1, 1)
hist = scores[0.00]
droprate = 0.0
plt.plot(hist['val_loss'], label=('val=%s' % droprate))
plt.plot(hist['loss'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Loss for Droprate 0.0')
plt.legend();
# droprate 0.2
plt.subplot(2, 1, 2)
hist = scores[0.20]
droprate = 0.2
plt.plot(hist['val_loss'], label=('val=%s' % droprate))
plt.plot(hist['loss'], label=('train=%s' % droprate))
plt.title('Training Vs Validation Loss for Droprate 0.2')
plt.legend();

Performance Visuals (Accuracy and Loss)

This confirms that using a 299x299 image size together with some data augmentation techniques makes the model perform better. The model that I’ll be deploying is the checkpoint from epoch 15 with a drop rate of 0.0, which reached a validation accuracy of 0.967 and a validation loss of 0.1101. When we compare the training and validation accuracy for this model, we can see that it is not learning much faster on the training data than on the validation data, so the two scores stay close to each other.
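To make that selection step concrete, here is a minimal sketch of how the best epoch could be read off a history dictionary like the ones stored in `scores`. The helper `best_epoch` and the `toy_scores` values are my own illustration and use made-up numbers, not the results of this project.

```python
def best_epoch(history, metric="val_accuracy"):
    """Return (1-based epoch, metric value, train/val accuracy gap) at the best epoch."""
    values = history[metric]
    # index of the highest value; ties resolve to the earliest epoch
    idx = max(range(len(values)), key=lambda i: values[i])
    gap = history["accuracy"][idx] - history["val_accuracy"][idx]
    return idx + 1, values[idx], gap


# Made-up numbers for illustration only; they are NOT the results of this project.
toy_scores = {
    0.0: {"accuracy": [0.60, 0.75, 0.80], "val_accuracy": [0.55, 0.78, 0.74]},
    0.2: {"accuracy": [0.58, 0.70, 0.77], "val_accuracy": [0.56, 0.69, 0.72]},
}

for droprate, hist in toy_scores.items():
    epoch, val_acc, gap = best_epoch(hist)
    print(f"droprate={droprate}: best epoch={epoch}, "
          f"val_accuracy={val_acc:.3f}, train-val gap={gap:+.3f}")
```

A small gap between training and validation accuracy at the chosen epoch is the same signal the plots give visually: the model is not running far ahead on the training data.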

Now that we’ve built a model that has high accuracy, what next? Let’s do some testing on the model!

Step 6: Using/Testing the best-performed model

TO-DO:

Loading the model
Evaluating the model on some images
Getting predictions

# Loading the model 
model = keras.models.load_model('xception_v2_299_15_0.967.h5')

# getting the file path of the sample images for testing
path = 'data/images'
glass = '0560.jpg'
plate = '0014.jpg'
knife = '0013.jpg'
cup = '0009.jpg'
spoon = '0005.jpg'
fork = '0036.jpg'

glasspath = f'{path}/{glass}'
platepath = f'{path}/{plate}'
knifepath = f'{path}/{knife}'
cuppath = f'{path}/{cup}'
spoonpath = f'{path}/{spoon}'
forkpath = f'{path}/{fork}'

# loading and resizing the images
glass_img = load_img(glasspath, target_size=(299, 299))
plate_img = load_img(platepath, target_size=(299, 299))
knife_img = load_img(knifepath, target_size=(299, 299))
cup_img = load_img(cuppath, target_size=(299, 299))
spoon_img = load_img(spoonpath, target_size=(299, 299))
fork_img = load_img(forkpath, target_size=(299, 299))

# image preprocessing

# glass image
x1 = np.array(glass_img)
X1 = np.array([x1])
X1 = preprocess_input(X1)
# plate image
x2 = np.array(plate_img)
X2 = np.array([x2])
X2 = preprocess_input(X2)
# knife image
x3 = np.array(knife_img)
X3 = np.array([x3])
X3 = preprocess_input(X3)
# cup image
x4 = np.array(cup_img)
X4 = np.array([x4])
X4 = preprocess_input(X4)
# spoon image
x5 = np.array(spoon_img)
X5 = np.array([x5])
X5 = preprocess_input(X5)
# fork image
x6 = np.array(fork_img)
X6 = np.array([x6])
X6 = preprocess_input(X6)
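
The six blocks above all do the same two things: wrap a single image in a batch of one, then scale its pixels. Here is a NumPy-only sketch of that idea. The helper `to_batch` is hypothetical, and the scaling to the range [-1, 1] is my assumption about what Xception's `preprocess_input` does; in the real pipeline you would keep calling `preprocess_input` itself.

```python
import numpy as np


def to_batch(img_array):
    """Turn one HxWx3 image into a 1xHxWx3 float batch scaled to [-1, 1]."""
    x = np.asarray(img_array, dtype="float32")
    X = np.expand_dims(x, axis=0)   # same effect as np.array([x])
    # assumption: stand-in for Xception-style scaling of 0..255 pixels
    return X / 127.5 - 1.0


# Random stand-in for a loaded 299x299 RGB image (not a real kitchenware photo)
fake_img = np.random.randint(0, 256, size=(299, 299, 3), dtype=np.uint8)
X = to_batch(fake_img)
print(X.shape, X.dtype)
```

The leading dimension of 1 is there because the model expects a batch of images, even when we only pass a single one.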

Model Predictions

# getting the image classes 
classes = ['cup', 'forks', 'glass', 'knife', 'plate', 'spoon']

The model predicted the right class for all six sample images. With that, we can be reasonably confident that the model is ready to be deployed.
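
Because the output layer is a plain `Dense(6)` trained with `from_logits=True`, the model returns raw scores (logits) rather than probabilities. Below is a rough sketch of how one row of such scores could be turned into a class name. The helper `decode_prediction` and the `example_logits` values are hypothetical; in practice the row would come from something like `model.predict(X1)[0]`.

```python
import numpy as np

classes = ['cup', 'forks', 'glass', 'knife', 'plate', 'spoon']


def decode_prediction(logits, class_names):
    """Apply softmax to one row of logits and return (label, probability)."""
    logits = np.asarray(logits, dtype="float64")
    shifted = logits - logits.max()          # subtract the max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    best = int(np.argmax(probs))
    return class_names[best], float(probs[best])


# Made-up logits for a single image, for illustration only
example_logits = [-1.2, 0.3, 4.1, -0.5, 0.0, -2.0]
label, confidence = decode_prediction(example_logits, classes)
print(label, round(confidence, 3))
```

The order of `classes` has to match the order the training generator assigned to the labels, otherwise the index from `argmax` points at the wrong name.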

Now that we’ve built a working model whose validation accuracy stands at 96.7%, what next?

All we’ve done so far is finish the first phase of the project workflow. A model that is built but never deployed to production is as good as useless. At this point, if the model is not deployed, John should just accept his fate, because no solution is on the way. It wouldn’t be that bad if John fasted for a day; the next day, he could find an alternative for his hunger.

However, for the benefit of drawing water from a dry well, this project workflow will proceed to the next phase. Creating a solution around this model will benefit both the solution provider and the solution receiver, depending on the business and financial acumen of the solution provider.

To learn more about the model deployment processes, check here.

Written by Mustapha Adedayo Alude