Learning Image Classification with CNN using TensorFlow and Keras

Vijayendra Dwari · Published in CodeX · Feb 12, 2023

In this article we will work with an image dataset to train an image classifier using a custom CNN built with TensorFlow and Keras.

PS: If you don't already know what deep learning or a CNN is, this article may be difficult to follow, and unfortunately there is no easy way around that; it is not meant to be a tutorial on computer vision or deep learning. If you are familiar with these concepts, please read on.

Understanding the data and problem at hand:

We will work with the dataset provided here. It is a nicely curated, cleaned and organized collection of roasted coffee bean images arranged into train and test folders. There are 1,600 images of 224 x 224 pixels, and the CSV file provided with the dataset contains important information about the images.

Dataset source: Ontoum, S., Khemanantakul, T., Sroison, P., Triyason, T., & Watanapa, B. (2022). Coffee Roast Intelligence. arXiv preprint arXiv:2206.01841. https://arxiv.org/abs/2206.01841

Understanding digital imaging:

In digital imaging, a pixel is the smallest addressable unit; an image is essentially a collection of pixels, each typically described by 3 or 4 components with variable intensities. Each pixel is represented by a combination of RGB (Red, Green and Blue) or CMYK (Cyan, Magenta, Yellow and Key/Black).

The simplest way to represent an image is as a matrix whose size depends on the resolution of the image itself. For example, the images in this dataset are 224 x 224 px, so each one can be represented by a 224 x 224 matrix whose elements are intensities stored as bytes in the range 0 to 255. In practice we normalize each element by dividing it by 255 so that the values lie in the range 0 to 1 (for example, an intensity of 128 becomes 128/255 ≈ 0.5).

3 matrices of the image size represent the whole color image, 1 for each of the R, G and B channels

We will have 3 matrices for color images (one for each of the channels: Red, Green and Blue). Grayscale images have only one channel and hence just one matrix of the corresponding image size.
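As a quick, self-contained illustration (the file name below is hypothetical), the snippet loads one image with OpenCV to show that it really is just a stack of three 224 x 224 intensity matrices, and then normalizes it:

import cv2

# Hypothetical path to one training image; OpenCV returns a height x width x channels array (in BGR order)
img = cv2.imread('../input/coffee-bean-dataset-resized-224-x-224/train/Dark/sample.png')
print(img.shape)    # e.g. (224, 224, 3): one 224 x 224 matrix per color channel
print(img.dtype)    # uint8: intensities from 0 to 255

# Normalizing the intensities into the 0 to 1 range
img_normalized = img / 255.0
print(img_normalized.min(), img_normalized.max())

# A grayscale read gives a single channel, i.e. just one matrix
gray = cv2.imread('../input/coffee-bean-dataset-resized-224-x-224/train/Dark/sample.png', cv2.IMREAD_GRAYSCALE)
print(gray.shape)   # e.g. (224, 224)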

Now let’s start by setting up our coding environment and loading our dataset. We are using the super-awesome TensorFlow and Keras libraries along with cv2, imghdr, pandas and matplotlib to work with the images.

# Start by setting up the coding environment
import warnings, os, cv2, imghdr, matplotlib
import numpy as np
import pandas as pd
import tensorflow as tf
from matplotlib import pyplot as plt
from IPython.display import clear_output

warnings.filterwarnings('ignore')
tf.get_logger().setLevel('INFO')

# Load the CSV file that describes the images
df = pd.read_csv('../input/coffee-bean-dataset-resized-224-x-224/Coffee Bean.csv')
df
Our CSV file contains information about the images such as the class index, file path and label name.

Let’s see what images are provided in the training set.

# Setting up the training directory
data_dir = '../input/coffee-bean-dataset-resized-224-x-224/train'
image_exts = ['.png']

# Create an image dataset from the given images;
# TensorFlow comes loaded with some great functions for image manipulation and model building
data = tf.keras.utils.image_dataset_from_directory(data_dir)

# Iterator for stepping through batches of images
data_iterator = data.as_numpy_iterator()
batch = data_iterator.next()

# Clear the noisy loading output in the notebook
clear_output()

# Visualise a random batch
fig, ax = plt.subplots(ncols=5, figsize=(20, 20))
for idx, img in enumerate(batch[0][:5]):
    ax[idx].imshow(img.astype(int))
    ax[idx].title.set_text(batch[1][idx])
Visualizing a random batch of images provided in the dataset

Data Preparation:

We will resize the images from 224 x 224 to 50 x 50, since the custom CNN model we build later expects that input size. We will also define the class labels as a list. In the chunk of code below we resize the provided images and create the matrices that represent each resized image.

import random

IMG_SIZE = 50

DATADIR = '../input/coffee-bean-dataset-resized-224-x-224/train'

CATEGORIES = ['Dark', 'Green', 'Light', 'Medium']

# Quick check that the images in each category folder can be read
for category in CATEGORIES:
    path = os.path.join(DATADIR, category)
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_UNCHANGED)

training_data = []

def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)  # integer label: 0 = Dark, 1 = Green, 2 = Light, 3 = Medium
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_UNCHANGED)
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))  # resize to 50 x 50
                training_data.append([new_array, class_num])
            except Exception as e:
                pass  # skip unreadable files

create_training_data()

random.shuffle(training_data)

X = []  # features
y = []  # labels

for features, label in training_data:
    X.append(features)
    y.append(label)

X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 3)
y = np.asarray(y)
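At this point X is an array of shape (number of training images, 50, 50, 3) and y is a 1-D array of integer class indices from 0 to 3.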

Now we will create files to store our processed features and labels so we can reuse them without repeating the preprocessing.

import pickle
# Serializing the processed features and labels for reuse
pickle_out = open("X.pickle", "wb")
pickle.dump(X, pickle_out)
pickle_out.close()

pickle_out = open("y.pickle", "wb")
pickle.dump(y, pickle_out)
pickle_out.close()

pickle_in = open("X.pickle", "rb")
X = pickle.load(pickle_in)

Model Building and Training:

The model architecture we are using has 3 convolutional layers, 2 hidden dense layers and 1 output layer (a quick walk-through of the resulting feature-map sizes follows the list).

- The first Conv2D layer has 32 filters of size (3, 3) with relu activation, and its output is reduced in size by a (2, 2) MaxPooling2D layer. The second Conv2D layer produces 64 feature maps, which are again reduced by MaxPooling2D.
- The third Conv2D layer also uses a (3, 3) kernel; it takes the 64 feature maps as input and produces 64 outputs, which are reduced in size by another MaxPooling2D layer and finally regularized with dropout.

- The next 2 layers are hidden dense layers with 128 neurons each, with relu as the activation function (the feature maps are flattened first).
- The output layer is a 4-node dense layer with softmax activation; each node represents one class of coffee bean.
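For a 50 x 50 x 3 input, the (3, 3) convolutions (default 'valid' padding) and (2, 2) poolings shrink the feature maps as follows: 50 → 48 → 24 after the first block, 24 → 22 → 11 after the second, and 11 → 9 → 4 after the third, so the Flatten layer hands a vector of 4 x 4 x 64 = 1,024 values to the dense layers. (These numbers assume the default strides and padding used in the code below.)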

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D

from keras.models import model_from_json
from keras.models import load_model
import matplotlib.pyplot as plt

# Opening the files containing the prepared data
X = pickle.load(open("X.pickle", "rb"))
y = pickle.load(open("y.pickle", "rb"))

# normalizing data (a pixel goes from 0 to 255)
X = X/255.0

# Building the model
model = Sequential()
# 3 convolutional layers
model.add(Conv2D(32, (3, 3), input_shape = X.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))

# 2 hidden layers
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))

model.add(Dense(128))
model.add(Activation("relu"))

# The output layer with 4 neurons for 4 classes
model.add(Dense(4))
model.add(Activation("softmax"))

# Compiling the model using some basic parameters
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])

# Training the model for 40 epochs
# validation_split is the fraction of the images held out for validation
history = model.fit(X, y, batch_size=32, epochs=40, validation_split=0.1)

# Saving the model
model_json = model.to_json()
with open("model.json", "w") as json_file :
json_file.write(model_json)

model.save_weights("model.h5")
print("Saved model to disk")

model.save('CNN.model')

# Printing a graph showing the accuracy changes during the training phase
print(history.history.keys())
plt.figure(1)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')

We train our model for 40 epochs and visualize the training and validation accuracies.

First 5 epochs of model training
Nearly 99% model accuracy was achieved

Prediction using the trained model:

The model prediction is correct for a randomly picked image
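Below is a minimal sketch of how such a prediction can be made with the trained model. The test image path and the prepare helper are illustrative assumptions; the preprocessing simply mirrors the training pipeline (read a 3-channel image, resize to 50 x 50, add a batch dimension and scale to the 0 to 1 range):

import cv2
import numpy as np
from tensorflow.keras.models import load_model

CATEGORIES = ['Dark', 'Green', 'Light', 'Medium']

def prepare(filepath, img_size=50):
    # Load and preprocess a single image the same way as the training data
    img_array = cv2.imread(filepath, cv2.IMREAD_COLOR)            # force a 3-channel image
    new_array = cv2.resize(img_array, (img_size, img_size))       # match the 50 x 50 training size
    return new_array.reshape(-1, img_size, img_size, 3) / 255.0   # add batch dimension and normalize

model = load_model('CNN.model')  # the model saved earlier
# Hypothetical test image path
prediction = model.predict(prepare('../input/coffee-bean-dataset-resized-224-x-224/test/Dark/sample.png'))
print(CATEGORIES[np.argmax(prediction)])  # class with the highest softmax score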

Conclusion:

The objective of this article was to practice what I learnt in the classroom and to try working with new datasets to gain more practical insight. In the process of writing this notebook and article I was able to reinforce and solidify many of the concepts learnt in class; the notebook can be downloaded from the link below.

Coffee bean classification[99%] -CNN | Kaggle
