An Image Classifier With Keras

Enric Trillo · Published in The Startup · 13 min read · Sep 9, 2020

I graduated from university this summer in the midst of the current global situation (big ups to the Class of 2020!), and for my Artificial Intelligence module (CSY3025), I had to create an image classifier able to recognise fruits & vegetables. In this article, I will show you how to create your own Convolutional Neural Network (CNN) and explain every step I took to complete my AI project.

Keras — A Python Neural Network Library

Creating the workspace

I used Google Colab, a cloud-based environment similar to Jupyter Notebook, to create a notebook containing all the code for the project. After creating the notebook, I changed its settings by setting the Hardware Accelerator to GPU, which can be done by clicking Edit and then Notebook settings. It also helps to create a folder on your desktop that we will import and export files to later, e.g. image_classifier. If you want to code along, I'd advise you to start your runtime now and run each cell as we go, or you can just Run All cells at the end.
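To confirm the runtime actually picked up a GPU (a quick check of my own, not part of the original notebook), you can run something like the following in a cell:

# sanity check: the runtime should report at least one GPU device
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))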

Importing the dataset and required packages

To import the dataset we will be using, you will have to register an account on Kaggle. Once your Kaggle account has been created, access the My Account page and scroll down until you see the API section. Click Create New API Token, which should generate a kaggle.json file that you can save into the folder we created earlier. Now that we have the API token in our image_classifier directory, we can start working on our project! (*inserts everybody say yeah meme*)

Kaggle API section

Next, place the following in a code cell. We will be separating the code into different cells to make errors easier to identify. This cell will let you upload the API token we obtained earlier into the workspace when we run all cells later. This is my preferred approach, as it allows us to download the dataset through the notebook, without having to download it to our local machine and then upload it again.

# import files
from google.colab import files
files.upload()

To make it easier to follow, and so I don't repeat myself a lot, I'd advise you to create a new code cell per code snippet you see from now on. Create a new code cell and let's modify the permissions of the API file.

# modifying the permission of the kaggle file
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
print('[INFO] API token permission modified!')

Next, we will download the Fruits 360 dataset on Kaggle by Mihai Oltean. The print statements let us see what is happening and give us terminal-like output.

# downloading the dataset
!kaggle datasets download -d moltean/fruits
print('[INFO] dataset downloaded!')

Now, we will extract the contents from the zipped dataset we have just downloaded.

# extracting the contents of the zipped dataset from the kaggle source
!unzip -q fruits.zip
print('[INFO] dataset unzipped!')

We have our dataset ready, let’s import the required packages!

# importing required packages
from keras import *  # importing everything from keras (mainly for the Keras models, layers, optimizers etc.)
from keras.applications.vgg19 import VGG19, preprocess_input  # importing the pretrained model
import datetime  # to calculate the training duration
import os  # to operate with the file system of the project
import pandas as pd  # to be used along with the confusion matrix
import seaborn as sn  # to be used along with the confusion matrix
from imutils import paths  # to operate with the path structure of the project
import matplotlib.pyplot as plt  # to plot the results over the given training duration
from sklearn.metrics import confusion_matrix, classification_report  # to produce a confusion matrix and classification report
from keras.preprocessing.image import ImageDataGenerator  # to preprocess the image data before feeding it into the model
from numpy import argmax  # to return the index of the max element in an array (used with the confusion matrix)

Setting up the Training and Test directories

Before we go any further, make sure that the contents of the zipped file have been extracted completely. Click the folder icon on the left panel, and check that the Fruits-360 folder shows all the classes available and their samples.
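If you prefer to check from code rather than the file browser (a small addition of my own), a couple of shell commands do the same job:

# sanity check: list the extracted dataset folders and a few class directories
!ls /content/fruits-360
!ls /content/fruits-360/Training | head -5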

The following code snippet will locate the Training and Test directories from the dataset and also check the number of classes, as the dataset is updated often. Using len(labels) will also help us later, as it automatically picks up the number of classes by reading the folders in the training directory, without us having to count the subdirectories under the Training folder by hand.

# locating the training directory
train_path = '/content/fruits-360/Training'
# locating the testing directory
test_path = '/content/fruits-360/Test'
use_label_file = False  # set this to True to load the label names from the file defined below; the file should contain one label name per line
label_file = 'labels.txt'
output_dir = 'output_files'
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
if use_label_file:
    with open(label_file, "r") as f:
        labels = [x.strip() for x in f.readlines()]
else:
    labels = os.listdir(train_path)
# prints the names of all classes
print(labels)
# prints the number of classes available
print(len(labels))

Sample Checking

This is essentially checking the number of samples in the Training and Test folders. This can be useful for when we perform a validation split — remember this for later.

# directories
totalTrain = len(list(paths.list_images(train_path)))
totalTest = len(list(paths.list_images(test_path)))
total = totalTrain + totalTest
# output
print("[INFO] There are " + str(totalTrain) + " samples in the Train path")
print("[INFO] There are " + str(totalTest) + " samples in the Test path")
print("[INFO] There are a total of " + str(total) + " samples")

Setting Global Parameters and Functions

Let's set up the global parameters. Keeping them in one place makes it easier to swap in different values later, like a different optimizer or learning rate. This code snippet does the following: sets image_size to 100 because each image is 100×100 pixels, sets batch_size so 128 samples are drawn from the dataset per step, sets frozen_epochs to the number of epochs the initial (frozen-base) model will be trained for, and sets trained_epochs to the number of epochs the fine-tuned model will be trained for.

# setting global parameters for easier modification
print("[INFO] Setting up global parameters...")
image_size = 100
batch_size = 128
frozen_epochs = 40
trained_epochs = 2
opt = optimizers.Adam(lr=1e-5)

Next, we will declare a build_data_generators function, which will create our training, validation, and testing data generators. We also apply data augmentation to the training samples: pixel scaling, shifting (height, width and zoom) and flipping (horizontal and vertical), which reduces the chances of the CNN seeing the exact same image twice and provides more variety.

# a function which generates the data generators for training, validation and testing
print("[INFO] building data generators...\n")

def build_data_generators(train_folder, test_folder, validation_percent, labels=None, image_size=image_size, batch_size=batch_size):
    train_datagen = ImageDataGenerator(
        rescale=1./255,  # pixel scaling
        width_shift_range=0.1,  # randomly shift width by up to 10%
        height_shift_range=0.1,  # randomly shift height by up to 10%
        zoom_range=0.2,  # randomly zoom images by up to 20%
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=True,  # randomly flip images vertically
        validation_split=validation_percent)  # fraction of training data held out for validation
    test_datagen = ImageDataGenerator(rescale=1./255)
    train_gen = train_datagen.flow_from_directory(
        train_folder,
        target_size=(image_size, image_size),
        class_mode='sparse',
        batch_size=batch_size,
        shuffle=True,
        subset='training',
        classes=labels)
    val_gen = train_datagen.flow_from_directory(
        train_folder,
        target_size=(image_size, image_size),
        class_mode='sparse',
        batch_size=batch_size,
        shuffle=False,
        subset='validation',
        classes=labels)
    test_gen = test_datagen.flow_from_directory(
        test_folder,
        target_size=(image_size, image_size),
        class_mode='sparse',
        batch_size=batch_size,
        shuffle=False,
        subset=None,
        classes=labels)
    return train_gen, val_gen, test_gen

The following function will allow us to plot a confusion matrix. This matrix will be used to evaluate the model’s accuracy at recognising different classes.

# a function to plot the confusion matrix
def plot_confusion_matrix(y_true, y_pred, classes, out_path=""):
    cm = confusion_matrix(y_true, y_pred)
    df_cm = pd.DataFrame(cm, index=[i for i in classes], columns=[i for i in classes])
    plt.figure(figsize=(40, 40))
    ax = sn.heatmap(df_cm, annot=True, square=True, fmt="d", linewidths=.2, cbar_kws={"shrink": 0.8})
    if out_path:
        plt.savefig(out_path)  # saving the matrix as a file in the output directory
    return ax

And now, we will initialise our data generators. Remember the validation split I mentioned earlier? It sets apart a portion of the training samples to use as validation samples. In this case, the validation split is set to 0.3 (or 30%), which means the samples in the Training directory are split 70% training and 30% validation. This feature is useful when a dataset has no dedicated validation set, like the one we are using now.

# initialising the generators
print("[INFO] initialising data generators...")
train_gen, val_gen, test_gen = build_data_generators(
    train_path,
    test_path,
    validation_percent=0.3,
    labels=labels,
    image_size=image_size,
    batch_size=batch_size)
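As an optional sanity check (my own addition, not part of the original notebook), you can pull a single batch from train_gen and plot it to see what the augmented samples actually look like:

# previewing one batch of augmented training images
images, batch_labels = next(train_gen)  # one batch of augmented images and their sparse labels
plt.figure(figsize=(8, 8))
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(images[i])  # already rescaled to [0, 1]
    plt.title(labels[int(batch_labels[i])], fontsize=8)
    plt.axis('off')
plt.show()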

Creating the model structure

We initialise the convolutional base without its original classifier head, a bit of network surgery, as we will be attaching our own custom classifier onto the model. This convolutional base is the VGG19 model pre-trained on ImageNet data.

# loading the VGG19 network
conv_base = VGG19(weights='imagenet',
                  include_top=False,
                  input_shape=(image_size, image_size, 3))

Next, we will initialise our model by connecting the base and our custom classifier together.

# creating the model
model = models.Sequential()
# adding the VGG19 conv base to the model
model.add(conv_base)
model.add(layers.Dropout(0.4))
model.add(layers.Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(layers.Dense(256, activation='relu', name='fc_1'))
model.add(layers.Dense(128, activation='relu', name='fc_2'))
model.add(layers.Dense(len(labels), activation='softmax', name='predictions'))

Let’s check how many parameters are available now.

# checking the number of trainable parameters - pre-frozen convolutional base
model.summary()

Now let’s freeze our VGG19 base. This will prevent the weights from the convolutional base from being updated during training.

# freezing the convolutional base
conv_base.trainable = False

Let's check again how many trainable parameters are available now. What we've just done is freeze the conv_base, which lets us train only the model's classifier. The pre-trained base has already learnt useful features, whilst our classifier has not been trained at all, and training them together from the start would propagate large updates that destroy the weights the conv_base learnt previously. So, we will train our custom classifier first, then perform some fine-tuning.

# checking the number of trainable parameters - post-frozen convolutional base
model.summary()

Training the custom classifier

Let's compile our model. Remember how we set the opt variable in the global parameters? It will also be used on the final model, allowing us to change the optimizer for both stages of training in one place.

# compiling the model
print("[INFO] compiling model...")
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=opt,
              metrics=['acc'])

Now we are at the juicy part, let’s train our classifier! We’re using the datetime package we imported earlier to get the training duration for the frozen stage of the model.

# training the model & getting the training duration
print("[INFO] training model...\n")
start = datetime.datetime.now()
history = model.fit_generator(
    train_gen,
    steps_per_epoch=(len(train_gen.filenames) // batch_size) + 1,
    epochs=frozen_epochs,
    validation_data=val_gen,
    validation_steps=(len(val_gen.filenames) // batch_size) + 1,
    verbose=1)
end = datetime.datetime.now()
elapsed = end - start

Print out the time it took to train the model.

print('training duration: ', elapsed)

Visualising classifier model results

Now that we have trained the model, let's visualise the results, instead of just reading the per-epoch log lines.

# plotting performance
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, 'blue', label='Training acc')
plt.plot(epochs, val_acc, 'red', label='Validation acc')
plt.title('Classifier: accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'blue', label='Training loss')
plt.plot(epochs, val_loss, 'red', label='Validation loss')
plt.title('Classifier: loss')
plt.legend()
plt.show()

This should display two charts visualising our accuracy and loss results. Here are mine.

Frozen Base Accuracy Results (left), Frozen Base Loss Results (right)

Let’s evaluate the validation and testing accuracies of the model.

val_gen.reset()
loss_v, accuracy_v = model.evaluate(val_gen, steps=(val_gen.n // batch_size) + 1, verbose=1)
loss, accuracy = model.evaluate(test_gen, steps=(test_gen.n // batch_size) + 1, verbose=1)
print("[INFO] Validation: accuracy = %f  ;  loss_v = %f" % (accuracy_v, loss_v))
print("[INFO] Test: accuracy = %f  ;  loss = %f" % (accuracy, loss))
Validation and Test Accuracy Evaluation — Classifier Model

This resulted in my classifier model achieving 82.43% validation accuracy and 83.6% testing accuracy. We can push this further with some fine-tuning. In the meantime, let's generate the confusion matrix to see visually how accurate our model is at classifying the produce so far.

# running predictions
print('[INFO] running predictions for classifier model...\n')
y_pred = model.predict(test_gen, steps=(test_gen.n // batch_size) + 1, verbose=1)
y_true = test_gen.classes[test_gen.index_array]
plot_confusion_matrix(y_true, y_pred.argmax(axis=-1), labels, out_path=(output_dir + "/base_confusion_matrix.png"))
class_report = classification_report(y_true, y_pred.argmax(axis=-1), target_names=labels)

As you can see, even though we achieved around 80% accuracy, there are multiple instances where the model misclassifies different types of produce, visible as the coloured cells away from the diagonal in the confusion matrix. Let's keep working to improve this.
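To pin down exactly which pairs get mixed up, rather than eyeballing the heatmap, a short sketch like the following should work (my own addition; it reuses y_true and y_pred from the prediction cell above):

# listing the ten most-confused class pairs from the confusion matrix
import numpy as np

cm = confusion_matrix(y_true, y_pred.argmax(axis=-1))
np.fill_diagonal(cm, 0)  # zero out correct predictions, leaving only the errors
# sort every cell by count (descending) and report the top ten
top = np.argsort(cm.ravel())[::-1][:10]
for true_idx, pred_idx in zip(*np.unravel_index(top, cm.shape)):
    if cm[true_idx, pred_idx] > 0:
        print(f"{labels[true_idx]} -> {labels[pred_idx]}: {cm[true_idx, pred_idx]} errors")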

For now, store the classifier weights. We will make use of them soon.

print('[INFO] locating output directory...')
BASE_PATH = os.path.sep.join(["output_files", "base_weights.h5"])
model.save(BASE_PATH)
print('[INFO] classifier weights have been stored!')

Initialising fine-tuned model

We will initialise the same architecture under a different name, which will hold both the pre-trained base's weights and the weights of our newly trained classifier! This lets us carry the knowledge from the classifier model over to this target model.

# FT model
top_model = models.Sequential()
top_model.add(conv_base)
top_model.add(layers.Dropout(0.4))
top_model.add(layers.Flatten())  # this converts our 3D feature maps to 1D feature vectors
top_model.add(layers.Dense(256, activation='relu', name='fc_1'))
top_model.add(layers.Dense(128, activation='relu', name='fc_2'))
top_model.add(layers.Dense(len(labels), activation='softmax', name='predictions'))
print('[INFO] fine-tuned model initialised!')

Load the classifier weights we saved before to the target model.

print('[INFO] loading classifier weights...')
top_model.load_weights('/content/output_files/base_weights.h5')
print('[INFO] classifier weights loaded!')

Now that our classifier weights have been loaded onto the model, let’s do a quick parameter check.

# checking the amount of trainable parameters in the frozen model
top_model.summary()

Let’s unfreeze our base by setting all conv_base layers to be trainable.

# unfreezing base
conv_base.trainable = True

Now, let's freeze the earliest layers of the conv_base so that only its later convolutional layers remain trainable, and display the status of each layer in the base.

# reset our data generators
train_gen.reset()
val_gen.reset()
# now that the head FC layers have been trained/initialised,
# freeze the first ten layers and leave the later CONV layers trainable
for layer in conv_base.layers[:10]:
    layer.trainable = False
# loop over the layers in the model and show their trainable status
for layer in conv_base.layers:
    print("{}: {}".format(layer, layer.trainable))

Again, let’s perform a quick parameter check.

# checking the amount of trainable parameters in the unfrozen model
top_model.summary()

Compile the fine-tuned model.

# compiling the model
print("[INFO] recompiling model...")
top_model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=opt,
                  metrics=['acc'])

Once the model has been compiled, let’s train the model for a final round! *ding ding ding*

# training the model
print("[INFO] training FT model...")
start = datetime.datetime.now()
history = top_model.fit_generator(
    train_gen,
    steps_per_epoch=(len(train_gen.filenames) // batch_size) + 1,
    epochs=trained_epochs,
    validation_data=val_gen,
    validation_steps=(len(val_gen.filenames) // batch_size) + 1,
    verbose=1)
end = datetime.datetime.now()
elapsed1 = end - start

Visualising fine-tuned model results

Now that the model has finished training, let’s plot the performance history of our fine-tuned model.

# plotting performance
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, 'blue', label='Training acc')
plt.plot(epochs, val_acc, 'red', label='Validation acc')
plt.title('Fine Tuned Model: accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'blue', label='Training loss')
plt.plot(epochs, val_loss, 'red', label='Validation loss')
plt.title('Fine Tuned Model: loss')
plt.legend()
plt.show()

These are the results I achieved. You can see that the model reaches around 98% training accuracy and 94% validation accuracy.

Fine-Tuned Model Accuracy (left), Fine-Tuned Model Loss (right)

Let’s evaluate the model on validation and test samples.

val_gen.reset()
loss_v, accuracy_v = top_model.evaluate(val_gen, steps=(val_gen.n // batch_size) + 1, verbose=1)
loss, accuracy = top_model.evaluate(test_gen, steps=(test_gen.n // batch_size) + 1, verbose=1)
print("[INFO] Validation: accuracy = %f  ;  loss_v = %f" % (accuracy_v, loss_v))
print("[INFO] Test: accuracy = %f  ;  loss = %f" % (accuracy, loss))
Validation and Test Accuracy Evaluation — Fine-tuned Model

This resulted in my fine-tuned model achieving 94.09% validation accuracy and 94.87% testing accuracy! Let's generate the confusion matrix again to see visually how the model's learning has improved since the previous round.

# running predictions
print('[INFO] running predictions for final model...\n')
y_pred = top_model.predict(test_gen, steps=(test_gen.n // batch_size) + 1, verbose=1)
y_true = test_gen.classes[test_gen.index_array]
plot_confusion_matrix(y_true, y_pred.argmax(axis=-1), labels, out_path=(output_dir + "/ft_confusion_matrix.png"))
class_report = classification_report(y_true, y_pred.argmax(axis=-1), target_names=labels)
Fine-Tuned Model Confusion Matrix

We can see a significant improvement over the previous confusion matrix, as there are fewer instances of misclassification than before. Let's see the total time we spent training the models to achieve these results.

print('[INFO] Training Model Duration: ', elapsed1)
total = elapsed + elapsed1
print('[INFO] Total Training Duration: ', total)

We can now store the weights of our final model, which can be used for another project at a later time!

print('[INFO] locating output directory...')
TOP_PATH = os.path.sep.join(["output_files", "final_weights.h5"])
top_model.save(TOP_PATH)
print('[INFO] final model weights have been stored!')
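As a quick sketch of how you might pick the model back up in a later session (my own addition; it assumes output_files/final_weights.h5 is still present or has been re-uploaded):

# reloading the saved model in a fresh session
from keras.models import load_model

reloaded = load_model('output_files/final_weights.h5')  # restores architecture + weights in one go
reloaded.summary()  # confirm it matches what we trained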

And with that, we have created a successful image classifier! From freezing the conv_base and performing network surgery to train our custom classifier, to fine-tuning the model so it becomes attuned to the dataset, we have watched the accuracy improve, backed by the results achieved along the way. We then visualised both training rounds with the figures and confusion matrices to see how far the model has come since the start of the project. You can save the images of the figures and confusion matrices we plotted earlier by right-clicking them and saving them to the image_classifier directory we created at the start.

Future work for this project would be to experiment with additional techniques for classifying produce, to improve on the results we achieved today.

I welcome feedback and constructive criticism as I’m still learning about AI techniques :)

Link to the notebook repo on my GitHub.

Thank you for reading!

www.enrictrillo.com
