DC Comics Logo Classifier

Training an Image classifier from scratch using TensorFlow 2.0

Published in Analytics Vidhya · 5 min read · Oct 15, 2019

We will train a convolutional neural network (CNN) to classify the logo of a particular character. In this example, I took five different characters, namely Batman, Superman, Green Lantern, Wonder Woman and Flash. This is an end-to-end article: it covers every step from collecting the data to saving the trained model.

For example, given an image of the Batman logo as input, our predicted class will be ‘Batman’.

Prerequisites

Knowledge of Python
A Google account, as we will be using Google Colab

So time to get our hands dirty!

First, we will collect data using GoogleImagesDownload, a very handy Python package for downloading images from Google search. We will download images for each class (here we have five classes: Batman, Superman, Green Lantern, Wonder Woman and Flash). Please refer to the tool's documentation for details on using it.

Here’s a link to ChromeDriver, if you face trouble finding it.

googleimagesdownload --keywords "batman logo" --chromedriver chromedriver --limit 300

I ran the above command in the command prompt once per class, changing the search keywords each time. The tool also downloads files with other extensions, so I kept only the files with a .jpg extension. I had to manually delete some irrelevant images, and then I renamed the remaining ones. I did this for each class. The scripts for selecting only .jpg files and renaming them are in the GitHub repository. You just have to adjust the paths before executing them.
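The selecting and renaming step can be sketched roughly as follows. This is not the script from the repository, only a minimal illustration under my own assumptions: the function name select_and_rename_jpgs, the prefix_N.jpg naming scheme and the choice to copy files into a separate folder (instead of deleting in place) are all hypothetical.

```python
import shutil
from pathlib import Path


def select_and_rename_jpgs(src_dir, dst_dir, prefix):
    """Copy every .jpg file from src_dir into dst_dir as prefix_1.jpg, prefix_2.jpg, ...

    Files with any other extension are skipped. The source folder is left untouched.
    Returns the list of new file names in the order they were written.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    # Sort so that the numbering is the same on every run.
    jpgs = sorted(
        p for p in src.iterdir()
        if p.is_file() and p.suffix.lower() == ".jpg"
    )

    new_names = []
    for i, path in enumerate(jpgs, start=1):
        target = dst / f"{prefix}_{i}.jpg"
        shutil.copy2(path, target)
        new_names.append(target.name)
    return new_names
```

For example, select_and_rename_jpgs("downloads/batman logo", "data/batman", "batman") would fill data/batman with batman_1.jpg, batman_2.jpg and so on. Both paths here are placeholders.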

Finally, I made a folder named data that consisted of images for each class. The hierarchy of directories looked as shown in the snapshot below.
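Before uploading, it can help to check that the folder really has one sub-folder per class and to see how many images each class ended up with. The helper below is a small sketch assuming the layout data/<class name>/<images>. The function name count_images_per_class is my own and is not part of the repository.

```python
from pathlib import Path


def count_images_per_class(data_dir):
    """Return {class_name: number_of_jpg_files} for each sub-folder of data_dir."""
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for p in class_dir.iterdir()
                if p.is_file() and p.suffix.lower() == ".jpg"
            )
    return counts
```

Calling count_images_per_class("data") should return a dictionary with five keys, one per character. A very small count for one class is a hint that more images are needed for it.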

Now upload this folder to your Google Drive. After uploading it, create a new Colab notebook. Google Colab gives us a Jupyter environment. You can refer to the Jupyter notebook in the GitHub repository. We will start with preprocessing, then define the model and train it.

!pip install tensorflow==2.0

So in the first cell, we installed TensorFlow 2.0. Now we will import all the packages we need.

import cv2
import os
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.utils import shuffle
from tensorflow.keras import layers, models
from google.colab import drive
drive.mount('/content/drive')

We are using cv2 for processing images, os for dealing with paths, numpy for array handling and matplotlib for plotting. TensorFlow will be used for defining and training the model. I have used shuffle from sklearn.utils to shuffle the image data while performing the train-test split. Finally, drive from google.colab will be used to mount Google Drive on the Colab notebook.

After the last line of the above cell is executed, it prints a link for obtaining a verification token. Once you enter the token, Google Drive is mounted on the Colab notebook. I have defined two functions, loadTrain() and readData(). loadTrain() handles image preprocessing, which includes resizing, normalizing and assigning labels to the corresponding images. readData() returns the training and validation sets.
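The full loadTrain() is in the notebook. As a rough illustration of the normalizing and labelling steps only, here is a minimal sketch. It assumes that pixel values are scaled to the range 0 to 1 and that labels are one-hot vectors. Both are my assumptions about what the preprocessing does, and the function name preprocess is hypothetical. Resizing is left out: the image is assumed to have already been resized, for instance with cv2.resize.

```python
import numpy as np


def preprocess(image, class_index, num_classes):
    """Scale a uint8 image to [0, 1] and build a one-hot label for its class.

    image       -- array of shape (height, width, channels) with dtype uint8
    class_index -- position of the image's class in the list of classes
    num_classes -- total number of classes
    """
    pixels = image.astype(np.float32) / 255.0
    label = np.zeros(num_classes, dtype=np.float32)
    label[class_index] = 1.0
    return pixels, label
```

With five classes, an image belonging to the third class gets the label [0, 0, 1, 0, 0].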

validationSize = 0.2
imageSize = 128
numChannels = 3
dataPath = "/content/drive/My Drive/comic/data"
classes = os.listdir(dataPath)
numClasses = len(classes)
print("Number of classes : ", numClasses, classes)
print("Training data Path : ",dataPath)

Note: Take care of the path of the data folder. Here I kept it under comic.

Here validationSize is set to 0.2, so 80% of the images are used for training and the remaining 20% are held out for testing. imageSize specifies the width and height, in pixels, to which each image is resized before it is given to the model. numChannels is 3 because each image is read with three color channels.
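To make the effect of validationSize concrete, here is a small sketch of a shuffle-then-split step using only the standard library. It is not the code inside readData(). The function name train_validation_split and the seed parameter are assumptions made for illustration.

```python
import random


def train_validation_split(items, validation_size=0.2, seed=0):
    """Shuffle items and split them into (training, validation) lists.

    validation_size is the fraction of items held out for validation.
    A fixed seed makes the shuffle reproducible.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_val = int(validation_size * len(items))
    return items[n_val:], items[:n_val]
```

With 100 images and validation_size = 0.2, this yields 80 training images and 20 held-out images. Shuffling first matters because the images are loaded class by class, so an unshuffled split could leave an entire class out of the training set.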

data = readData(dataPath,classes,imageSize,validationSize)

X_train,y_train,names_train,cls_train = data.train.getData()
X_test,y_test,names_test,cls_test = data.valid.getData()

print("Training data X : " , X_train.shape)
print("Training data y : " , y_train.shape)
print("Testing data X : ",X_test.shape)
print("Testing data y : ",y_test.shape)

Now we have our train and test data ready. Time to define our model and train it.

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(5, activation='softmax'))

The model starts with a Conv layer followed by a max-pooling layer. The input shape is 128*128*3, as we resized our images to 128*128 resolution and 3 is the number of channels. Then we have another Conv layer followed by max-pooling, and then a third Conv layer whose output tensor is flattened by the next layer. Finally, a dense layer is connected to our output layer. The output layer consists of 5 units, as we have five classes for classification.
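To see how the tensor shrinks on its way through these layers, the arithmetic can be worked out by hand. The sketch below assumes the Keras defaults used above: Conv2D with 'valid' padding and stride 1, and MaxPooling2D with stride equal to the pool size. The helper names are my own.

```python
def conv_out(size, kernel=3):
    """Width/height after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Width/height after max-pooling with stride equal to the pool size."""
    return size // pool


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


size = 128
size = conv_out(size)   # 126 after the first Conv2D
size = pool_out(size)   # 63 after the first MaxPooling2D
size = conv_out(size)   # 61 after the second Conv2D
size = pool_out(size)   # 30 after the second MaxPooling2D
size = conv_out(size)   # 28 after the third Conv2D

flattened = size * size * 64  # 28 * 28 * 64 = 50176 values enter the dense layer

total = (
    conv_params(3, 32)
    + conv_params(32, 64)
    + conv_params(64, 64)
    + dense_params(flattened, 64)
    + dense_params(64, 5)
)
print(size, flattened, total)  # 28 50176 3267973
```

Almost all of the roughly 3.27 million parameters sit in the first dense layer, because it is connected to every one of the 50176 flattened values.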

model.summary()

We get a summary of our defined model. Now it’s time to train.

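# Assumption: labels are one-hot encoded, hence categorical cross-entropy.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=4,
                    validation_data=(X_test, y_test))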

The plots and accuracy metrics are also included. You can check them in my Jupyter notebook.

model.save("comic.h5")

We save our model in a .h5 file, but this file is local to the Colab session, so we will also save it to Google Drive.

!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive 
from google.colab import auth 
from oauth2client.client import GoogleCredentials

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()          
drive = GoogleDrive(gauth)
model_file = drive.CreateFile({'title' : 'comic.h5'})                       
model_file.SetContentFile('comic.h5')                       
model_file.Upload()
drive.CreateFile({'id': model_file.get('id')})

Our trained model is now saved to Google Drive, from where it can be easily downloaded.
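Since Google Drive was already mounted at /content/drive earlier in the notebook, a simpler alternative would be to copy the file into the mounted folder. This is only a sketch of that idea, assuming the mount is still active. The function name copy_model_to_drive is hypothetical.

```python
import shutil
from pathlib import Path


def copy_model_to_drive(model_path, drive_dir):
    """Copy a saved model file into a folder and return the path of the copy.

    drive_dir is created if it does not exist yet.
    """
    drive_dir = Path(drive_dir)
    drive_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(model_path, drive_dir))
```

In the notebook this would be called as copy_model_to_drive("comic.h5", "/content/drive/My Drive/comic"), which places the model next to the data folder.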

To classify an image, I have written a script named classify.py. We pass the path of our image as a CLI argument, and the output is the predicted class. The model actually returns a probability for each class, and we select the class with the maximum probability. Sometimes the model makes a wrong classification. I am working on improving the accuracy, which is currently 80%.
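The two ideas in classify.py, reading the image path from the command line and picking the class with the highest probability, can be sketched as below. This is not the actual script. The function names and the order of CLASS_NAMES are hypothetical. In practice the order must match the order of classes used during training.

```python
import argparse

# Hypothetical order: it has to match the class order used when training.
CLASS_NAMES = ["batman", "flash", "green lantern", "superman", "wonder woman"]


def predicted_class(probabilities, class_names):
    """Return (class_name, probability) for the most probable class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]


def parse_args(argv=None):
    """Read the image path given on the command line."""
    parser = argparse.ArgumentParser(description="Classify a DC Comics logo image.")
    parser.add_argument("image_path", help="path to the image to classify")
    return parser.parse_args(argv)
```

Here probabilities stands for the single row of five values that the softmax output layer produces for one image. If that row were [0.7, 0.05, 0.1, 0.1, 0.05], the function would return batman with probability 0.7.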

Here, the predicted class for the above logo is **Batman**.

Here’s a link to my Github repo.

Next, I will try to write about my experience of deploying the model as an API on a cloud platform. You can also deploy it on mobile devices by converting the model to its lite version and saving it to a .tflite file. Refer to the TensorFlow Lite documentation for further details. Feel free to connect with me on LinkedIn, Github, and Instagram. Let me know about any improvements.

Thank You!

Written by Aniruddha Tonge