Creating a CNN using Keras for GTSRB

Aditya Mehrotra
Published in Analytics Vidhya
6 min read · Nov 26, 2019

I created a CNN model for the GTSRB dataset; you can check it out on Kaggle here: https://www.kaggle.com/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign.

Context: The GTSRB (German Traffic Sign Recognition Benchmark) dataset serves as one of many benchmarks for multiclass classification algorithms. It comprises 43 classes, where each class is a different road sign. The goal of the CNN is to classify these images with high accuracy.

In this article, I will be describing my approach, code, and results for this dataset.

My Approach

  1. Examine the data: After looking at the data for about 20 minutes, I decided that image augmentation wasn’t required: there is substantial data for each class to train my model, and the training set already has good variance in the types of pictures used for every class (some have more blur and noise than others). Since data augmentation isn’t needed, I was able to move straight on to the next step.
  2. Create my training and testing sets: I created my training and testing sets using the data given. My outputs (Y) are the labels of the images (their classes) and my inputs (X) are the array representations of the images. From examining the data, I found that separate training and testing sets are already given within the dataset folder, so I don’t need to split the data into a training-testing ratio myself. The order of the sets doesn’t matter because the data is shuffled when it is fit to the model later on.
  3. Define the architecture of the model: I created the architecture of the CNN model. The architecture is custom-made; I’ve found that this kind of simple stacked design works well for many straightforward image tasks like GTSRB. It can vary depending on what you want, as there are many different architectures out there such as VGG, AlexNet and YOLO. No matter which architecture you use, though, the last layer must be a dense layer with the number of output neurons equal to the number of classes (see the sketch just after this list).
  4. Fit the data to the model after setting hyperparameters and wait for results!
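
For example, the classification head for GTSRB would end in a 43-neuron dense layer with a softmax activation. Here is a minimal sketch of just that head (the input size is a placeholder for illustration; the full model comes later in Step 3):

from keras.models import Sequential
from keras.layers import Flatten, Dense, Activation

# Only the classification head: whatever feature-extraction layers come before,
# the output must be Dense(43) + softmax so each class gets a probability.
head = Sequential()
head.add(Flatten(input_shape=(28, 28, 3)))   # placeholder input shape for illustration
head.add(Dense(43))                          # 43 output neurons = 43 sign classes
head.add(Activation('softmax'))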

Walking through the code:

Step 1: Imports

# import tensorflow as tf  # (optional, depending on how you want to configure your GPU)
import keras
from keras.models import Sequential, save_model
from keras.layers import Conv2D, MaxPooling2D, Dense, Dropout, Activation, Flatten
import cv2 as cv                  # OpenCV for reading and resizing images
import matplotlib.pyplot as plt   # for plotting accuracy/loss later
import numpy as np
import pandas as pd               # for reading the test-set labels from the .CSV
import os                         # for iterating through the image folders

I imported all the libraries I need for preprocessing and model initialization at the beginning of my code. TensorFlow also exposes some configuration options for your local GPU, so you can import it and set that up if you like.
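
If you do want to configure the GPU, a minimal sketch looks something like this. It assumes the TF 1.x-style session API that standalone Keras used around this time (newer TensorFlow versions use tf.config instead), so treat it as a starting point rather than the exact setup I used:

import tensorflow as tf
import keras.backend as K

# allow_growth stops TensorFlow from grabbing all GPU memory up front,
# which is handy if you share the GPU with other processes.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))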

Step 2: Creating training and testing sets

train_X = []
train_y = []
for i in range(0, 43):                        # one folder per class, named 0..42
    n = str(i)
    train_Path = "gtsrb-german-traffic-sign/Train/" + n
    label = [0 for _ in range(0, 43)]         # one-hot label for class i
    label[i] = 1
    for filename in os.listdir(train_Path):
        img = cv.imread(train_Path + "/" + filename)
        img = cv.resize(img, (28, 28))        # every image resized to 28x28
        print(filename)
        train_X.append(img)
        train_y.append(label)
train_X = np.asarray(train_X)
train_X = train_X / 255                       # scale pixel values to [0, 1]
train_X = np.asarray(train_X, dtype="float32")
train_y = np.asarray(train_y, dtype="float32")

I iterated through all the class folders and appended the array representations of the images, which have dimensions (N, K, 3), where N and K vary with each image’s size. For the labels, I created an array of 43 zeroes (there are 43 classes) and set the index corresponding to the image’s class to 1. Since the dataset is zero-indexed, I can simply update the index matching the class of the image. At the end, I converted my arrays from uint8 to float32, because it’s a relatively safe datatype for training models; however, I divided my values by 255 beforehand so the typecast doesn’t mangle the pictures. I also resized my images to 28x28: the images within the training set all have different sizes, but none are smaller than 28x28, so these were the best dimensions to resize to in order to minimize lost features.
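
As a side note, Keras can build the same one-hot labels with to_categorical instead of a hand-rolled list of zeroes. A small sketch, assuming the class index of each image has been collected in a plain Python list (the list below is just made-up example data):

from keras.utils import to_categorical

class_indices = [0, 0, 1, 5, 42]                          # hypothetical: one class id per image
labels = to_categorical(class_indices, num_classes=43)    # shape (5, 43), one one-hot row per image
labels = labels.astype("float32")
print(labels.shape)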

# The test labels live in a .CSV file shipped with the dataset; the file name
# below is an assumption based on the Kaggle folder layout.
df = pd.read_csv("gtsrb-german-traffic-sign/Test.csv")

counter = 0
test_X = []
test_y = []
test_Path = "gtsrb-german-traffic-sign/Test"
for filename in os.listdir(test_Path):
    img = cv.imread(test_Path + "/" + filename)
    img = cv.resize(img, (28, 28))
    label = [0 for _ in range(0, 43)]
    label[df.loc[counter][6]] = 1    # 6 is the class column # within the .CSV file
    print(filename)
    test_X.append(img)
    test_y.append(label)
    counter += 1
test_X = np.asarray(test_X)
test_X = test_X / 255                # scale pixel values to [0, 1]
test_X = np.asarray(test_X, dtype="float32")
test_y = np.asarray(test_y, dtype="float32")

I used the same logic as above to create my testing set, but here I had to use a bit of pandas (the .loc method) to read the column of the .CSV file that holds each picture’s class and build my labels from it.
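
If you are ever unsure which column of that .CSV actually holds the class ids, a quick (hypothetical) peek at the dataframe clears it up before trusting index 6:

print(df.columns)        # column names of the test .CSV; the class column should sit at index 6
print(df.loc[0][6])      # the class id used to build the one-hot label for the first row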

Step 3: Create the model

model = Sequential()

# Block 1: two 3x3 convolutions, then downsample and regularize
model.add(Conv2D(32, (3, 3), padding='same', input_shape=train_X.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# Block 2: same pattern with more filters
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# Classifier head: flatten, one hidden dense layer, then a 43-way softmax
model.add(Flatten())
model.add(Dense(392))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(43))
model.add(Activation('softmax'))

I created a simple CNN for this task. I used a mix of convolutional and pooling layers to reduce the amount of data I need to feed into my dense network, plus a bit of dropout to prevent overfitting. I won’t go into detail about how this works in this article, but if you don’t understand what’s going on, this is a really good article for intuition: https://adventuresinmachinelearning.com/keras-tutorial-cnn-11-lines/
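
To sanity-check the architecture (for example, how the two pooling stages shrink the 28x28 input before the Flatten layer), you can print a layer-by-layer summary once the model is built:

# Prints each layer's output shape and parameter count, which makes it easy to
# see what the dense layers actually receive after the convolutions and pooling.
model.summary()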

Step 4: Initialize some hyperparameters and train the model

opt = keras.optimizers.RMSprop(lr=0.0001, decay=1e-6)
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
history = model.fit(train_X, train_y,
                    batch_size=16,
                    epochs=6,
                    validation_data=(test_X, test_y),
                    shuffle=True)

For this task, I used RMSprop with a learning rate of 0.0001 and categorical cross-entropy as my loss. I also tracked accuracy as a metric because it gives me a better picture of the model’s performance than the loss value alone.
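
Once training finishes, the save_model import from earlier can be used to write the trained model to disk; the file name here is just an example:

# Persist the trained model (architecture + weights) so it can be reloaded later
# without retraining.
save_model(model, "gtsrb_cnn.h5")

# Loading it back later would look like:
# from keras.models import load_model
# model = load_model("gtsrb_cnn.h5")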

Results

After running the model, I ended up with an accuracy of 94%, which is fine.

A visual using matplotlib for my accuracy over time
A visual using matplotlib for my loss over time
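
Both plots are drawn from the history object returned by model.fit; a minimal sketch of how to produce them (the metric keys are 'acc'/'val_acc' in older Keras and 'accuracy'/'val_accuracy' in newer versions):

# Accuracy over time for the training set and the validation (test) set.
plt.plot(history.history['acc'], label='train accuracy')
plt.plot(history.history['val_acc'], label='test accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()

# Loss over time.
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='test loss')
plt.xlabel('epoch')
plt.legend()
plt.show()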

As you can see, my model converges nicely and the accuracy is pretty good. To improve it, I could try the Adam optimizer instead of RMSprop, or make my model deeper by reducing my Conv layers and increasing the size of (or adding more) dense layers to keep more features.
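
Swapping in Adam, for example, is only a change at compile time (the learning rate here is just a typical starting value, not something I tuned):

# Same model, just compiled with the Adam optimizer instead of RMSprop.
opt = keras.optimizers.Adam(lr=0.0001)
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])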

What I learned from my project:

  • When I was looking through the dataset, I didn’t think my CNN would be able to differentiate between images that had the same sign shapes but different numbers on the signs to represent speed limits. Turns out the CNN is much more powerful than I thought!
  • I also learned how to iterate through files using the OS package. I previously used regular string manipulation to iterate through files but it turns out that the OS package makes the job a lot easier.

Thanks for reading my article! If I made a mistake somewhere or if I could improve in certain areas, feel free to give me some advice or criticism. I’m only a high school student so I have a lot of room for growth and error. If you have any questions or require clarification, I invite you to comment below.
