German Traffic Sign Recognition Benchmark

Hardik Vagadia
Published in Analytics Vidhya
12 min read · Jul 20, 2020

Using deep learning and computer vision to accurately detect and classify traffic sign images.

Table of contents :

  1. Business/ Real-world constraints.
  2. Exploratory data analysis.
  3. Setting up data input pipeline.
  4. Model definition and training.
  5. Predicting output on test data.
  6. Training Faster-RCNN.
  7. Combining Faster-RCNN with our custom model.
  8. Conclusion and future work.
  9. Useful references.

1- Business/ Real-world constraints :

1.1- Problem statement :

  • Traffic signs are an integral part of our road infrastructure. Without such useful signs, we would most likely face more accidents, as drivers would not receive critical feedback on how fast they can safely go, or be informed about road works, sharp turns, or school crossings ahead.
  • Naturally, autonomous vehicles must also abide by road legislation and therefore recognize and understand traffic signs.
  • The goal of traffic sign detection is to identify the region of interest (ROI) in which a traffic sign is supposed to be found and verify the sign after a large-scale search for candidates within an image.
  • It has a major role to play in self-driving cars, which are the future of the automobile industry.
what your car sees … (Tesla Autopilot Full Self-Driving)
  • The German Traffic Sign Benchmark is a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011.
  • The data-set can be downloaded from this Kaggle link.
  • All the files related to this case-study can be found at this GitHub repo.

1.2- Objectives and constraints :

  • Each traffic sign should be correctly identified. Hence, the multi-class log-loss should be minimized.
  • The Weighted F1-Score will be used to judge classification performance, and Mean Squared Error (MSE) will be used to judge bounding box detection performance.
  • Time is also a major constraint here, because a delay of even a second can be a matter of life and death.

2- Exploratory data analysis :

2.1- Folder structure :

  • The data-set contains Train and Test folders. The Train folder has 43 different sub-folders named 0 to 42, each containing the images of the respective class.
  • There are 39,209 train images and 12,630 test images. All images are 3-channel RGB images.
  • There are also Train.csv and Test.csv files, which contain information about the train and test images.
  • All the images are distributed over 43 different classes.
43 Classes of Traffic Signs
  • The Train.csv file contains the following information about each training image :
  1. Width : Width of image in number of pixels.
  2. Height : Height of image in number of pixels.
  3. Roi.X1 : Upper left X-coordinate of the bounding box.
  4. Roi.Y1 : Upper left Y-coordinate of the bounding box.
  5. Roi.X2 : Lower right X-coordinate of the bounding box.
  6. Roi.Y2 : Lower right Y-coordinate of the bounding box.
  7. ClassId : Class label of the image. It is an integer between 0 and 42.
  8. Path : Path where the image is present in the Train folder. Its format is : “/Train/ClassId/image_name.png”.
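For a quick sanity check, the annotations can be loaded with pandas. Below is a minimal sketch (assuming the Kaggle folder layout described above) :

import pandas as pd

# Load the annotation file shipped with the data-set.
train_df = pd.read_csv("Train.csv")

# Each row holds the image dimensions, the ROI bounding box and the class label.
print(train_df.columns.tolist())
# ['Width', 'Height', 'Roi.X1', 'Roi.Y1', 'Roi.X2', 'Roi.Y2', 'ClassId', 'Path']
print(train_df["ClassId"].nunique())  # 43 classes (0 to 42)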

2.2- Checking for class imbalance in training set :

  • The plot below shows the class distribution of the training images.
Class Distribution in training set

Observations :

  • The target classes are clearly not uniformly distributed.
  • This is quite logical, since signs like “Speed limit 30 km/h” or “Bumpy road ahead” appear more often than signs like “Road under construction ahead”.
  • Images are unevenly distributed among the classes. Some classes have around 2,500 images while others have as few as 250.
  • Too few images can undermine the training process. We can use data augmentation to overcome the lack of an adequate number of images in some classes; a sketch follows below. To learn more about data augmentation, refer to this link.
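As an illustration, below is one possible augmentation setup using Keras’ ImageDataGenerator (the transform ranges are assumptions, not the exact values used in this case study) :

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Mild geometric and photometric transforms. Traffic signs should NOT be
# flipped horizontally, since mirroring changes the meaning of some signs.
augmenter = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    brightness_range=(0.7, 1.3),
)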

2.3- Checking for class imbalance in test set :

Class Distribution in test set

Observations :

  • The observations for the test data are quite similar to those for the train data.
  • Here too, some classes are more frequent than others.

2.4- Analyzing image dimensions :

  • The histograms below show the height and width distributions of the train and test set images :
Height and width distributions of the training images show that the majority of images have dimensions between 30 and 50 pixels.
The pie-chart shows that the majority of images have dimensions less than 64 pixels.

Observations :

  • It is evident from the above figures that both our train and test set images follow similar height and width distributions.
  • Since the images come in many different dimensions, we need to fix a constant height and width for every image.
  • We need to do this in a way that minimizes information loss.
  • For smaller images, we will need to do appropriate padding.
  • For this case study, we will resize our images to a height and width of (32, 32), since the majority of images have their dimensions in that vicinity. We will discuss how to resize images in a later part of this post.
  • Also note that it is better to truncate an image slightly than to add more noise to it. Hence, keep the target dimensions as low as possible such that the majority of images have their original dimensions in the vicinity, but not too low either.
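As a sketch, fixing the dimensions can be done with tf.image (the helper name here is just for illustration) :

import tensorflow as tf

def fix_dimensions(image):
    # Stretch the image to a fixed 32x32. For images smaller than the
    # target, tf.image.resize_with_pad is an alternative that pads
    # instead of stretching.
    return tf.image.resize(image, size=(32, 32))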

2.5- Let’s also look at some images now :

  • Images in our datasets come in many different sizes, as can be seen in the picture below :
Images have different widths and heights. Some are fairly big while others are as small as a postage stamp.
  • Some of them are bright while others are dark.
Some images are shot in proper daylight while others appear to be shot in the dark and are difficult to recognize.
  • Some images are sharp while others appear blurry.

3- Setting up data input pipeline :

3.1- Importing libraries and setting random seeds :

  • Let’s begin by importing the essential libraries and setting the random seeds for Python, NumPy and TensorFlow.
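Below is a minimal sketch of this step (assuming TensorFlow 2.x; the seed value is arbitrary) :

import os
import random
import numpy as np
import pandas as pd
import tensorflow as tf

# Fixing the seeds makes runs reproducible across Python, NumPy and TensorFlow.
SEED = 42
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)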

3.2- Downloading data-set from Kaggle :

  • We download the data-set from Kaggle to local storage on Google Colab. You could also download it to your Google Drive, but fetching those images during training would be slow. Hence, I keep them in local storage on Colab.
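Below is a sketch of the download step on Colab (the dataset slug is the one used by the popular GTSRB upload on Kaggle and may differ) :

from google.colab import files

def download_dataset():
    # Prompts for the kaggle.json API token, then downloads and
    # unzips the data-set into local Colab storage.
    files.upload()  # upload kaggle.json when prompted
    !mkdir -p ~/.kaggle
    !cp kaggle.json ~/.kaggle/
    !chmod 600 ~/.kaggle/kaggle.json
    !kaggle datasets download -d meowmeowmeowmeowmeow/gtsrb-german-traffic-sign
    !unzip -q gtsrb-german-traffic-sign.zip -d /content/Data

download_dataset()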
  • Upon executing the above function, you will be prompted to upload the Kaggle API token, after which the data-set will be downloaded and unzipped.

3.3- Dividing Training data into Train and Validation sets :

  • We will use 25% of the images for cross-validation while training our model.
  • We also need to set some constants like the height and width of each image: every image must have the same height and width before being fed to the neural network.
  • As discussed in the Exploratory data analysis section, we will set the width and height to (32, 32).
  • We then take the paths and names of each image in the train and validation sets, and extract the label for each image from its path. A sketch of these steps follows below.
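Below is a combined sketch of these steps (the paths and helper names are assumptions) :

import glob
import random

# Fixed image dimensions decided during EDA.
IMG_WIDTH, IMG_HEIGHT = 32, 32

# Collect the path of every training image...
all_paths = glob.glob("/content/Data/Train/*/*.png")

# ...hold out 25% of them for cross-validation...
random.shuffle(all_paths)
split = int(0.75 * len(all_paths))
train_paths, val_paths = all_paths[:split], all_paths[split:]

# ...and read the label of each image off its path
# ("Train/ClassId/image_name.png").
train_labels = [int(p.split("/")[-2]) for p in train_paths]
val_labels = [int(p.split("/")[-2]) for p in val_paths]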

3.4- Updating image bounding box coordinates :

  • Since we are resizing each image to (32, 32), we need to update the coordinates as well.
  • Note that these coordinates are pixel values and not image ratios. Hence, updating them is quite easy: we simply rescale each coordinate by the ratio of the new dimension to the original one.
  • We then split the data-frame into training and validation data frames. The sketch below shows both steps :
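Below is a sketch of both steps (the column names follow Train.csv) :

import pandas as pd
from sklearn.model_selection import train_test_split

def update_coordinates(df, new_w=32, new_h=32):
    # Rescale the pixel bounding-box coordinates to the new 32x32 frame.
    df = df.copy()
    df["Roi.X1"] = df["Roi.X1"] * new_w / df["Width"]
    df["Roi.X2"] = df["Roi.X2"] * new_w / df["Width"]
    df["Roi.Y1"] = df["Roi.Y1"] * new_h / df["Height"]
    df["Roi.Y2"] = df["Roi.Y2"] * new_h / df["Height"]
    return df

df = update_coordinates(pd.read_csv("/content/Data/Train.csv"))

# Split the data-frame into training and validation frames.
train_df, val_df = train_test_split(df, test_size=0.25, random_state=42)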

3.5- Creating data generators :

  • Below is the function to create and return the data generators. It is very important to understand it properly step-by-step.
  1. First, we read the image, decode it and convert its datatype to ‘float32’.
  2. Then we adjust brightness and contrast. We will enhance the brightness of only the darkest ~5% of images. Below is an example of an image before and after enhancing brightness :
Images Before and after enhancing brightness.

3. In the next step, we will resize the image to (32, 32) and normalize it.

4. Note that our data-set returns image, its label and bounding box coordinates.

5. We are using the “tensorflow.data” API to build the data generators. It is a TensorFlow API for building complex input pipelines. To learn more about it, please refer to this link. A sketch of the full pipeline is given below.
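Below is a minimal sketch of the whole pipeline (the darkness threshold and brightness delta are assumptions standing in for “the ~5% darkest images”) :

import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE

def parse_image(path, label, bbox):
    # 1. Read, decode and cast to float32.
    image = tf.io.read_file(path)
    image = tf.image.decode_png(image, channels=3)
    image = tf.cast(image, tf.float32)
    # 2. Brighten only the darkest images.
    is_dark = tf.reduce_mean(image) / 255.0 < 0.05
    image = tf.cond(is_dark,
                    lambda: tf.image.adjust_brightness(image, delta=100.0),
                    lambda: image)
    # 3. Resize to (32, 32) and normalize to [0, 1].
    image = tf.image.resize(image, (32, 32)) / 255.0
    # 4. The data-set yields the image, its label and its bounding box.
    return image, {"classification": label, "regression": bbox}

def get_generator(paths, labels, bboxes, batch_size=64):
    ds = tf.data.Dataset.from_tensor_slices((paths, labels, bboxes))
    ds = ds.map(parse_image, num_parallel_calls=AUTOTUNE)
    return ds.shuffle(1024).batch(batch_size).prefetch(AUTOTUNE)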

4- Model definition and training :

4.1- Defining model :

  • The model that we are going to define is capable of both classification as well as bounding box regression. Below is a sketch of the model definition :
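The sketch below shows one way to build such a model (the exact layer sizes and the sharpening kernel are assumptions; the real model may differ) :

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

def get_model(width=32, height=32, channels=3, n_classes=43):
    inputs = layers.Input(shape=(height, width, channels))

    # "Sharpen" layer: a fixed, non-trainable 3x3 edge-strengthening
    # kernel applied to each channel separately.
    sharpen = layers.DepthwiseConv2D(kernel_size=3, padding="same",
                                     use_bias=False, trainable=False,
                                     name="sharpen")
    x = sharpen(inputs)
    kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32)
    sharpen.set_weights([kernel.reshape(3, 3, 1, 1).repeat(channels, axis=2)])

    # Convolution / max-pool blocks followed by fully connected layers.
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)

    # Two heads: class probabilities and bounding box coordinates.
    clf = layers.Dense(n_classes, activation="softmax", name="classification")(x)
    reg = layers.Dense(4, name="regression")(x)
    return Model(inputs=inputs, outputs=[clf, reg])

model = get_model()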
  • The “sharpen” layer in the code above sharpens the edges of the objects in images. The image below shows some examples of images before and after sharpening.
Images on the left side are original images while those on the right side are sharpened images. We can clearly see that it strengthens the edges of objects present in the image.
  • The rest of the model is a combination of convolutional, max-pool and fully connected layers.
  • “classification” layer returns appropriate class label while the “regression” layer returns the upper left and lower right x and y coordinates of the predicted bounding box.
  • Below image shows the model plot :
Model plot

4.2- Training :

  • Now that we have our model in place, let us define the loss functions. Note that there will be two different loss functions: for classification, we will use “SparseCategoricalCrossentropy”, and for bounding box regression, we will use an “R-Squared” based loss.
  • As discussed above, we will judge the classification performance of our model by the weighted F1-Score. We will define a custom callback to print the F1-Score at the end of each epoch. Below is the code for the losses, the callback and the training :
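Below is a sketch of the losses, the callback and the training call (the epoch count and optimizer are assumptions) :

import numpy as np
import tensorflow as tf
from sklearn.metrics import f1_score

def r_squared_loss(y_true, y_pred):
    # 1 - R^2 equals SS_res / SS_tot, so minimizing this maximizes R^2.
    ss_res = tf.reduce_sum(tf.square(y_true - y_pred))
    ss_tot = tf.reduce_sum(tf.square(y_true - tf.reduce_mean(y_true)))
    return ss_res / (ss_tot + tf.keras.backend.epsilon())

class F1Callback(tf.keras.callbacks.Callback):
    # Prints the weighted F1-Score on the validation set after every epoch.
    def __init__(self, val_ds):
        super().__init__()
        self.val_ds = val_ds

    def on_epoch_end(self, epoch, logs=None):
        y_true, y_pred = [], []
        for images, targets in self.val_ds:
            probs, _ = self.model.predict(images, verbose=0)
            y_pred.extend(np.argmax(probs, axis=1))
            y_true.extend(targets["classification"].numpy())
        print(" - val_f1: %.4f" % f1_score(y_true, y_pred, average="weighted"))

model.compile(optimizer="adam",
              loss={"classification": tf.keras.losses.SparseCategoricalCrossentropy(),
                    "regression": r_squared_loss})
model.fit(train_ds, validation_data=val_ds, epochs=20,
          callbacks=[F1Callback(val_ds), tf.keras.callbacks.TensorBoard("logs")])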
  • The image below shows the TensorBoard plots. To learn more about TensorBoard, refer to this link.
The plots show that there is no over-fitting, since there is no major difference between the training and validation losses.

5- Predicting output on test data :

  • Now let us evaluate our model on unseen test images. The Test folder in our data-set contains 12,630 test images.
  • The “Test.csv” file contains the paths of the test images as well as their actual labels and bounding box coordinates. Note that we have to update the coordinates here as well, like we did for the training and validation images, since we are resizing each image to (32, 32). I am not going to repeat the code since I have already shown it above.
  • Below is the function to evaluate the test-set images. It returns the predicted label and bounding box for each test-set image.
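A sketch of the evaluation helper (assuming test_df was built from Test.csv with updated coordinates) :

import numpy as np
import tensorflow as tf
from sklearn.metrics import f1_score

def evaluate_test_set(test_df, model):
    # Load and preprocess every test image exactly like the training images.
    images = []
    for path in test_df["Path"]:
        img = tf.io.read_file("/content/Data/" + path)
        img = tf.image.decode_png(img, channels=3)
        img = tf.image.resize(tf.cast(img, tf.float32), (32, 32)) / 255.0
        images.append(img)
    probs, boxes = model.predict(tf.stack(images), batch_size=128)
    return np.argmax(probs, axis=1), boxes

pred_labels, pred_boxes = evaluate_test_set(test_df, model)
print("Weighted F1:", f1_score(test_df["ClassId"], pred_labels, average="weighted"))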
  • Now let us look at test-set performance of our model :
Test performance of our model
  • The images below plot the predicted and actual bounding boxes :
Actual bounding boxes are red while the green ones are predicted. We can clearly see that they almost overlap each other.
  • Now let us also look at some of the images that were misclassified by our model :
Misclassified images

Observation :

  • Less than 2.5% of the images are misclassified.
  • Most of the misclassified images are very small and badly out of focus.
  • Some of them are not even in frame.
  • Many of them cannot be recognized even by humans.

6- Training Faster-RCNN :

  • Let us try an advanced model for the object detection task. With our previous model, we got an MSE of 1.27, which is quite good, and the predicted bounding boxes were almost perfect. But there are some state-of-the-art models that excel at the object detection problem. Let us use one such model for the bounding box problem.
  • Faster-RCNN is the most widely used state-of-the-art member of the RCNN family. In this blog, I am not going into much detail about the Faster-RCNN architecture. To learn more about it, follow this link.
  • Let us now train Faster-RCNN on our data-set. First download the data-set, create the validation directory and set up the input pipeline like we did before. There is no need to resize images in this case, so the coordinates need not be updated. Then we need to install some dependencies.
!apt-get install protobuf-compiler python-pil python-lxml python-tk
!pip install Cython
  • First, create a folder named “Desktop”. This will be our project directory. Clone this GitHub repository into it.
!mkdir "/content/Desktop"
%cd "/content/Desktop"
!git clone https://github.com/tensorflow/models.git
  • Then create two folders inside the object_detection folder as below. The “images” folder will hold all the data and the “training” folder will hold the files necessary for training, which we will see next.
!mkdir "/content/Desktop/models/research/object_detection/images"
!mkdir "/content/Desktop/models/research/object_detection/training"
  • Move all the data folders into the “object_detection/images” folder :
!mv "/content/Data/Train" "/content/Desktop/models/research/object_detection/images/"!mv "/content/Data/Validation" "/content/Desktop/models/research/object_detection/images/"!mv "/content/Data/Test" "/content/Desktop/models/research/object_detection/images/"
  • Now change the working directory to the research folder and run the below commands :
%cd "/content/Desktop/models/research"
!protoc object_detection/protos/*.proto --python_out=.
# Setting environment variable
import os
os.environ['PYTHONPATH'] += ':/content/Desktop/models/research/:/content/Desktop/models/research/slim'
  • Now run the “setup.py” file with the following arguments :
!python setup.py build
!python setup.py install
  • Now install “tf_slim”, change directory to object_detection/builders and run the model_builder_test.py file :
!pip install tf_slim
%cd /content/Desktop/models/research/object_detection/builders/
!python model_builder_test.py
  • Now create a file named “generate_tfrecord.py” in the object_detection folder and copy the following code into it :
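Below is a condensed sketch of generate_tfrecord.py, adapted from the widely used TensorFlow object-detection example script (the CSV column names filename/xmin/ymin/xmax/ymax are assumptions about how the label CSVs were written) :

# generate_tfrecord.py
import io
import os
import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('img_path', '', 'Path to the image directory')
flags.DEFINE_string('output_path', '', 'Path to the output TFRecord')
flags.DEFINE_string('label', '', 'Name of the single object class')
FLAGS = flags.FLAGS

def create_tf_example(row, img_path):
    # Encode one image and its bounding box as a tf.train.Example.
    with tf.gfile.GFile(os.path.join(img_path, row['filename']), 'rb') as fid:
        encoded = fid.read()
    width, height = Image.open(io.BytesIO(encoded)).size
    feature = {
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(row['filename'].encode('utf8')),
        'image/source_id': dataset_util.bytes_feature(row['filename'].encode('utf8')),
        'image/encoded': dataset_util.bytes_feature(encoded),
        'image/format': dataset_util.bytes_feature(b'png'),
        # Box coordinates are stored normalized to [0, 1].
        'image/object/bbox/xmin': dataset_util.float_list_feature([row['xmin'] / width]),
        'image/object/bbox/xmax': dataset_util.float_list_feature([row['xmax'] / width]),
        'image/object/bbox/ymin': dataset_util.float_list_feature([row['ymin'] / height]),
        'image/object/bbox/ymax': dataset_util.float_list_feature([row['ymax'] / height]),
        'image/object/class/text': dataset_util.bytes_list_feature([FLAGS.label.encode('utf8')]),
        'image/object/class/label': dataset_util.int64_list_feature([1]),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    for _, row in pd.read_csv(FLAGS.csv_input).iterrows():
        writer.write(create_tf_example(row, FLAGS.img_path).SerializeToString())
    writer.close()

if __name__ == '__main__':
    tf.app.run()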
  • Create a label map file named “label_map.pbtxt” and copy the label mapping below into it :
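Since detection treats every traffic sign as a single “GTSRB” object (classification is done separately by our own model), the label map has just one entry :

item {
  id: 1
  name: 'GTSRB'
}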
  • Move the model config file into the object_detection/training folder :
!cp -r /content/Desktop/models/research/object_detection/samples/configs/faster_rcnn_resnet101.config /content/Desktop/models/research/object_detection/training
  • In the config file, we need to edit some sections. For now, simply copy the code below into it. The config file is quite straightforward to understand, and you can easily tweak any parameter you want.
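Below is an excerpt of the fields that matter (the remaining fields can keep their defaults; the paths assume the folder layout above) :

model {
  faster_rcnn {
    num_classes: 1
    ...
  }
}
train_config {
  fine_tune_checkpoint: "faster_rcnn_resnet101_coco_2018_01_28/model.ckpt"
  num_steps: 3000
  ...
}
train_input_reader {
  tf_record_input_reader {
    input_path: "training/train.record"
  }
  label_map_path: "training/label_map.pbtxt"
}
eval_input_reader {
  tf_record_input_reader {
    input_path: "training/validation.record"
  }
  label_map_path: "training/label_map.pbtxt"
}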
  • Now we have to create the train, validation and test records. Change directory to the object_detection folder and run the below commands :
%cd "/content/Desktop/models/research/object_detection"!python generate_tfrecord.py --label='GTSRB' --csv_input=data/train_labels.csv --img_path=images/Train  --output_path=training/train.record!python generate_tfrecord.py --label='GTSRB' --csv_input=data/validation_labels.csv --img_path=images/Validation  --output_path=training/validation.record!python generate_tfrecord.py --label='GTSRB' --csv_input=data/test_labels.csv --img_path=images/Test  --output_path=training/test.record
  • Download the Faster-RCNN model from tensorflow.org and extract the tar archive :
!wget http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz
!tar -xvf faster_rcnn_resnet101_coco_2018_01_28.tar.gz
  • Move “train.py” and “eval.py” files into the object_detection folder :
!mv "/content/Desktop/models/research/object_detection/legacy/train.py" "/content/Desktop/models/research/object_detection/"!mv "/content/Desktop/models/research/object_detection/legacy/eval.py" "/content/Desktop/models/research/object_detection/"
  • Now, it’s time to train our model. I am going to train it for 3000 steps.
!python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_resnet101.config

Note : If anything regarding the training of Faster-RCNN is unclear, I have uploaded an IPython notebook to this GitHub repository with fully functional, step-by-step code that you can use for training and inference.

7- Combining Faster-RCNN with our custom model :

  • Now, as a final solution, we are going to combine both our models. We will use our custom-built model for classification but discard its bounding box outputs, and use Faster-RCNN for the bounding box predictions. The diagram below depicts our final solution architecture :
Final Solution Diagram
  • Run the below code first to import the inference graph and set some paths and variables :
# This is needed since the notebook is stored in the object_detection folder.
import os
import sys
import numpy as np
import tensorflow as tf
sys.path.append("..")
# Importing essential libraries
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

### Model preparation variables
MODEL_NAME = 'trained_inference_graph'
PATH_TO_FROZEN_GRAPH = '/content/drive/My Drive/CaseStudy2/frozen_inference_graph.pb'
PATH_TO_LABELS = 'training/label_map.pbtxt'
NUM_CLASSES = 1  # remember the number of objects you are training? cool.

### Load a (frozen) Tensorflow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

### Loading label map
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

### Load image into numpy array function
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

### Stating the path to the images to be tested
PATH_TO_TEST_IMAGES_DIR = 'test_images/'
# TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.png'.format(i)) for i in range(1, 10)]
TEST_IMAGE_PATHS = test_df.iloc[:, 0]
IMAGE_SIZE = (256, 256)
  • Load classification model :
# Loading classification model
IMG_WIDTH, IMG_HEIGHT = 32, 32
N_CHANNELS = 3
N_CLASSES = 43
object_detection_model = u.get_model(IMG_WIDTH, IMG_HEIGHT, N_CHANNELS, N_CLASSES)
object_detection_model_path = "/content/drive/My Drive/CaseStudy2/BestScoreTillNow.h5"
object_detection_model.load_weights(object_detection_model_path)
object_detection_model.compile()
  • The function below takes image paths as input and returns the predicted class label from our custom model and the bounding box from the Faster-RCNN model :
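Below is a sketch of the combined pipeline (TF1-style session API, matching the graph-loading code above; the tensor names are the standard ones exported by the object-detection API) :

from PIL import Image

def predict(image_path):
    image_np = load_image_into_numpy_array(Image.open(image_path))

    # Bounding box from Faster-RCNN: take the highest-scoring detection.
    with detection_graph.as_default():
        with tf.Session() as sess:
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            boxes, scores = sess.run([boxes, scores],
                                     feed_dict={image_tensor: np.expand_dims(image_np, 0)})
    best_box = boxes[0][0]  # (ymin, xmin, ymax, xmax), normalized

    # Class label from our custom classification model.
    img = tf.image.resize(image_np.astype(np.float32), (32, 32)) / 255.0
    probs, _ = object_detection_model.predict(tf.expand_dims(img, 0))
    return np.argmax(probs, axis=1)[0], best_box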
  • The test Mean Squared Error obtained for bounding box detection is much better than with our previous model. Below are our final scores :
Final F1-Score and MSE

8- Conclusion and future work :

  • Faster-RCNN performs better at object detection. However, it is not as good for classification, which is why we combine our custom model with Faster-RCNN.
  • Use data augmentation for better generalization.
  • Try different models like RetinaNet and YOLOv3 as part of future work.
  • Try different image pre-processing techniques for the misclassified images.
  • We can also use models like VGG-16 to extract feature maps and pass them to our model.
  • And the list goes on…

♦ If you like my work then please clap on this post.

♦ Connect with me on LinkedIn.

♦ My Github profile.
