Traffic Sign Classifier

Udacity Self-Driving Car Nanodegree, Project 2



This project is all about deep learning with conv nets: yes, lots of data, computing on GPUs (CPUs are too slow), endless training, and finding the sweet spot between generalization and overfitting.

Data Set

The dataset for this project comes from the German Traffic Sign Recognition Benchmark (GTSRB). It consists of traffic signs cropped from real images taken under varying conditions: lighting, perspective, weather, etc.

The project is implemented in a Jupyter notebook provided by Udacity with some starter code, and the dataset is available in pickle format, pre-split into train, validation, and test sets.

The top-level steps of the traffic sign classifier project are:

  1. Load and explore the dataset
  2. Preprocess the data
  3. Build and train the network
  4. Test the network on the test dataset
  5. Test on new real-world images

Explore Dataset

The dataset consists of 43 classes of traffic signs with a varied number of images per class.

No of images vs classes

As seen above, the number of samples is not uniform across classes. The journal article by Pierre Sermanet and Yann LeCun suggests a few methods to augment the data:

  1. Jitter the data with image transformations (scale, warp, brightness, translate)
  2. Flip the images horizontally, vertically, or both, and update labels accordingly
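These two augmentations can be sketched with plain NumPy (the function names and the 2-pixel shift range here are my own illustration, not the article's actual code):

```python
import numpy as np

def jitter(img, max_shift=2, rng=None):
    """Randomly translate an HxWxC image by up to max_shift pixels,
    a simplified stand-in for the scale/warp/brightness jitter."""
    if rng is None:
        rng = np.random.default_rng()
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

def flip_horizontal(img):
    """Mirror a sign image left-right. Only valid for classes that are
    symmetric, or that map onto another class (with a label update)."""
    return img[:, ::-1, :]
```

Note that flipping is class-dependent: a "keep right" sign flipped horizontally becomes a "keep left" sign, so the label must be remapped, while an asymmetric sign cannot be flipped at all.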

Preprocess Data

The input data is in 32x32x3 (RGB) format. Some of the images are washed out with high brightness and low contrast, so the images are converted to grayscale and equalized with local histograms. The inputs are then normalized to the -1 to +1 range, since the activations would saturate with the raw 0–255 range.
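The pipeline can be sketched as follows (a simplified version: global histogram equalization in plain NumPy rather than the local equalization used in the project, and assumed luminance weights):

```python
import numpy as np

def preprocess(img):
    """Grayscale -> histogram equalization -> scale to [-1, 1].
    img: HxWx3 uint8 RGB array. Global equalization is used here as a
    simplification of the local (adaptive) equalization described above."""
    gray = img @ np.array([0.299, 0.587, 0.114])   # RGB -> luminance
    hist, bins = np.histogram(gray.ravel(), bins=256, range=(0, 255))
    cdf = hist.cumsum() / hist.sum()               # cumulative distribution in [0, 1]
    eq = np.interp(gray.ravel(), bins[:-1], cdf).reshape(gray.shape)
    return (eq * 2.0 - 1.0)[..., np.newaxis]       # HxWx1 in [-1, 1]
```

A library such as scikit-image provides proper local (CLAHE) equalization if needed.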

Image Preprocessing

Another option is to use a 1x1 convolution on the input layer, so that the network can choose its own color space.
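The trick is easy to see in NumPy: a 1x1 convolution with 3 input and 3 output channels is just a learned 3x3 matrix applied independently to every pixel's RGB vector (the weights below are random placeholders, not trained values):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))   # dummy input image
W = rng.random((3, 3))          # 1x1 conv kernel, shape (in_channels, out_channels)
out = img @ W                   # same result as a conv layer with a 1x1 kernel
```

During training the network tunes W, effectively learning the color transform that best separates the classes.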

Network Architecture

A convolutional neural network similar to LeNet is implemented, with a 32x32x3 input and a 1x1x3 convolution at the input layer to convert RGB into a network-tunable color space.

Network architecture

The network consists of three conv layers followed by two fully connected layers, with the output of each conv layer flattened and fed into the fully connected layers.
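The spatial sizes through such a stack follow standard convolution arithmetic. A quick sanity check (the 5x5 kernel sizes below are an assumption in LeNet style, since the article does not list the exact filter shapes):

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution ('valid' padding when pad=0)."""
    return (size + 2 * pad - kernel) // stride + 1

# Hypothetical LeNet-style trace of the spatial dimension:
s = 32                  # 32x32 input
s = conv_out(s, 5)      # 5x5 conv          -> 28
s = conv_out(s, 2, 2)   # 2x2 max pool /2   -> 14
s = conv_out(s, 5)      # 5x5 conv          -> 10
s = conv_out(s, 2, 2)   # 2x2 max pool /2   -> 5
```

The flattened outputs of each stage are what feed the fully connected layers.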

The network is trained with the following regularizations:

  1. 2x2 max pooling with 2x2 stride
  2. Dropout percentage that increases as the layers deepen, dropping higher-level features more aggressively
  3. Additionally, L2 regularization can be used to avoid overfitting
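The pooling operation in item 1 is simple enough to sketch directly in NumPy (an illustration of the operation, not the project's framework code):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 over an HxWxC feature map
    (H and W are assumed even)."""
    H, W, C = x.shape
    return x.reshape(H // 2, 2, W // 2, 2, C).max(axis=(1, 3))
```

Each non-overlapping 2x2 window is collapsed to its maximum, halving the spatial resolution while keeping the strongest activations.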

Training and Test Results

The network took 15 minutes to run 60 epochs on an Nvidia 1050 Ti GPU and achieved a validation accuracy of 95.85%.


The network also achieved 95.85% on the test data, with the per-class error counts shown below.
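A tally of this kind can be produced with a one-liner over the predictions (the function name and toy inputs here are my own illustration):

```python
import numpy as np

def errors_per_class(y_true, y_pred, n_classes=43):
    """Count misclassified samples per true class, the kind of
    error-vs-class tally plotted above."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    wrong = y_true[y_true != y_pred]           # true labels of the misses
    return np.bincount(wrong, minlength=n_classes)
```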

No of errors vs classes

In order to improve the test accuracy to the levels published by Sermanet and LeCun, a jittered dataset can be generated for augmentation.

Test on New Images

So how does this network perform on new images? I chose 7 inputs by cropping traffic signs out of real images downloaded from the Internet.

New Test Images with labels

Test images results

Some observations:

  1. The test results are all accurate; however, 100% confidence suggests a bit of overfitting, and L2 regularization is needed
  2. When the test images are jittered, the accuracy dropped to 80%, with the failing images still reported at 100% confidence
  3. The network needs to be trained with an augmented dataset to improve accuracy under input variations
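The confidence figures above come from the softmax outputs. A sketch of how top-k class probabilities are extracted from a logit vector (function name and toy logits are my own illustration):

```python
import numpy as np

def top_k_softmax(logits, k=5):
    """Convert a logit vector to softmax probabilities and return the
    top-k (class_index, probability) pairs, the numbers used to judge
    the network's confidence on new images."""
    z = logits - logits.max()        # subtract max for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    idx = np.argsort(p)[::-1][:k]    # indices of the k largest probabilities
    return [(int(i), float(p[i])) for i in idx]
```

A well-regularized network would show probability mass spread over plausible classes instead of a hard 100% on one class.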

My network implementation can be found on GitHub.

Like what you read? Give Lokesh Korapati a round of applause.
