Traffic Sign Classifier
Udacity self driving car Nanodegree Project 2
This project is all about deep learning with convolutional networks: lots of data, computing on GPUs (CPUs are too slow), endless training, and finding the optimal balance between generalization and overfitting.
The dataset for this project comes from the German Traffic Sign Recognition Benchmark (GTSRB). It consists of traffic signs cropped from real images under varying conditions of lighting, perspective, weather, etc.
The project is implemented in a Jupyter notebook provided by Udacity with some starter code; the dataset is available in pickle format, split into training, validation, and test sets.
The top-level steps of the traffic sign classifier project are:
- Load and Explore Dataset
- Preprocess Data
- Build and train network
- Test Network on test dataset
- Test on new real-world images
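The first step, loading the dataset, can be sketched as below. This is a minimal sketch assuming the Udacity pickle layout (a dict with `features` and `labels` keys); the actual file names in the project may differ.

```python
import pickle

def load_split(path):
    """Load one pickled dataset split and return (features, labels).

    Assumes the Udacity pickle layout: a dict holding a 'features'
    array of shape (N, 32, 32, 3) and a 'labels' array of shape (N,).
    """
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data["features"], data["labels"]

# Hypothetical file names for the three splits:
# X_train, y_train = load_split("train.p")
# X_valid, y_valid = load_split("valid.p")
# X_test,  y_test  = load_split("test.p")
```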
The dataset consists of 43 classes of traffic signs, with a varying number of images per class.
As noticed above, the number of samples is not uniform across classes. A few methods to augment the data are suggested in the journal article by Pierre Sermanet and Yann LeCun:
- Jitter the data with image transformations (scale, warp, brightness, translation)
- Flip the images horizontally, vertically, or both, and update labels accordingly
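These two augmentation ideas can be sketched in plain numpy. The jitter here only translates and scales brightness (the Sermanet/LeCun paper also uses scaling and rotation), and the left/right class mapping below is a made-up example, not the real GTSRB class ids:

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(img, max_shift=2):
    """Randomly translate the image and scale its brightness (a sketch)."""
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.roll(np.roll(img, dy, axis=0), dx, axis=1)   # translate
    gain = rng.uniform(0.8, 1.2)                          # brightness jitter
    return np.clip(out.astype(np.float32) * gain, 0, 255).astype(np.uint8)

# Some sign classes map onto a different class when mirrored,
# e.g. "turn left" <-> "turn right". Example mapping -- an assumption:
FLIP_LR_MAP = {19: 20, 20: 19}

def flip_horizontal(img, label):
    """Mirror an image left/right and update its label if needed."""
    if label in FLIP_LR_MAP:
        return img[:, ::-1], FLIP_LR_MAP[label]
    return img, label
```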
The input data is in 32x32x3 (RGB) format. Some of the images are washed out with high brightness and low contrast, so the images are converted to grayscale and equalized with local histogram equalization. The input images are then normalized to the -1 to +1 range, as the activations would saturate with the raw 0–255 range.
Another option is to use a 1x1 convolution on the input layer, so that the network can choose the color space.
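The preprocessing pipeline can be sketched in plain numpy. Note this is an approximation: the write-up uses *local* histogram equalization (e.g. `skimage.exposure.equalize_adapthist`), whereas the stand-in below is a simple global equalization:

```python
import numpy as np

def preprocess(img):
    """Grayscale -> histogram equalize -> scale to [-1, 1] (a sketch)."""
    # RGB -> luma using standard weights
    gray = img.astype(np.float32) @ np.array([0.299, 0.587, 0.114])
    # Global histogram equalization (the article uses local histograms)
    hist, bins = np.histogram(gray.ravel(), bins=256, range=(0, 255))
    cdf = hist.cumsum().astype(np.float32)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())   # map to [0, 1]
    eq = np.interp(gray.ravel(), bins[:-1], cdf).reshape(gray.shape)
    return eq * 2.0 - 1.0                               # [0, 1] -> [-1, 1]
```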
A convolutional neural network is implemented with an architecture similar to LeNet, taking a 32x32x3 input; a 1x1x3 convolution is applied at the input layer to convert RGB into a network-tunable color space.
The network consists of three convolutional layers followed by two fully connected layers, with the output of each convolutional layer flattened and fed into the first fully connected layer.
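Since the write-up doesn't list exact kernel sizes or depths, here is a hypothetical layer plan (5x5 and 3x3 'valid' convolutions with 2x2 max pooling) showing how the spatial sizes and the flattened multi-scale feature vector work out:

```python
def conv_out(size, kernel, stride=1):
    """Output size for a 'valid' convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, window=2, stride=2):
    """Output size for max pooling."""
    return (size - window) // stride + 1

# Hypothetical kernel sizes and depths -- assumptions, not the article's exact plan
size, shapes = 32, []
for kernel, depth in [(5, 16), (5, 32), (3, 64)]:
    size = pool_out(conv_out(size, kernel))   # conv then 2x2 pool
    shapes.append((size, size, depth))

# Each conv output is flattened and concatenated for the fully connected layers
flat = sum(h * w * d for h, w, d in shapes)
print(shapes, flat)
```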
The network is trained with the following regularizations:
- 2x2 max pooling with 2x2 stride
- Increasing dropout percentage as the layers deepen, so that higher-level features are dropped more aggressively
- Additionally, L2 regularization can be used to avoid overfitting
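The increasing-dropout idea can be sketched with inverted dropout in numpy; the keep probabilities below are illustrative, not the values used in the actual training:

```python
import numpy as np

rng = np.random.default_rng(42)

def dropout(x, keep_prob):
    """Inverted dropout: zero units with prob (1 - keep_prob) and
    rescale survivors by 1/keep_prob so the expected activation is unchanged."""
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

# Hypothetical schedule: keep fewer activations as the layers deepen
keep_probs = [0.9, 0.75, 0.5]  # conv1, conv2, conv3 -- an assumption
```

At test time dropout is disabled; with inverted dropout no extra rescaling is needed, which is why the survivors are divided by `keep_prob` during training.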
Training and Testing
The network took 15 minutes to run 60 epochs on an Nvidia 1050 Ti GPU and achieved a validation accuracy of 95.85%.
The network also achieved 95.85% on the test data, with the per-class error counts shown below.
In order to improve the test accuracy to the levels published by Sermanet and LeCun, a jittered dataset can be generated.
Test on New Images
So how does this network perform on new images? I chose 7 inputs, cropping the traffic signs from real images downloaded from the Internet.
Test images results
- The test results are all accurate; however, 100% confidence suggests a bit of overfitting and the need for L2 regularization
- When the test images are jittered, the accuracy drops to 80%, with the failing images still classified at 100% confidence
- The network needs to be trained with an augmented dataset to improve accuracy under input variations
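The confidences above come from the softmax over the network's logits. A minimal sketch (with made-up logit values) shows how to read top-k predictions, and why large logit gaps saturate the softmax toward 100%:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def top_k(logits, k=5):
    """Return the k most probable (class_id, probability) pairs."""
    p = softmax(np.asarray(logits, dtype=np.float64))
    idx = np.argsort(p)[::-1][:k]
    return list(zip(idx.tolist(), p[idx].tolist()))

# A large gap between the top logit and the rest drives the top
# probability toward 1.0 -- which is why uniform 100% confidence
# on every image hints at overfitting.
```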
My network implementation can be found on GitHub.