Behavioral Cloning (Udacity Self-Driving Car Project): The Generator Bottleneck Problem When Using a GPU

Bahadır YILMAZ
Deep Learning Turkey
8 min read · Mar 11, 2018


This is a Udacity Self-Driving Car Nanodegree project, and all sources belong to Udacity. For this test I used my 15" MacBook Pro with a 2.9 GHz CPU and a Sonnet Breakaway Box with a GeForce 1060 6 GB GPU.

1. What to Expect from This Post

  • How to use simulator data to train a car to drive autonomously, as in the real world.
  • How to train a CNN model with and without a generator.
  • How important data preprocessing is before training.
  • How to use data augmentation.
  • How a generator affects the training phase when using a GPU (the bottleneck problem).

2. Sources:

Project starter code link : https://github.com/udacity/CarND-Behavioral-Cloning-P3

To set up the environment for the project, please download the CarND Term1 Starter Kit from this link. (I didn't use this kit, as explained below.)

I use Conda (https://www.continuum.io/downloads) with Keras 2.1.4 and TensorFlow-GPU 1.5.0. I uploaded my 'environment.yaml', together with the simulator and data, to the link below. If you want to use my environment, just run this command after downloading:

conda env create -f environment.yaml 

Link for the environment.yaml, simulator and data: https://drive.google.com/drive/folders/1_a8tmhlsrWfp_vyi6uN_js6RqupJ8zFl?usp=sharing

3. Project Codes:

Before coding, I want to explain the project a bit. The data folder includes the images and a driving_log.csv file containing the paths of the images. Each frame has 3 images, coming from the left, centre and right cameras, so each row of driving_log.csv has 'center, left, right, steering, throttle, brake, speed' columns. In this project we use the camera images (center, left, right) as input and the steering angle as the target.

First, import the packages needed for the project.
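A plausible import list is sketched below; the exact set depends on the final code, and the commented cv2/Keras names are assumptions based on the project description.

```python
# Standard tooling for this kind of project; cv2 and Keras imports
# (commented out) are assumptions, not confirmed by the post.
import csv          # parse driving_log.csv
import random       # shuffling and random augmentation choices
import numpy as np  # image arrays and histograms
# import cv2                           # image loading (opencv-python)
# from keras.models import Sequential  # model definition
```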

3.1. Load Images

I use two functions to load the images. The first one reads the csv file line by line and returns 'lines[1:]', because the first line is the header. The second one loads the images using the paths in those lines. The steering angle in the log belongs to the centre image, so for the right image we subtract a small value (-0.2) and for the left image we add a bit (+0.2).

As mentioned above, the data folder contains the 'driving_log.csv' file and the 'IMG' folder.

First, I load the paths (not the images) and then split them into two parts: one for training and the other for validation. You may ask, why not use test data? We don't need test images because this is a regression problem, not classification; the real test is running the model in the simulator.
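The split can be sketched as below; the starter code typically uses sklearn's train_test_split, so this dependency-free equivalent (with an assumed 80/20 ratio) is only an illustration:

```python
import random

def split_samples(lines, valid_fraction=0.2, seed=42):
    """Shuffle the CSV rows and split them into train/validation lists.
    valid_fraction and seed are assumptions, not values from the post."""
    lines = list(lines)
    random.Random(seed).shuffle(lines)
    n_valid = int(len(lines) * valid_fraction)
    return lines[n_valid:], lines[:n_valid]
```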

Now load the images from their paths.

In total (loading all three camera images of each frame), there are 19284 training samples and 4820 validation samples.

Plotting the steering angles of these images gives some intuition about the data.
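One quick way to inspect the steering-angle distribution, without a plotting library, is a numpy histogram; the bin count here is arbitrary and the toy angles only illustrate the imbalance described below:

```python
import numpy as np

def steering_histogram(angles, n_bins=21):
    """Bin steering angles in [-1, 1] to see how unbalanced they are."""
    counts, edges = np.histogram(angles, bins=n_bins, range=(-1.0, 1.0))
    return counts, edges

# Toy illustration: mostly-straight driving dominates the distribution.
angles = [0.0] * 50 + [0.2] * 20 + [-0.2] * 20 + [0.5, -0.5]
counts, edges = steering_histogram(angles)
```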

As you can see, the data does not follow a normal (Gaussian) distribution. If we feed it to the network as-is, the -0.2, 0 and +0.2 angles will outweigh the other steering angles.

I think we have two options. The first is to cut the overweighted angles when loading the data so that all angles follow a roughly normal (Gaussian) distribution. It seems a good option, but from a machine learning point of view it is not, because we lose some of the data, and feeding the network very little data may make it overfit.

The second option is to generate new, augmented data. It is the opposite of the first option: instead of cutting the overweighted steering angles, we augment the data so that the other steering angles get more samples and the distribution becomes nearly Gaussian. So I chose this option.

3.2. Data Augmentation

To augment the data, I have several options:

- Load the left and right images (as done above),

- Flip some images,

- Brighten some images.

I also have more options, like scaling, zooming or cropping the images. However, these change the shape of the image, which I don't need in this regression project. Although Keras has functions for the augmentation process (https://keras.io/preprocessing/image/), I created my own data augmentation function.
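The flip and brighten operations can be sketched as below; the 0.5-1.5 brightness range is an assumption, as the post does not state the exact values used:

```python
import numpy as np

def flip(image, angle):
    """Mirror the image left-right and negate the steering angle."""
    return np.fliplr(image), -angle

def brighten(image, factor=None):
    """Scale pixel intensities by a random factor (assumes uint8 RGB).
    The 0.5-1.5 range is an assumption, not taken from the post."""
    if factor is None:
        factor = np.random.uniform(0.5, 1.5)
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)
```

Flipping is the important one for balance: every right turn becomes a matching left turn, which evens out the angle histogram.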

At this point we examine the problem in two different ways, one using a generator and the other not. Generators allow low memory usage. For more discussion of their usage, check this link: https://stackoverflow.com/questions/1756096/understanding-generators-in-python

3.2.1. Data Augmentation with Generator

Randomly take the center, right or left image, and then randomly brighten or flip it.

As this is a generator function, we create a generator to use it.
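A sketch of such a generator is below. For brevity it assumes each sample is a tuple of already-loaded (center, left, right, steering) values; the real code would cv2.imread from the CSV paths inside the loop, and batch size and augmentation probabilities are assumptions:

```python
import random
import numpy as np

def augmenting_generator(samples, batch_size=32):
    """Yield (images, angles) batches forever. Each sample is assumed to
    be a (center, left, right, steering) tuple of loaded images."""
    correction = 0.2
    while True:
        random.shuffle(samples)
        for offset in range(0, len(samples), batch_size):
            images, angles = [], []
            for center, left, right, steering in samples[offset:offset + batch_size]:
                # Randomly pick a camera and apply the matching correction.
                image, angle = random.choice([
                    (center, steering),
                    (left, steering + correction),
                    (right, steering - correction),
                ])
                # Randomly flip or brighten (50/50 is an assumption).
                if random.random() < 0.5:
                    image, angle = np.fliplr(image), -angle
                else:
                    image = np.clip(image * np.random.uniform(0.5, 1.5), 0, 255)
                images.append(image)
                angles.append(angle)
            yield np.array(images), np.array(angles)
```

Because it uses `yield`, only one batch of images lives in memory at a time, which is the whole point of the generator approach.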

Generate some samples to plot. I set the batch size to 1024, so the generator produces 1024 images each for training and validation.

Now, let's plot the angle histogram again.

It is not perfect, but more usable than the raw version.

3.2.2. Data Augmentation without Generator

The data augmentation part is nearly the same as with the generator. With a generator, the sample images are produced during the training phase (explained below); without one, we have to prepare them here.

I augmented nearly 14,100 images in 115 seconds. Of course, this uses much more memory.

Let's plot the angle histogram again.

It looks Gaussian. That's nice, because it makes the car drive straighter. We could improve it further, but I think it is enough for training.

3.3. Model Selection

I used a classical CNN architecture that includes:

  • First, cropping the unneeded parts of the image (the top of every image is sky, mountains and forest, and the bottom is the hood of the car),
  • Resizing the image to a smaller one, LeNet-style,
  • Normalizing the data,
  • 5 convolution layers with batch normalization and ELU activation,
  • 1 flatten layer and 3 dense layers with ELU activation, using dropout so the network does not overfit,
  • The Adam optimizer with a learning rate of 1e-5 instead of the default 1e-3.

This is the model that we are going to use:
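A sketch of this architecture in modern tf.keras is below. The filter counts, kernel sizes, resize target and dropout rate are assumptions; the post only fixes the structure (crop to 60x320x3, resize, normalize, 5 conv blocks, 3 dense layers, single output, Adam at 1e-5):

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

def build_model():
    """Sketch of the described CNN; layer sizes are assumptions."""
    model = models.Sequential()
    # Crop 70 px of sky and 30 px of hood: 160x320x3 -> 60x320x3.
    model.add(layers.Cropping2D(cropping=((70, 30), (0, 0)),
                                input_shape=(160, 320, 3)))
    # Shrink the image, LeNet-style (target size is an assumption).
    model.add(layers.Lambda(lambda x: tf.image.resize(x, (32, 128))))
    # Normalize pixel values to [-1, 1].
    model.add(layers.Lambda(lambda x: x / 127.5 - 1.0))
    # 5 convolution layers with batch normalization and ELU activation.
    for filters in (24, 36, 48, 64, 64):
        model.add(layers.Conv2D(filters, 3, strides=2, padding='same'))
        model.add(layers.BatchNormalization())
        model.add(layers.Activation('elu'))
    # Flatten plus 3 dense layers with ELU and dropout against overfitting.
    model.add(layers.Flatten())
    model.add(layers.Dense(100, activation='elu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(50, activation='elu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(10, activation='elu'))
    model.add(layers.Dense(1))  # single steering prediction
    # Adam with learning rate 1e-5 instead of the default 1e-3.
    model.compile(optimizer=optimizers.Adam(learning_rate=1e-5), loss='mse')
    return model
```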

So the input image shape (after cropping) is 60x320x3, and the output is just a steering prediction.

The total parameter count is more than 2 million, but that is not a big number if you use a GPU.

3.4. Train Model

We train the same model with and without a generator.

3.4.1. Train Model with Generator

Model training is a bit different when using a generator. (https://keras.io/models/sequential/)

# Instead of:
model.fit()
# use:
model.fit_generator(train_generator,
                    steps_per_epoch=len(train_samples),
                    validation_data=validation_generator,
                    epochs=1)

In the training phase I set 'steps_per_epoch=len(train_samples)', which means about 6500 images are used in this training run. If you want to train on more, say double that (nearly 13000 images), just use '2*len(train_samples)'. Remember that without a generator we have more than 14000 training images.

Because it takes more time to train with a generator, I set 'epochs' to 1 (epochs=1). Oh my God! It takes longer than I predicted: just one epoch takes more than 12 minutes with nearly 6500 images (in my first try it took 34 minutes!!!).

Since this is a regression problem, accuracy is meaningless; the aim is to minimize the loss.

3.4.2. Train Model without Generator

I set epochs to 50 (epochs=50). Each epoch takes nearly 20 seconds, and the total training time is nearly 17 minutes.

Again, since this is a regression problem, accuracy is meaningless; the aim is to minimize the loss.

4. Results After Training

To run the simulator with the trained model, run this command:

python drive.py model.h5

The result of the full training with preprocessing and data augmentation:

The result of the full training with no preprocessing (except the cropping and normalization in the model). It drives like a drunk car, and at the end it exits the route:

5. Conclusion

In this post we took the simulator and data prepared for the Udacity Self-Driving Car Nanodegree and trained a CNN on them as a regression problem.

We trained the model with and without a generator. Without the generator we used roughly twice as many images, yet an epoch took 20 seconds instead of the 720 seconds with the generator. That makes training without the generator nearly 72 times faster per image (sometimes it goes up to 200 times!!!).

This is also a 'time efficiency' versus 'memory (space) efficiency' trade-off. If you have enough memory to allocate and want effective GPU usage, choosing no generator is understandable. However, when the dataset is too huge to fit in memory, you have to use a generator.

With Generator

So with this data and model, using a generator with the GPU caused a bottleneck. Of course, it depends on the data size, the batches, the model, the generator function and the computer's CPUs and GPUs. This is just an example illustration; in reality it is more complex than that.

Without Generator

As you can see in the videos above, data preprocessing is very important. It is a good example of the ML saying, 'Garbage In, Garbage Out'.

Because there was not enough data, we did data augmentation with our own code, even though we know Keras has data augmentation functions.

Github : https://github.com/bahadir60/Udacity-Self-Driving-Car/tree/master/Behavioral_Cloning

If you like this post, please click 'Clap' below to support it.
