Teaching a car to mimic your driving behaviour

A demonstration of using deep neural networks for behavioral cloning

Subodh Malgonde
5 min read · Jan 10, 2017

This post is about a very interesting project I did as part of Udacity's Self-Driving Car Nanodegree. The task was to train a car to drive by itself in a simulated environment (like a video game). The idea is to first use the simulator in manual mode, where it records data like camera images, steering angle, speed, etc. as you drive down the track. You then use this data to train a deep neural network to drive the car when given a new input feed from the cameras. To keep things simple, only the steering angle needed to be predicted.

Screenshot of the simulator used in this project

If you just want to jump straight to the demo video, here is the YouTube link.

About the setup

The car is equipped with 3 cameras at the front, one each on the left, right and center. The left and right cameras point straight ahead, along the length of the car.

Images from 3 cameras

Each image is a 160 by 320 pixel color image. Here are some sample images recorded by the cameras.

Sample images from the dataset

Training for recovery

As you can imagine, if you record the training data driving through the center of the track, then most of your recorded steering angles will be close to zero. This data will have a bias towards driving straight. Since the training track has more left turns than right, it will also have a slight bias towards turning left.

To overcome this bias you could employ 2 approaches. One approach is to deliberately drive towards the edge of the track and record the data while you make the car recover to the center of the track. This requires high precision: one needs to ensure that only the recovery is recorded, not the part where you drive towards the edge. It sounds easy but is very difficult to execute in practice. I tried recording recovery data multiple times but could only manage to make the car wobble and veer off the track.

The other approach is to use the images from the left and right cameras to simulate recovery. You can do this because the left and right camera images look like laterally shifted views of the road ahead. The main idea is that the car would have to move right to bring the left camera's view to the center, and move left for the right camera's view. So you can take the left camera image and add a constant value to its steering angle (positive steering implies turning right), and do the opposite for the right camera image. This approach is well demonstrated in this paper by NVIDIA. It is simple, can be implemented in code, and it worked really well for me!
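To make this concrete, here is a minimal sketch in Python. It assumes the driving log is available as rows with 'center', 'left' and 'right' image paths plus a recorded 'steering' value; the correction of 0.25 is an assumed starting point to be tuned, not a prescribed constant:

```python
STEERING_CORRECTION = 0.25  # assumed value; tune empirically

def samples_from_log_row(row):
    """Expand one driving-log row into three (image_path, steering) samples."""
    angle = float(row["steering"])
    return [
        (row["center"], angle),
        # The left camera sees the car as too far left, so the recovery
        # action is to steer right (positive steering).
        (row["left"], angle + STEERING_CORRECTION),
        # The right camera sees the car as too far right, so steer left.
        (row["right"], angle - STEERING_CORRECTION),
    ]
```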

Getting more data

One of the issues with deep neural networks is that they require a lot of training data; for complex problems it can be millions of data points. When trained on a small dataset, the network tends to overfit, i.e. it fails to generalise to new data. To get more data, one can either record more training data by driving more laps on the track, or generate new data by augmenting the existing data.

The first approach has its limitations, since a human can only drive so many laps without introducing errors into the dataset. The second approach, augmentation, is simple, can be implemented in code, and lets you generate a practically infinite amount of training data. I have to thank this fantastic blog post for providing ideas for augmentation techniques for this problem.
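To illustrate how augmentation produces practically unlimited data, here is a minimal sketch of a training batch generator. load_image and augment are hypothetical helpers; augment would combine the techniques described next:

```python
import random
import numpy as np

def training_generator(samples, batch_size=64):
    # samples: list of (image_path, steering) pairs from the driving log.
    # Every batch draws random samples and applies random augmentation,
    # so the network rarely sees exactly the same image twice.
    while True:
        batch = random.sample(samples, batch_size)
        images, angles = [], []
        for path, angle in batch:
            image = load_image(path)              # hypothetical image loader
            image, angle = augment(image, angle)  # hypothetical augmenter (sketched below)
            images.append(image)
            angles.append(angle)
        yield np.array(images), np.array(angles)
```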

I used only 2 augmentation techniques, since they were sufficient for this problem:

  1. Flipping the images horizontally
    Since the dataset has many more images of the car turning left than right (because the track has more left turns), you can flip images horizontally to simulate turning right, reversing the sign of the corresponding steering angle.
  2. Brightness adjustment
    Here you randomly adjust the brightness of the image to simulate driving in different lighting conditions.
Augmentation Techniques

The sub-image on the top left in the above image is from the center camera. The one to its right is from the left camera, with an adjusted steering angle. The one on the bottom left is the horizontally flipped version of the center camera image, with its steering angle reversed. The last one is a randomly brightness-adjusted image.
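For reference, here is a minimal sketch of the two augmentations using OpenCV and NumPy, assuming RGB input images; it could serve as the hypothetical augment() helper from the generator sketch above:

```python
import cv2
import numpy as np

def flip_horizontal(image, angle):
    # Mirroring the image turns a left curve into a right curve,
    # so the steering angle changes sign.
    return cv2.flip(image, 1), -angle

def random_brightness(image):
    # Scale the V channel in HSV space by a random factor to simulate
    # different lighting conditions.
    hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV).astype(np.float32)
    hsv[:, :, 2] = np.clip(hsv[:, :, 2] * np.random.uniform(0.4, 1.2), 0, 255)
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2RGB)

def augment(image, angle):
    # Flip with 50% probability, then always randomize brightness.
    if np.random.rand() < 0.5:
        image, angle = flip_horizontal(image, angle)
    return random_brightness(image), angle
```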

Training the Network

I used a 5-layer neural network for this project. The technical details of my neural network and code can be found in this GitHub repository.
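For illustration only, a small five-layer Keras model (three convolutional layers, two fully connected layers, plus a normalization step) might look like the sketch below. The layer sizes here are assumptions on my part; the exact architecture is in the repository:

```python
from keras.models import Sequential
from keras.layers import Lambda, Conv2D, Flatten, Dense

model = Sequential([
    # Normalize pixel values from [0, 255] to [-1, 1]
    Lambda(lambda x: x / 127.5 - 1.0, input_shape=(160, 320, 3)),
    Conv2D(16, (5, 5), strides=(2, 2), activation="relu"),
    Conv2D(32, (5, 5), strides=(2, 2), activation="relu"),
    Conv2D(48, (3, 3), strides=(2, 2), activation="relu"),
    Flatten(),
    Dense(100, activation="relu"),
    Dense(1),  # single output: the predicted steering angle
])
# Predicting a steering angle is a regression problem, so mean squared
# error is a natural loss.
model.compile(optimizer="adam", loss="mse")
```

Training then amounts to feeding batches from a generator like the one above and minimizing the error between predicted and recorded steering angles.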

Before starting this project I had imagined that I would need powerful hardware, with GPUs and lots of RAM, to train my neural network. As it turned out, I could just use my MacBook, and it took only 15 minutes for the network to converge to a solution.

Network Performance

Here is the video of the car driving itself on the track. The numbers you see on the right are the steering angles and throttle values sent by the code to the car.

Evaluation Video

Performance on an unseen track

The true test of a neural network is how well it performs on unseen data, i.e. data not used in training. To evaluate this, the simulator had a second track which was very different from the one used for training: it was darker, had slopes (the first track was more or less flat), and had sharper turns and more right turns. The network had never seen data from this track. However, some of these differences were accounted for by the image augmentation techniques described above.

Here is a video of the car driving itself on the second track.

Code available at https://github.com/subodh-malgonde/behavioral-cloning
