Behavioral Cloning For Self Driving Cars

Mojtaba Vàlipour
8 min read · Jan 9, 2017


Project Goals

In this project we want to design and develop a deep learning model that mimics driving behavior. The inputs are camera images and the outputs are control commands such as the steering angle. For simplicity, we predict only the steering angle.

Simulator

You can download the simulator for various platforms here:

Once you’ve downloaded it, extract it and run it.

My Objectives and Goals

  • Limited Model Size
  • Utilize Small Network
  • Works for both tracks

Data

First I generated a large dataset from the simulator using my keyboard, but that dataset was far from perfect and contained a lot of noisy samples, so the resulting model was not good. I then wrote code to filter the dataset and keep only the best samples for training.
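For reference, this is roughly how the drivingLog frame used in the snippets below can be loaded with pandas; the column names are assumptions based on the simulator’s driving_log.csv layout, not necessarily my exact code:

# Load the simulator log into a pandas DataFrame (column names assumed)
import pandas as pd

columns = ['Center', 'Left', 'Right', 'Steering Angle', 'Throttle', 'Brake', 'Speed']
drivingLog = pd.read_csv('driving_log.csv', names=columns)
print(len(drivingLog), 'samples loaded')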

%matplotlib inline
import time
import pylab as pl
import matplotlib.pyplot as plt
from IPython import display

# Step through the driving log and display every 20th center image with its label
for idx in range(10, len(drivingLog), 20):
    test = drivingLog['Center'][idx]
    testLabel = drivingLog['Steering Angle'][idx]
    image = loadImg(test, True)
    pl.imshow(image, cmap=plt.get_cmap('gray'))
    display.clear_output(wait=True)
    display.display(pl.gcf())
    print(testLabel, idx)
    time.sleep(0.02)

However, I found Udacity’s dataset useful for training and for achieving better performance, so I used their data for my final model. Here you can download the train and test data that I used to train that model.

Data Visualization

The first step in training a model on a specific dataset is always to visualize the dataset itself. There are many visualization techniques, but I chose the most straightforward option here.

This figure shows steering angle, throttle, brake and speed for the training set. In the following figures you can see more information about the training set.

Train Set Features Visualization
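For completeness, here is a minimal sketch of how such an overview can be plotted, assuming the drivingLog frame from above with ‘Steering Angle’, ‘Throttle’, ‘Brake’ and ‘Speed’ columns (not necessarily my exact plotting code):

# Plot each recorded feature of the training log as its own subplot
import matplotlib.pyplot as plt

features = ['Steering Angle', 'Throttle', 'Brake', 'Speed']
fig, axes = plt.subplots(len(features), 1, figsize=(12, 8), sharex=True)
for ax, name in zip(axes, features):
    ax.plot(drivingLog[name].values)
    ax.set_ylabel(name)
axes[-1].set_xlabel('Sample index')
plt.show()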

In the test set track 2 was used to validate the model trained on track 1, and as you can see in the results, the model performed very well even though it hadn’t seen this track before.

Test Set Features Visualization

In the following figure I show some examples from the dataset after loading an image and applying some pre-processing, like cropping and other common techniques. These are the same images used as input for training the model.

Sample Visualization
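As an illustration only, this is the kind of cropping and resizing meant above; the exact crop rows and target size here are placeholders, not my final values:

# Illustrative pre-processing: crop away the sky and the hood, then resize
import cv2

def preprocessImg(image, targetSize=(64, 64)):
    cropped = image[60:140, :, :]              # drop the sky (top) and the car hood (bottom)
    resized = cv2.resize(cropped, targetSize)  # shrink to the network input size
    return resized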

Preprocess

As you can see in the figures above, the main problem here is an imbalanced dataset. A model trained on this data will simply over-fit, tend to predict zero and drive straight ahead, so we really need to address this problem. At first I looked for a threshold and probability parameter to force the generator to produce a balanced dataset for the model, but controlling those parameters was hard, so I switched to a simpler approach: I filtered the dataset and kept only 25 percent of the zero-steering samples. There is a trade-off here: depending on the network, keeping fewer or more zero samples leads to more spiky or more straight driving. I also designed the generator to augment the data whenever it is called, and I changed many of my parameters when I saw better approaches from other people. To check the final distribution of the data, I used a simple test: I called my generator for a while, collected the labels, and plotted a histogram. You can see the result here:

The Distribution of samples from Train Generator
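A minimal sketch of the zero-steering downsampling described above (the column name and the exact zero threshold are assumptions):

# Keep only about 25 percent of the (near-)zero-angle rows, keep everything else
import numpy as np

angles = drivingLog['Steering Angle'].values
isZero = np.abs(angles) < 1e-3                            # "driving straight" samples
keepMask = ~isZero | (np.random.rand(len(angles)) < 0.25) # keep 25% of the zero samples
balancedLog = drivingLog[keepMask].reset_index(drop=True)
print(len(drivingLog), '->', len(balancedLog), 'samples after filtering')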

As you can see, the figure above shows a roughly normal distribution, which is what we want: the model should see steering angles drawn from something like a Gaussian distribution so that it learns each steering angle and its importance. So it is really important to augment your data in a way that produces a fair dataset. I was contemplating how to do this, but Vivek showed me the right way. Here you can find the code:

import cv2
import numpy as np

# Randomly translate the image and correct the steering angle accordingly
rows, cols, _ = image.shape
transRange = 100   # horizontal shift range (up to +/- 50 pixels)
numPixels = 10     # vertical shift range (up to +/- 5 pixels)
valPixels = 0.4    # steering correction at the maximum horizontal shift
transX = transRange * np.random.uniform() - transRange / 2
steeringAngle = steeringAngle + transX / transRange * 2 * valPixels
transY = numPixels * np.random.uniform() - numPixels / 2
transMat = np.float32([[1, 0, transX], [0, 1, transY]])
image = cv2.warpAffine(image, transMat, (cols, rows))

I can say this is the most important part of the augmentation code. In the code above we generate a new image from the current one using a random horizontal shift of up to ±50 pixels, correcting the steering angle proportionally (up to ±0.4 for a full shift), plus a small vertical shift of a few pixels. Limiting the shifts is important because we don’t want to show the model something impossible, or a mostly black image.

To show the network equal amounts of left and right turns, I flip the image with 50 percent probability in the generator and change the sign of the steering angle; otherwise the image keeps its original label.

if np.random.rand() > 0.5:              # 50 percent chance to flip the image
    image = cv2.flip(image, 1)          # horizontal flip
    steeringAngle = -steeringAngle      # mirror the label as well

It is also important for the network to have some idea about lighting, so I used two approaches to show the network more images with different brightness. First I tried to normalize the brightness and find a brightness-invariant image using computer vision techniques (future to-do: a better implementation), but the easier approach was to add a random value to the V channel of the HSV image, which is very common in the vision community.
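A minimal sketch of that HSV brightness trick (the scaling range here is just an example, not my exact values):

# Random brightness augmentation on the V channel of the HSV image
import cv2
import numpy as np

def randomBrightness(image):
    hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV).astype(np.float32)
    scale = 0.4 + 0.8 * np.random.uniform()           # random factor in [0.4, 1.2)
    hsv[:, :, 2] = np.clip(hsv[:, :, 2] * scale, 0, 255)
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2RGB)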

After applying the augmentations above, the generator produces samples similar to the following figure:

Augmented Samples

Architecture

To design the model, I started from a very simple network. One of my objectives was to design a network that is very small yet powerful enough to solve this problem. I really needed to visualize the output of the model’s layers to see what the network can see, so I wrote code to visualize a layer’s output in Keras, which is very simple because Keras already implements most of what is needed. The outputs helped me judge which network is better, and this visualization helped me arrive at the final architecture.

Here you can see the model and the output of each layer:

My Model for Project 3
# visualize model layers output
from keras import backend as K
import matplotlib.pyplot as plt

# Build a Keras backend function that returns the output of a chosen layer (here layer 2)
layerOutput = K.function([model.layers[0].input, K.learning_phase()], [model.layers[2].output])
idx = 4000 # this can be anything or even random
test = drivingLog['Center'][idx]
testLabel = drivingLog['Steering Angle'][idx]
image = loadImg(test, True)
# output in test mode = 0, train mode = 1
layerOutputSample = layerOutput([image.reshape(1, image.shape[0], image.shape[1], image.shape[2]), 1])[0]
layerOutputSample = layerOutputSample.reshape(layerOutputSample.shape[1], layerOutputSample.shape[2], layerOutputSample.shape[3])
print(layerOutputSample.shape)

# Show every feature map of the layer in a grid of subplots
figure = plt.figure(figsize=(24, 8))
factors = [8, 4] # subplot grid: 8 rows x 4 columns
for ind in range(layerOutputSample.shape[2]):
    figure.add_subplot(factors[0], factors[1], ind + 1)
    val = layerOutputSample[:, :, ind]
    plt.axis("off")
    plt.imshow(val, cmap='gray', interpolation='nearest')

In the following figures you can see the output of the first, middle and last layers of the convolutional part of my network for a specific sample:

Output of the first Convolution Layer After the convolutional color space changer
Middle Convolution Layer Output

My network has 345,645 trainable parameters, which is very good: you can fine-tune this network live even with very limited resources.

The output of the last convolution layer
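To give a sense of what a network of this size can look like, here is a purely illustrative sketch written against the Keras 1 API used elsewhere in this post. The 1×1 convolution plays the role of the color space changer; the input size, filter counts and dense sizes are placeholders, not the exact layers of my model:

# Purely illustrative small network in the same spirit (placeholder sizes)
from keras.models import Sequential
from keras.layers import Lambda, Convolution2D, Flatten, Dense

model = Sequential()
model.add(Lambda(lambda x: x / 127.5 - 1.0, input_shape=(64, 64, 3)))  # normalize pixels to [-1, 1]
model.add(Convolution2D(3, 1, 1, border_mode='same'))                  # learned color space changer
model.add(Convolution2D(16, 3, 3, subsample=(2, 2), border_mode='same', activation='relu'))
model.add(Convolution2D(32, 3, 3, subsample=(2, 2), border_mode='same', activation='relu'))
model.add(Convolution2D(64, 3, 3, subsample=(2, 2), border_mode='same', activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(1))                                                     # single output: the steering angle
model.summary()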

Train Process

The hardest part was waiting for the training process, especially since I had no access to a powerful GPU. I used the Adam optimizer, and this is the code I used to train my model in Keras:

from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

numTimes = 10   # number of training rounds, each with a new generator threshold
numEpoch = 10
thr = 0.0001    # threshold passed to the generator to control zero-angle sampling
for t in range(numTimes):
    trainGenerator = generateBatch(XTrain, yTrain, batchSize=50, threshold=thr)
    validGenerator = generateBatchVal(XVal, yVal, batchSize=50)
    samplesPerEpoch = 32000 # len(yTrain)
    nbValSamples = 1000
    history = model.fit_generator(trainGenerator, samples_per_epoch=samplesPerEpoch,
                                  nb_epoch=numEpoch, validation_data=validGenerator,
                                  nb_val_samples=nbValSamples,
                                  callbacks=[ModelCheckpoint(filepath="bestVal" + str(t) + ".h5", verbose=1, save_best_only=True),
                                             ReduceLROnPlateau(monitor="val_loss", factor=0.2, patience=2, min_lr=0.000001)])
    print(thr, 'Time ', t + 1)
    thr += 1.0 / numTimes
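The compile step is not shown above; a minimal version with the Adam optimizer could look like this (the learning rate is just an example):

# Compile the model for steering-angle regression with the Adam optimizer
from keras.optimizers import Adam

model.compile(optimizer=Adam(lr=1e-4), loss='mse')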

Results

First, I kept only the models that were good enough on the validation set (above a threshold). To evaluate a trained model, I wrote code that shows me a sequence with the ground-truth and predicted steering angles, so I can judge the model manually before running it in the simulator. This approach works well if you look at the dataset in its original time order. Here you can see the output of this approach for the train and test sets:

Sequence Result for Train Set
Sequence Result for Test set
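A minimal sketch of this sequence check, assuming the loadImg helper and drivingLog frame from earlier and the same preprocessing as during training:

# Predict over an ordered slice of the log and plot prediction vs ground truth
import matplotlib.pyplot as plt

start, length = 1000, 200          # arbitrary ordered slice of the driving log
groundTruth, predicted = [], []
for idx in range(start, start + length):
    image = loadImg(drivingLog['Center'][idx], True)
    pred = model.predict(image.reshape((1,) + image.shape), batch_size=1)[0][0]
    predicted.append(pred)
    groundTruth.append(drivingLog['Steering Angle'][idx])

plt.figure(figsize=(12, 4))
plt.plot(groundTruth, label='ground truth')
plt.plot(predicted, label='predicted')
plt.xlabel('Frame')
plt.ylabel('Steering angle')
plt.legend()
plt.show()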

Here you can see the results and performance of my model on tracks 1 and 2; please click on the images if you want to see the videos:

My model performance for Track 1
My model performance for Track 2

How to run?

Here you can find my github repo. for this project: https://github.com/mvpcom/Udacity-CarND-Project-3

If you want to use Docker, you can use the following command:

docker run -it -p 4567:4567 -v ~/sdcnd/behavioral-cloning:/home/dockeruser budmitr/sdcnd-term1-cpu /bin/bash

Dockerfiles for CPU and GPU machines for SDCND can be found in this Github repository. This tutorial by Dmitrii Budylskii has detailed explanation of how to use these images.

It is easy and straightforward: just make sure everything is installed and run this line:

python drive.py model.json

This is an alternative method, useful if you want to train the model live and fine-tune it. Here you can find more information about the live trainer, and there is a great Medium post from Thomas Anthony about it. The only thing you need is to download everything from my repository; there is no need to install any separate package for the live trainer. Just use this command:

python liveTrain.py model.json

Acknowledgements

I really want to thank Todd Gore, Vivek Yadav, Thomas Anthony and Dmitrii Budylskii for their help during this project. I learned new things from them and they shared their resources with me, especially Todd, who let me use his computer.
