Behavioral Cloning Project of the Self-Driving Car Nanodegree

Here I’ll go over how I made the car drive around both tracks of the Udacity simulator as part of the Behavioral Cloning project of the Udacity Self-Driving Car Nanodegree program. I took a minimalistic approach: use only data from track1 for training, yet make sure the model generalizes well enough to drive around track2 as well.


Model Architecture

We used the same model as Nvidia’s End To End driving paper. The model takes a 66 x 200 image and predicts the desired steering angle of the car. In summary, the model has 5 convolution layers followed by 5 fully connected (FC) layers, about 1.6 million parameters altogether. We added dropout with probability 0.5 to the first 3 FC layers.
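As a sanity check on the architecture, the convolutional feature-map sizes for a 66 x 200 input can be worked out in a few lines of Python. This assumes the convolution configuration from the Nvidia paper (three 5x5 convolutions with stride 2, then two 3x3 convolutions with stride 1, all with "valid" padding); the fully connected part of our model differs from the paper as described above.

```python
# Feature-map size after a 'valid' convolution: floor((n - k) / s) + 1
def conv_out(n, k, s):
    return (n - k) // s + 1

# (kernel, stride, filters) for the five conv layers in the Nvidia model
layers = [(5, 2, 24), (5, 2, 36), (5, 2, 48), (3, 1, 64), (3, 1, 64)]

h, w = 66, 200
for k, s, f in layers:
    h, w = conv_out(h, k, s), conv_out(w, k, s)
    print(f"{f} feature maps of size {h} x {w}")

# The last conv layer outputs 64 maps of 1 x 18, i.e. 1152 values
# flattened and fed into the fully connected layers.
```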


Training Data

  1. Udacity Dataset: Primarily we used the sample training data provided by Udacity.
  2. Sharp Turn Dataset: We created a small dataset to train the car to take sharp turns.

Both these datasets are generated from track1. As mentioned, my goal was to train only on track1 and yet generalize to track2.

Data Analysis

Each dataset has images captured from 3 cameras (left, center and right) and the corresponding steering angle.

The Udacity dataset has 8037 data points. Each data point is a steering angle and the corresponding 3 images captured by the 3 cameras. The Udacity dataset captures a near-perfect driving scenario on track1, where the car is mostly driving straight. This is obvious from the histogram of the steering angles: most of the steering angles are zero.

Histogram of steering angle for the Udacity Data
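The skew can be verified directly with numpy. Here the `angles` array is a hypothetical stand-in for the steering-angle column of the simulator's driving log; the values and bin choices are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for the steering-angle column of the driving log
angles = np.array([0.0, 0.0, 0.0, 0.0, 0.1, -0.1, 0.0, 0.3, 0.0, -0.2])

# Bin the angles across the full [-1, 1] steering range
counts, edges = np.histogram(angles, bins=5, range=(-1.0, 1.0))

# Fraction of exactly-zero angles: the dominant bar in the real histogram
zero_fraction = np.mean(angles == 0.0)
print(counts)
print(zero_fraction)
```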

The Udacity data is good enough to train the model to drive around track1. But the car needs to take some sharp turns on track2, which a model cannot learn from the Udacity data. That’s why we created the sharp turn dataset, where the car drives away from a curb or fence. In particular, we positioned the car very close to a curb or fence and then started recording while it took a sharp turn to drive away from it.

The total number of data points in this dataset is 738. The full video of this training data is as follows, and the dataset is available for download.

Compare this histogram with the histogram of the steering angles for the Udacity dataset. In this dataset, almost 50% of the data points have steering angles close to -1.0 or 1.0.

Solution Design Approach

  1. Baseline
  • Only the center images of the Udacity dataset are used
  • The only preprocessing is resizing the image to 66x200

The MSE loss observed is as follows


  • The loss decreases very slowly. This is because the pixel values of the input image range between 0 and 255 while the activation of the final layer is tanh. Large input values push the output close to -1.0 or 1.0, where the derivative is close to zero. So at each iteration the parameters are updated by only small amounts, and convergence is slow.
  • The training loss fluctuates a lot. The model is very unstable.
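The saturation effect is easy to check numerically: the derivative of tanh is 1 - tanh²(x), which collapses toward zero for inputs on the scale of raw pixel values (a small numpy sketch):

```python
import numpy as np

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2

print(tanh_grad(0.5))    # healthy gradient for a normalized-scale input
print(tanh_grad(100.0))  # vanishing gradient for a raw-pixel-scale input
```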

Driving Status:

The car goes off the road within a few seconds of driving. It is not able to take any turn.

2. Image Cropping and Normalization

Image Cropping: The upper part of the image shows the sky and trees, which are not relevant to driving decisions. On the other hand, a few rows at the bottom of the image mostly show the bonnet of the car. So we crop the input image to rows 56 to 150:

img_crop = img[56:150, :, :]

Image Normalization: We apply per-channel normalization by subtracting the mean and dividing by the standard deviation of each channel. This brings most pixel values within -1.0 to 1.0, addressing the problem of large input values we saw in the baseline.

import numpy as np

def normalize_image(img):
    # Per-channel normalization: subtract the channel mean, divide by the
    # channel std. means and std have shape (3,); numpy broadcasting
    # applies them to every pixel of the corresponding channel.
    means = np.mean(img, axis=(0, 1))
    std = np.std(img, axis=(0, 1))
    return (img - means) / std

After applying cropping and normalization, the MSE loss vs epoch curve looks as follows


  • The loss decreases much faster than the baseline. The validation loss is below 0.010 within 10 epochs; for the baseline it was still above 0.010 even after 30 epochs
  • The loss is lower than the baseline: we achieved a validation loss below 0.010.
  • The validation loss starts increasing after epochs 16–17 while the training loss continues to decrease, so the model is overfitting the training data.

Driving Status:

  • The car has now learnt to take a few turns
  • It is able to cross the bridge on track1, but after that it goes off the road

Overall, very good progress from the baseline.

3. Use Left/Right Images

In the above experiments, we faced the problem of overfitting. One way to address overfitting is to include more training data. In particular, including lots of noisy data in the training set helps reduce the effect of overfitting.

As mentioned earlier, when recording training data the images are captured using 3 cameras: left, center and right. During actual driving in the simulator, only the center camera images are available. But we can train the model using images from all 3 cameras even though actual driving will be done using only the center camera images. In fact, since the left and right images look a little different from the center image, they are perfectly suited as noisy training examples to reduce overfitting.

Directly using the left and right images as training data is not a good idea, as then the model would be trained mostly on different types of images than what it will see during actual driving. Instead, we shift the steering angle by +0.25 for the left images and by -0.25 for the right images. The idea is that the shifted angle is the one the car should apply if its center camera saw what the left or right camera currently sees. For example, we assume that when the car is turned an additional +0.25 radians towards the right, its center camera will see the same image as the left camera sees in the current position.

After this augmentation we effectively triple our dataset. The Udacity data has 8037 data points, about 7K of which are in the training set after the train-validation split. After adding the left/right images, the training set contains about 21K images. Note that we didn’t change the validation set: we continue to use only the center images for validation.
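A minimal sketch of this augmentation, assuming each log row carries (center, left, right) image paths and a steering angle. The ±0.25 correction is applied as described above; clipping the result to the valid [-1, 1] steering range is our own assumption.

```python
CORRECTION = 0.25  # steering shift for left/right camera images

def augment_row(center_img, left_img, right_img, angle):
    """Turn one log row into three (image, steering) training samples."""
    clip = lambda a: max(-1.0, min(1.0, a))
    return [
        (center_img, angle),                     # unchanged center sample
        (left_img,   clip(angle + CORRECTION)),  # steer right, back toward center
        (right_img,  clip(angle - CORRECTION)),  # steer left, back toward center
    ]

samples = augment_row("center.jpg", "left.jpg", "right.jpg", 0.1)
print(samples)
```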

The MSE loss after including the left and right images is as follows

MSE Loss at 30 epochs
MSE loss at 50 epochs


  • We achieved a lower validation loss than before; the validation loss is now below 0.010
  • The validation loss is more stable and the impact of overfitting is less noticeable.
  • The validation loss still increased slightly from epoch 15 to 30, so there is still some overfitting going on

Driving Status:

  • The car is now able to successfully drive around track1.
  • It still fails to drive around track2: the car hits the rocks at the side on a sharp turn. Next we will train the car to take sharp turns.

4. Include Sharp Turn Data

We showed earlier that the Udacity data is heavily skewed towards a zero steering angle. That’s why we created the sharp turn data, which shows examples of quickly moving the car away from the curb or fence. Next we train the model including this sharp turn data.
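Combining the two datasets is just a concatenation of their sample lists, shuffled before the train-validation split. This is a sketch: the placeholder rows and the 90/10 split ratio are assumptions, and in our actual pipeline the validation set uses only center images.

```python
import random

# Placeholder rows standing in for (image path, steering angle) samples
udacity_samples = [("udacity_%d.jpg" % i, 0.0) for i in range(8037)]
sharp_turn_samples = [("sharp_%d.jpg" % i, 1.0) for i in range(738)]

# Concatenate and shuffle so both sources are mixed in every batch
combined = udacity_samples + sharp_turn_samples
random.shuffle(combined)

split = int(0.9 * len(combined))  # 90/10 train-validation split (assumption)
train, valid = combined[:split], combined[split:]
print(len(train), len(valid))
```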


  • The sign of overfitting is no longer obvious. We can train the model for a large number of epochs (typically 50 to 70). Training longer generates a more stable model that does not vary greatly between different runs.
  • The validation loss is higher than what we saw earlier. This is because we have included many large-steering-angle data points in the validation set

Driving Status:

  • The model is able to successfully drive around both track1 and track2

Future Work

The car is still not able to drive in the presence of shadows. We need to try image augmentation by changing brightness and adding shadows.
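A possible starting point, sketched with numpy only (a real pipeline might instead jitter brightness in HSV space via OpenCV): scale all pixels by a random factor and darken a random band of columns to imitate a cast shadow. All parameter ranges here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_brightness(img):
    # Scale all pixels by a random factor in [0.5, 1.3], clipping to valid range
    factor = rng.uniform(0.5, 1.3)
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def random_shadow(img):
    # Darken a random vertical band to imitate a shadow falling across the road
    h, w = img.shape[:2]
    x0, x1 = sorted(rng.integers(0, w, size=2))
    out = img.astype(np.float32)
    out[:, x0:x1] *= 0.5
    return np.clip(out, 0, 255).astype(np.uint8)

# Apply both augmentations to a random stand-in image
img = rng.integers(0, 256, size=(66, 200, 3), dtype=np.uint8)
aug = random_shadow(random_brightness(img))
print(aug.shape, aug.dtype)
```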