Hacktorial #1: Building a Self-Driving Car with Deep Learning

Sujith Vishwajith
Jan 16, 2018
Google’s Self-Driving Car

This post is a written adaptation of the Terrapin Hackers Hacktorial presented at the University of Maryland. The original slides from the presentation are available to view here.

In this hacktorial, we will learn how to build deep learning models in Python to train a self-driving car from scratch using Udacity’s car simulator. This was originally a challenge from Udacity’s Self-Driving Car Engineer Nanodegree. Specifically, we will use an approach called end-to-end learning, which is discussed in more detail later on.

The Github repo containing all the code is located here.

Follow along on the Jupyter notebook Behavioral Cloning Self-Driving Car Guide which contains all the project code and explanations.

  1. Intro
  2. Goal
  3. Approach
  4. Installation
  5. Implementation

This hacktorial assumes you have some familiarity and experience with Python and Convolutional Neural Networks (CNNs). For those new to deep learning or computer vision, here are some good blog posts to give you a solid basis for understanding convolutional neural networks, max pooling, etc. Included is a sample tutorial that guides you through installing Keras and creating a CNN that recognizes handwritten digits.

Intro

Deep learning has been around since the 1980s as a research interest of Geoffrey Hinton, Yann LeCun, and many others. The combination of increased computational power and the large storage capacity of modern computers has allowed deep learning to achieve immense success. In fact, deep learning has made its way, in one form or another, into many of our everyday technologies, such as email, search engines, and hospitals.

From Left to Right: Yann LeCun, Geoff Hinton, Yoshua Bengio, Andrew Ng

One area where deep learning is making a huge contribution is computer vision and robotics. Without deep learning, scientists generally had to perform manual feature engineering, which is a fancy term for manually identifying all the features that might be relevant to the task at hand. As you can imagine, this was an extremely cumbersome and slow process with many limitations. With deep learning, however, computers can teach themselves which features are important for the task. In the case of image recognition, deep learning models learn which combinations of shapes and features belong to certain classes, which has allowed for a substantial improvement in accuracy. For more information on deep learning, check out some of our other posts.

Self-driving cars have always been a central vision of our future. With Google, Uber, and other tech companies making huge advancements in the past couple of years, it would not be surprising to see them on the road soon. Many approaches to building self-driving cars have been developed over the years. While the systems developed by the companies highlighted above are massive and complex, they often boil down to a few key algorithms and components. With the rapid democratization of AI, we too can harness the power of these algorithms and apply them to build our own self-driving cars (simulated... for now).

Goal

The end goal is to build a machine learning model that can steer a car around a track without any human intervention. The model should ideally work on any test track and from any starting location. We also want to avoid writing a lot of code, to keep everything simple. Our model should take in images of what the car sees and output the correct steering angle (and maybe even the throttle).

By utilizing deep learning, we should also be able to avoid manual feature engineering in the code. Manual feature engineering tends to have many edge cases where features are computed incorrectly or additional features are required.

Sample of our desired result. GIF from here.

Approach

Companies today use many different approaches to develop self-driving cars. We will briefly go through an overview of some of the popular ones and then select the one best suited to our goal (certain approaches are better than others for certain tasks). There are many other approaches that this article doesn’t discuss, such as Inverse Reinforcement Learning and Adversarial Networks.

Hard-Coded Rules and Features

This was one of the first approaches to developing autonomous vehicles and probably what we all imagined it to be like. The rules go something like this: If you see a stop sign, stop the car. If you see a pedestrian, stop the car. If you are in a lane, make sure you are in the center, and on and on. The major problem with this is that there are too many rules to keep track of and write. In some scenarios (like an impending accident), these rules don’t even apply. This approach also requires a lot of sensors to keep track of the car and its surroundings.

Good: You know exactly what the car is doing and why.

Bad: Too many rules and sensors required to keep track of.
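
To make the rule-based idea concrete, here is a minimal sketch in Python. The function name, its parameters, and the numeric values are all hypothetical and chosen only for illustration; a real system would need far more rules and sensor inputs, which is exactly the weakness described above.

```python
def decide(stop_sign_ahead, pedestrian_ahead, lane_offset):
    """Return a driving command from hand-written rules.

    lane_offset is a hypothetical sensor reading: distance from the lane
    centre, positive when the car is too far to the right.
    """
    # Rule 1 and 2: stop for stop signs and pedestrians.
    if stop_sign_ahead or pedestrian_ahead:
        return {"throttle": 0.0, "brake": 1.0, "steering": 0.0}
    # Rule 3: steer back toward the lane centre, clipped to [-1, 1].
    steering = max(-1.0, min(1.0, -0.5 * lane_offset))
    return {"throttle": 0.3, "brake": 0.0, "steering": steering}


command = decide(stop_sign_ahead=False, pedestrian_ahead=False, lane_offset=1.0)
```

Every new situation (traffic lights, cyclists, road works) means another branch in this function, so the code grows with the number of scenarios you can think of.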

Reinforcement Learning

This is one of the more popular approaches. It has existed for a while but was recently publicized by companies like DeepMind. The approach is an area of machine learning inspired by behaviourist psychology. It relies on letting an agent (the car) learn what actions to take based on a reward function (driving well). This approach often works extremely well and doesn’t require humans to teach the agent to drive. The issue is that, in order to learn, the agent has to experiment with millions of actions to find out which one gives it the best reward in each scenario. Since we can’t let a car roam around a city crashing into people and only learning from its mistakes afterwards, we are forced to train it in a simulator. One of the biggest problems here was summed up by George Hotz:

Simulations are doomed to succeed. — George Hotz

This means that, no matter what, our car will eventually succeed at driving in the simulation we give it, because the environment in the simulation is pre-programmed. This is a problem because we can never ensure that the simulation is realistic enough, given the amount of variance in the real world that we can’t control.

Good: Extremely good and optimal at driving in trained situations

Bad: Has to be trained in a simulator, and ‘simulations are doomed to succeed’.
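
The trial-and-error loop behind reinforcement learning can be sketched with a toy example. This is not a driving agent: the three actions, the reward function, and the epsilon-greedy strategy are assumptions made for illustration, and the "environment" is a single straight road where only going straight is rewarded.

```python
import random

ACTIONS = ["left", "straight", "right"]


def reward(action):
    """Toy reward: on a straight road, only going straight counts as driving well."""
    return 1.0 if action == "straight" else -1.0


def train(steps=200, epsilon=0.1, seed=0):
    """Learn an estimated reward for each action by trial and error."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}   # running estimate of each action's reward
    count = {a: 0 for a in ACTIONS}     # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)            # explore a random action
        else:
            action = max(ACTIONS, key=value.get)    # exploit the best estimate so far
        count[action] += 1
        # Incremental mean: move the estimate toward the observed reward.
        value[action] += (reward(action) - value[action]) / count[action]
    return value


values = train()
best_action = max(values, key=values.get)
```

Even in this tiny setting the agent has to try bad actions before it learns to avoid them, which is why a real car cannot be trained this way on public roads.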

End-To-End Learning (Behavioral Cloning)

This approach is probably the simplest approach of them all. The idea behind it is that neural networks are extremely good at learning patterns. The question we want to answer is

If we showed a car how to drive, could it learn to drive like us?

This approach to self-driving cars was recently successfully implemented by Nvidia in their paper End to End Learning for Self-Driving Cars. Since this paper is a core foundation of the rest of the tutorial, it’s recommended you give it a quick read.

The intuition behind this approach is that a user drives a car around for a long period of time, ideally in diverse scenarios (e.g. a city, a freeway, etc.), and collects data about their driving (usually images of what’s in front of the car, speed, acceleration, etc.). Then we feed the data into a neural network and let it learn which action you took for each data point. For example, if you consistently turned when the road was curved, the network would learn that an image of a road curving in front of you means it should turn in the direction of the curve. An issue here is that the model will always learn to drive like you drive. So if you decide to drive like an uber-aggressive New York driver, the model will too!

Good: Fast to train, easy to code, and generally works well.

Bad: Depends heavily on the training data and lacks interpretability.
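
At its core, behavioral cloning is ordinary supervised learning on recorded (observation, action) pairs. The toy sketch below stands in for the real thing: instead of images and a neural network it uses a single made-up "road curvature" number and a straight-line fit, and all the data values are invented for illustration.

```python
# Recorded "human" demonstrations: (road curvature, steering angle the driver used).
# These numbers are made up; in the real project the input is a camera image.
demos = [(-0.8, -0.41), (-0.4, -0.19), (0.0, 0.01), (0.3, 0.16), (0.7, 0.34)]


def fit_line(pairs):
    """Ordinary least squares fit of y = w * x + b."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    w = cov / var
    return w, mean_y - w * mean_x


w, b = fit_line(demos)


def policy(curvature):
    """The 'cloned' driver: predict the steering angle for a new observation."""
    return w * curvature + b
```

The fitted policy steers in the direction of the curve because the demonstrations did. If the demonstrations had been aggressive or sloppy, the fit would reproduce that too.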

Since our simulator lets us drive the car around different tracks ourselves and collect data from our run, the end-to-end learning approach works best for us.

Installation

Let’s get our hands dirty. First, let’s install everything we need in order to run and develop this project.

To run the models and code, make sure Python is installed. Ideally, do all of the setup inside a Conda environment to avoid dependency conflicts when running the model, but this is optional if you know what you are doing.

Clone the repo onto your local machine and cd into the directory.

git clone https://github.com/sujithv28/Behavioral-Cloning-Hacktorial.git
cd Behavioral-Cloning-Hacktorial

Here you can create the conda environment if you wish to use one. At the time of development, we used Python 2.7 but Python 3 should work.

conda create -n hacktorial python=2.7
source activate hacktorial

Install TensorFlow following the instructions for your system here. With conda you can install TensorFlow simply by running:

conda install -c conda-forge tensorflow

Install all the Python dependencies:

pip install -r requirements.txt

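Set Keras to use TensorFlow as its backend. Importing Keras once creates the config file ~/.keras/keras.json, which you can then open in an editor:

python -c "import keras"
nano ~/.keras/keras.json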

Change your values to look like this:

{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}

Also make sure you have the library opencv-python installed, either through pip or Homebrew. You can check that it works by running the following in Python and making sure there are no errors:

import cv2

You can download the simulator here.

Finally, download Udacity’s training data and extract it into the main directory by running the following commands there (optional, but recommended at first).

wget https://www.dropbox.com/s/3cwc2atg1qorzg4/data.zip?dl=0
unzip -a data.zip?dl=0
rm data.zip?dl=0

If you wish to use your own training data, open the simulator and select any track and the training option. All the data will be saved to the location you specify. Follow the instructions here to learn more on how to use the simulator.

After this point you should be good to go for developing the model. If you encounter any issues feel free to check Stack Overflow or comment on the post below.

Simulator Overview

Udacity Simulator

The simulator has two options upon opening it: training mode and autonomous mode. To test a model, use autonomous mode, which takes inputs from a program and lets that program drive the car. If you want to collect your own training data or drive the car for fun, select training mode. The data is saved as a CSV file in which each row corresponds to a single frame from the training run. Each row contains paths to three images from the car’s perspective (center, left, and right) along with the steering angle, throttle, brake, and speed recorded at that frame.
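
As a rough sketch of how such a log could be read in Python, the snippet below parses each row into a dictionary. The column order (center, left, right, steering, throttle, brake, speed), the optional header row, and the helper name load_log are assumptions, so check them against your own CSV file. The sample data is an in-memory stand-in, not real simulator output.

```python
import csv
import io

# Assumed column order of the simulator's driving log; verify against your file.
COLUMNS = ["center", "left", "right", "steering", "throttle", "brake", "speed"]


def load_log(file_obj):
    """Parse a driving log into a list of dicts with numeric fields as floats."""
    rows = []
    for raw in csv.reader(file_obj):
        # Skip blank lines and a header row, if the file has one.
        if not raw or raw[0].strip() == "center":
            continue
        row = dict(zip(COLUMNS, (field.strip() for field in raw)))
        for key in ("steering", "throttle", "brake", "speed"):
            row[key] = float(row[key])
        rows.append(row)
    return rows


# Tiny in-memory sample standing in for a real log file.
sample = io.StringIO(
    "center,left,right,steering,throttle,brake,speed\n"
    "IMG/center_1.jpg, IMG/left_1.jpg, IMG/right_1.jpg, -0.05, 0.9, 0, 30.1\n"
)
frames = load_log(sample)
```

For a real run you would pass an open file instead of the StringIO object, for example `load_log(open(path))` with `path` pointing at the CSV in your data folder.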

Sample training images of left, center, and right cameras from a random frame.

Implementation

The file template.py is a starting template for the hacktorial and the only code we will modify. The finished code is available for reference in model.py. The file drive.py is from Udacity and contains the code needed to connect the model trained in model.py to the simulator.

The training data comes only from the first track. The second track, which has a different setting, is meant to be used as a test track to see how well your model generalizes.

All the code with explanations is located in the Jupyter notebook Behavioral Cloning Self-Driving Car Guide.

Data Augmentation

Augmented Images from the original Image

Data augmentation is a strategy used by many data scientists to improve performance. By flipping a picture of a road turning right and negating the steering angle, we now have an example of a road turning left. We can apply more tricks such as tinting the image to look like you are driving at night time, and even jittering the image so it looks like you are on a different spot on the road (although it is important to adjust the steering angle accordingly here).
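
Here is a minimal NumPy sketch of the three tricks mentioned above. The function names and the `angle_per_pixel` constant are assumptions for illustration and are not taken from the project code; the images are assumed to be arrays of shape (height, width, channels) with uint8 values.

```python
import numpy as np


def flip(image, angle):
    """Mirror the image left-to-right and negate the steering angle."""
    return image[:, ::-1], -angle


def darken(image, factor=0.4):
    """Scale pixel values down to mimic low light; the angle is unchanged."""
    return (image.astype(np.float32) * factor).astype(np.uint8)


def shift(image, angle, pixels, angle_per_pixel=0.004):
    """Shift the image horizontally and adjust the steering angle to match.

    angle_per_pixel is an assumed tuning constant, not a value from this post.
    """
    shifted = np.zeros_like(image)
    if pixels > 0:
        shifted[:, pixels:] = image[:, :-pixels]    # content moves right
    elif pixels < 0:
        shifted[:, :pixels] = image[:, -pixels:]    # content moves left
    else:
        shifted = image.copy()
    return shifted, angle + pixels * angle_per_pixel


# A tiny fake image so the functions can be tried without real data.
image = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)
flipped, flipped_angle = flip(image, 0.25)
```

Each function returns a new training example from an existing one, so a single recorded lap can be turned into several times as much data.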

Model architecture

The model we implemented is based on Nvidia’s model described in their paper End to End Learning for Self-Driving Cars. The differences between their model and ours are that we add a max pooling layer after every convolutional layer to speed up computation and that we use more fully-connected (dense) layers. We also add BatchNormalization to speed up training.

Nvidia’s End to End Learning model

The model will take in an image (either from the left, center, or right camera) and output a single number between -1 and 1 representing the steering angle.
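
To get a feel for how the convolutional layers shrink the image, the sketch below works through the layer sizes of the original Nvidia architecture (a 66x200 input, three 5x5 convolutions with stride 2, then two 3x3 convolutions with stride 1, all without padding). It covers only the paper's layers; our variant adds max pooling, so its intermediate sizes will differ.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution along one axis."""
    return (size - kernel) // stride + 1


# (filters, kernel size, stride) for the five convolutional layers in Nvidia's paper.
CONV_LAYERS = [(24, 5, 2), (36, 5, 2), (48, 5, 2), (64, 3, 1), (64, 3, 1)]

height, width = 66, 200   # input image size used in the paper
shapes = []
for filters, kernel, stride in CONV_LAYERS:
    height = conv_out(height, kernel, stride)
    width = conv_out(width, kernel, stride)
    shapes.append((height, width, filters))

# Number of values handed to the first fully-connected layer.
flattened = height * width * filters
```

The last convolution leaves a 1x18 feature map with 64 channels, so the dense layers only have to map about a thousand numbers down to one steering angle.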

We used Keras to implement the model. Keras is a deep learning library that runs on top of TensorFlow or Theano and is well suited to fast prototyping. For more intensive deep learning tasks, PyTorch or TensorFlow may be a better option.

Training the model

To train the model and save the parameters, simply run:

python model.py

Testing the model

To run your trained model on the simulator, open up the simulator application and start an autonomous session on either track (recommended to initially try it out on the first track since that is where the training data is from). Then run

python drive.py model.json

Your car should begin driving itself around the track! If the car ever gets out of control and leaves the track, you can manually drive it using the WASD keys.


  • Try changing the image quality in the settings of the simulator for higher resolution training data. Retrain the model and see how it affects the performance of the car.
  • How does the car perform on the second track? What are some differences between the two tracks and how can we overcome them in our data augmentation step?
  • What happens if a car goes out of the track? Does it know how to correct itself? How can we fix that?

Here’s a challenge

In this project we solely predicted steering angles for the car while it drove at a constant speed. Experiment with the implementation to see if you can learn to determine what throttle to give the car. Theoretically, this will help the car drive smoother around the track.

Other resources

Here are some other great blog posts in which the authors go over their approach to the Udacity Behavioral Cloning Project and explain the logic behind it.

The final project can be found on my GitHub here: https://github.com/sujithv28/Behavioral-Cloning-Hacktorial

