Notes on self driving cars

8 min readSep 27, 2016

software for self driving cars needs to answer two questions:

where am i?
where am i going?

The first problem of determining “where am i?” is called “localization”. this post will go into the details of the intuition behind localization, and less of the messy math details. See Thrun’s “Probabilistic Robotics” book for more detailed information.

In order for a robot to “localize”, the robot must also be “mapping” in order to know its environment, either in real-time or captured via previous precision laser scanned maps beforehand.

for robots, “localization” involves two steps, called predict and correct. the car’s onboard computers are continually running this two step cycle to accurately determine the car’s pose (location). depending on implementation, this two-step cycle runs 10–100 times a second.

the first stage of the two step cycle uses “state and control modeling” to predict motion from onboard measurements and state such as velocity, acceleration, gas and braking.

a) state and control modeling (predict step)— first, given velocity, position, odometry information from the vehicle’s state, wheel turns, and other onboard measurements, as well as controls such as gas and braking, estimate a position solely using previous state, control data and kinematics models.

this pose estimate based on state and modeling won’t be perfect. of course, in the real world, due to differences like wheel grip, tire tread, rain, car behavior, etc. estimates will likely be off.

b) sensor based pose estimate refinement (correct step).

the second step is refinement of robot pose estimates using data measured from sensors. For now, you can think of a measurement as the relative position of some known landmark such as a stop sign. A sensor measurement would be measuring that the stop sign landmark is currently 3 feet away and a heading of 10 degrees to the right. You can intuitively understand that given the map of the environment, and an assumption of where we are, we can calculate a probability of how likely we are to see that stop sign measurement (gee… the stop sign is measured to be 3 feet away! but given this pose and the map of the surroundings it says it should be 30 feet..? means that this pose is not a likely location)

Autonomous car systems heavily use multiple sensor measurements to help measure landmarks (see Stanford’s vehicle that won DARPA grand challenge), Sensors come in many forms. Some for long-distance, some for ground terrain, some using GPS, etc. For an interesting discourse on different types of sensors and how complicated it can get, read Thrun’s paper on the Stanford DARPA car. http://isl.ecst.csuchico.edu/DOCS/darpa2005/DARPA%202005%20Stanley.pdf.

The correct step processes measurements from a multitude of sensors that all help to update the likelihood of different poses of the vehicle. Using the measured landmark (say the relative location of that street sign), the car can update the relative likelihood of every robot pose and determine which ones are more or less likely based on that measurement.

So what is the exact “Data structure” that is continually being updated by this “two-step” cycle?

The robot pose belief function.

The robot pose belief is a PDF (probability density function) that efficiently answers this “where am i?” question.

Robot_Pose_Belief(position_t) = probability of the robot being at position at time t.

In other words, the robot pose belief is a function that takes as input a location and returns the probability between 0.0 and 1.0 of the robot being at that location.

There are a few reasons why a PDF is used to represent the robot pose belief.

the robot pose belief is complete. This is also known as the markov assumption, where the current robot pose belief PDF encodes all past information that has been observed up to the current time. in this model of robot pose, future robot pose beliefs can be wholly calculated from the current robot pose belief and the control information and sensor readings to transition to the next state. no prior information from sensor reading at previous times or previous robot pose beliefs are needed.
the reason 1 is important is because completeness of the robot pose belief function means it’s memory space relative to the time and number of measurements recorded over time is constant. by 1), measurements and probabilities preceding the current time do not need to be stored, giving the Robot Pose Belief function a memory requirement that scales only to the size of the robot’s possible poses in the environment. put another way, the robot pose belief function is the only memory that is used to remember what is going on. without this assumption, every sensor reading and intermediate state would have to be stored, quickly becoming intractable. the markov assumption makes the localization algorithm tractable.
ability to test multiple hypotheses locations — a probability that evaluates across multiple robot poses allows the PDF to “remember” sensor information that biases certain locations, even if they are not ultimately the location that was found to be “most likely” at a point in time. In the face of transient uncertainties in sensor measurements, probabilities and uncertainty allows all positions to be evaluated and biased for future updates, even if they are unlikely at the current time.
using bayesian filtering (explained in detail later), multiple sensors can easily be used as inputs into the robot pose belief. as part of step 2 in the process, we would like to be able to include multiple sensors and their measurements into the robot pose belief in each cycle. for instance, there may be LIDAR sensors on top of the vehicle, but there usually are also GPS, sonar and regular cameras to help with detecting things like further distances and color. Bayesian filtering allows us to efficiently incorporate different sensor readings into a robot pose belief.

Bayesian filtering

We’ve basically described Bayesian Filtering above, but it is important to formalize Bayesian filtering, how it works and specific methods of Bayesian filtering. Different types of Bayesian filtering are used for different problem domains, and multiple types of bayesian filtering can even be used within the same scene. For instance, Google uses Kalman Filters to localize other vehicles on the road, while using Particle filters for localizing the vehicle itself, simultaneously in the same scene.

Regardless of the details of implementation, how the system is modeled, etc., each of these types of Bayesian filters uses a two step cycle of predict and update to recursively maintain the probability of outcomes.

Bayes Rule

Bayes rule is critical to understanding Bayesian filtering.

Bayes Rule:

P(event X | Data A) = P( Data A | event X ) * P(event X) / P(Data(A))

We are going to use two examples to illustrate Bayes Rule, A patient’s probability of having measles given data from a Measles Lab Test, and a Self Driving Car probability of having a specific robot pose given data from a sensor measurement.

In the Bayes Rule formula above, event X is a random variable. This means that you are solving for the probability of a specific hypothesis (outcome). In fact, in most of the applications of Bayes Rule you will be using Bayes Rule to solve for probabilities of different hypotheses. In the self driving car example, you will use Bayes Rule once for each event, where the event is that the robot has a specific pose. These poses means you will have a different Bayes Rule calculation for each robot pose, [0,0,0], [1,0,0], [2,0,0], [3,0,0] and so forth, where [x,y,z] are locations in the X,Y,Z axes in the current environment.

Bayes rule gives us a calculation that allows us to, given a prior probability of event X occurring, update the probability of an event X given some new data (Data A in the formula)

With Bayes Rule, we always have a prior belief probability, P(event X) in the equation. This is the probability of an event happening, either a patient in the general population having measles, or the robot having a specific pose location, before incorporating new data.

The term P(Data A | event X) describes the probability of, given that we assume that event X has happened, what is the probability of measuring Data A?

In the measles example, this is the probability of having a positive result given that the patient has measles.

In the below example, event A will be a patient having measles.

Let’s say the probability of that patient having measles in the general population is 1%.

P(event A ) = 1%. This is also called the “prior”, because it is the prior belief before any data has been incorporated.

Now, let’s say that the probability of having a positive test result given a patient definitely has measles is 80%.

Given this, we can measure the likelihood of P(having measles | positive test result). Notice a smaller prior P(having measles in the general population) affects the P(measles | positive test result) proportionally. Notice a lower probability of having a positive result in the generation population makes the probability of having measles given a positive result a higher probability. Both of these intuitively make sense!

Bayes Rule in the context of self driving cars:

P(robot location | Sensor Measurement ) = (P(sensor measurement | robot location) * P(robot having robot location previously)) / P (Sensor measurement).

Important Note: Again, to repeat what was said above, the robot location in the Bayes Rule equation is a specific robot location… In the actual implementation, many Bayes rule values are calculated, one for each location.

This concept of Bayesian filtering shows up in multiple forms:

Discrete Bayes Filter:

The simplest version of bayesian filtering is a discrete bayesian filtering. When the domain of events are discrete quantized positions, one can use a discrete bayesian filter to recursively update probabilities of location of the robot.

Going back to our original description of bayesian filtering, the first step is the predict step, that models the transition of states from time t to t + 1. For the predict step of the discrete Bayes Filter, it requires a special operation amongst discrete locations, called a convolution. A convolution is a summing of the the joint probabilities of the possible previous locations and the probability of landing at the current position.

The values are then updated with the probability of seeing the measurement given each individual position.

After this calculation, the most likely location is the one with highest probability. This calculation repeats endlessly.

The discrete Bayes filter is the simplest type of filter, but it clearly illustrates bayesian filtering and how probabilities can be used to incorporate sensor measurements and predictions.

Other types of filters:

A walkthrough of other types of bayesian filters. You can investigate these in more detail. All of these are used in self driving cars.

Kalman Filter — a continuous filter that uses gaussians to model linear models. Used to model motion of other vehicles on the road for instance. The correct step of the Bayes Filter, simply updates each location with the sensor measurement. The math for this is pretty hairy (in fact I still don’t fully understand it) but there is a value called a Kalman Gain which is calculated from the standard deviations of measurements to determine on a 0.0 to 1.0 scale how much to bake into each correct step the new sensor reading.

Extended Kalman Filter — a kalman filter that “approximates” a nonlinear system by linearizing an approximated model at the most likely location. In multidimensional systems this is done via a Jacobian matrix.

Particle Filter — discrete filters have the problem of wasting resources on areas of the hypothesis space that are unlikely. A car might not be in some part of the area, but a discrete filter would evaluate the likelihood of poses in that area regardless. Continuous filters have the issue that it can be difficult to create closed form solutions for nonlinear real world systems, and can be computationally expensive. particle filters are effectively “discrete kalman filters”, except that at the end of each cycle, the next locations that are calculated for the next cycle are “resampled”, thus creating new “particles”. over time, the most likely particles will survive, in areas of the hypothesis space that are most likely. this is an example of monte carlo methods and bayesian filtering. Particle Filters are generally what is used in most implementations, because one can easily modify the number of particles in order to fit the memory and computation requirements of the hardware.