How do self driving cars drive? Part 1: Lane keeping assist

Eder Santana
6 min read · Jun 4, 2017


You’re sold on self driving cars (SDC), right? You’ve read so much about the future of mobility and how disruptive self driving cars can be. You even keep coming back to this blog… Now, let’s try to work out some first principles about how they work and hopefully see what we can do to contribute to self driving progress.

Putting it simply, a self driving car is about getting a 4,000 pound, four wheeled robot to follow a path. That path can be calculated globally, say going from A to B like a taxi, and/or locally, say going forward on a highway without departing the lane. In the process, the car should also adapt its speed so it doesn’t hit objects and plan its path to avoid obstacles.

Now, let us assume the simplest case possible. We have a car on a highway without any other vehicles around and we want it to go forward, keeping itself in the center of the lane. We will discuss more general cases in other posts or something… For now, see Figure 1.

Figure 1. Highway lane keeping assist. We want the car to drive itself in the center of the lane.

If you asked me, a human, how I’d solve the problem of keeping a car in the center of a lane, I’d say I would check how far I am from the left and right lane lines. If I’m too close to the left line, I’d steer the vehicle to the right, and vice-versa. I’d use my eyes and spatial intelligence to estimate how far each line is. Same thing for speed: if the car is too slow I’d press the gas, if it’s too fast I’d press the brakes. How much force or torque to put on the pedals and the steering wheel is something I learned heuristically since I was 18, back in Brazil. Let us forget about manual gears for now, since millennials don’t know about them anyways…

The intuition above involved actuators (steering wheel and pedals), state measurement (how far I am from each lane and my current speed) and the knowledge of a desired state (keep myself in the middle of the lane as best as I can while following the speed limits). All that is represented in the block diagram of Figure 2.

Figure 2. Block diagram of the closed loop controls for self driving cars.
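To make Figure 2 concrete, here is a minimal sketch of that loop in Python. Everything in it is a hypothetical placeholder (the `car` object, its methods, the gains); it exists only to show the measure → control → actuate structure:

```python
from time import sleep

# Minimal sketch of the closed loop in Figure 2. The `car` object and all
# of its methods are hypothetical placeholders for real sensors/actuators.

def control_loop(car, dt=0.01):
    desired_offset = 0.0   # desired state: centered in the lane (meters)
    desired_speed = 29.0   # desired state: ~65 mph, in m/s

    while True:
        # State measurement: where are we and how fast are we going?
        offset = car.measure_lane_offset()  # meters left/right of center
        speed = car.measure_speed()         # m/s, from the car internals

        # Control: simple proportional gains, just to show the structure.
        steer = -0.5 * (offset - desired_offset)   # steer back toward center
        throttle = 0.1 * (desired_speed - speed)   # speed up if too slow

        # Actuation: send the commands to steering wheel and pedals.
        car.apply_steering(steer)
        car.apply_throttle(throttle)
        sleep(dt)  # run at 100 Hz
```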

Assume next that actuation is granted because we hacked our cars to steer them programmatically. The problem reduces to measuring where the car is in the lane and controlling it to the desired state.

Sensors: getting the car to find where it is

There are several ways to calculate how far the car is from each lane line. If we want to use computer vision, we can place cameras pointing forward or sideways to see the road. Classic computer vision uses edge detectors to estimate the lanes. We can also use deep learning to find the lanes — to train that, we need ground truth positions that come either from hand labelling (not scalable) or from other sensors like LiDARs (better). LiDARs themselves are pretty good at finding lane lines because lane paint reflects light differently from asphalt (surprise!).
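Here is roughly what the classic edge-detector approach looks like with OpenCV. This is a sketch only: the thresholds and the region of interest are guesses that would need tuning for a real camera setup.

```python
import cv2
import numpy as np

# Classic computer vision lane detector: edge detection + Hough transform.

def detect_lane_lines(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)

    # Keep only the trapezoid in front of the car where lanes should be.
    h, w = edges.shape
    roi = np.zeros_like(edges)
    polygon = np.array([[(0, h), (w // 2 - 50, h // 2),
                         (w // 2 + 50, h // 2), (w, h)]], dtype=np.int32)
    cv2.fillPoly(roi, polygon, 255)
    masked = cv2.bitwise_and(edges, roi)

    # Fit line segments to the remaining edge pixels.
    lines = cv2.HoughLinesP(masked, rho=1, theta=np.pi / 180,
                            threshold=50, minLineLength=40, maxLineGap=20)
    return lines  # each line is (x1, y1, x2, y2)
```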

Once we segment out the lanes in the image, we can use geometry to calculate their distance to the car. Assuming the cameras are mounted rigidly on the car, we can figure out a pixel-to-inches transformation. I may write later about computer vision and geometry for SDC, but if you are really curious already, check out this book. In a paper by Ford, they used sideways cameras and calculated the position of the lanes using deep learning and hand labeled images.
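As a back-of-the-envelope version of that geometry (in meters rather than inches): if we can see both lane lines and assume a standard lane width, about 3.7 m on US highways, the conversion is one division. A real system would calibrate the camera properly instead.

```python
# Rough pixel-to-meters conversion, assuming a fixed, centered camera and
# that both lane lines are visible at the bottom of the image.

LANE_WIDTH_M = 3.7  # typical US highway lane width

def lane_offset_meters(left_x, right_x, image_width):
    """left_x, right_x: pixel columns of the lane lines at the image bottom."""
    meters_per_pixel = LANE_WIDTH_M / (right_x - left_x)
    lane_center_px = (left_x + right_x) / 2.0
    camera_center_px = image_width / 2.0  # assumes camera centered on the car
    # Positive offset: the car sits to the right of the lane center.
    return (camera_center_px - lane_center_px) * meters_per_pixel
```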

Speed is a bit easier: its measurement is already provided by the car’s internals.

Controls: actually self driving

Once we know where the car is on the road and how fast it is going, we have to control it to the desired position and speed. Another simplification is to divide the problem in two: longitudinal control (how much throttle and brake) and lateral control (how much torque on the steering wheel). That breakdown will not always work, but we can usually get away with it on a highway, where curves are gentle.

The simplest controls method is the Proportional Integral Derivative (PID) controller. Note that there are several other ways to do controls, but we will stick to this one for simplicity. To understand how PIDs work, save me some words and watch the (very good) video below:

Video 1. PID controls for self driving cars

If you watched the video, you know that there are 3 parameters to be defined for each controller. In a perfect world, we would know exactly how each actuation command moves the car. In other words, we would know the function that takes steering wheel angles and speed as inputs and outputs the car’s movement. In real life, we don’t know that function, and modelling car dynamics is kinda hard. See this book if your motto is “Physics or it didn’t happen.”

We need that vehicle dynamics function to define the best PID gains. Since the model of the car might not be perfect, in practice engineers usually hand tune the PID until the car drives well enough. If the PID, or whatever control strategy steers the car, is not well tuned, we may get a system that wobbles the car between the lane lines.
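A textbook PID is only a few lines of code; the hard part is exactly that tuning. Here is a minimal sketch in Python (the gains at the bottom are made up, not values that would work on a real car):

```python
# Textbook PID controller, as in Video 1. The gains kp, ki, kd are the
# three parameters that have to be tuned for the vehicle.

class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        derivative = 0.0
        if self.prev_error is not None:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# For lane keeping, the error is the lateral offset from the lane center:
lateral_pid = PID(kp=0.2, ki=0.001, kd=0.05)  # made-up gains, tune on the car
steer = lateral_pid.step(error=0.5, dt=0.01)  # 0.5 m off center -> steer back
```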

Machine learning can be used to estimate a good vehicle model. When we have a good vehicle model, there are other control methods that may work better, like LQR, model-predictive control, etc. Check out what the Stanford helicopter project did, for instance.
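For illustration, here is what an LQR looks like once you do have a linear model. The model below is a toy double integrator for the lateral offset, not a real car, and the cost matrices are arbitrary:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Toy discrete-time LQR. The (A, B) model is a double integrator on the
# lateral offset, standing in for a real vehicle dynamics model.

dt = 0.01
A = np.array([[1.0, dt], [0.0, 1.0]])  # state: [lateral offset, lateral speed]
B = np.array([[0.0], [dt]])            # input: lateral acceleration from steering
Q = np.diag([1.0, 0.1])                # penalize offset more than lateral speed
R = np.array([[0.5]])                  # penalize aggressive steering

P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # optimal feedback gain

x = np.array([[0.5], [0.0]])           # start 0.5 m off the lane center
u = -K @ x                             # command that drives the state to zero
```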

A note about end-to-end self driving cars

In the context of SDCs, end-to-end driving refers to using deep learning to take a camera feed as input and output the driving decisions a human would make. That doesn’t work well yet. Yes, we can learn to regress images to steering wheel angles using deep learning. That doesn’t mean, for instance, that the deep neural network learned the physics of the problem. It is an innovative strategy and kinda works for defining steering commands that go to the PID loops as the desired path instead of the center of the lane. This is particularly useful when there are no lane lines to be detected. But neural nets only generalize well when the test data follows a distribution similar to the training data. For example, neural nets have a hard time recovering position once the car starts departing the lane. Think about it: we don’t usually collect self driving car data by falling asleep and waking up to recover the right position, nor do we drive recklessly on public roads to get the car into weird positions for training purposes.
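The supervised learning part really is as simple as it sounds. Here is a minimal Keras sketch of an image-to-steering-angle regressor; the layer sizes are illustrative, not taken from any particular paper:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Minimal sketch of an end-to-end steering network: camera frames in,
# one steering wheel angle out, trained with plain mean squared error.

def build_steering_model(input_shape=(160, 320, 3)):
    model = tf.keras.Sequential([
        # expects images already scaled to [0, 1]
        layers.Conv2D(24, 5, strides=2, activation="relu",
                      input_shape=input_shape),
        layers.Conv2D(36, 5, strides=2, activation="relu"),
        layers.Conv2D(48, 5, strides=2, activation="relu"),
        layers.Conv2D(64, 3, activation="relu"),
        layers.Flatten(),
        layers.Dense(100, activation="relu"),
        layers.Dense(1),  # the regressed steering wheel angle
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```

Training this on (image, angle) pairs logged from human driving is exactly the static-dataset setup criticized below.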

I think that learning the rules of physics from video is one of the most important lines of machine learning research right now. But there is only so much we can learn from a static dataset like the ones used to train end-to-end self driving cars. I argued before that something like OpenAI Roboschool would be a much smarter and more principled approach.

Open problems

If you were paying attention until now, you saw how many simplifications we had to make: highway only, separate longitudinal and lateral controls, an imperfect vehicle dynamics model, simple PID controls, ground truth needed to find lanes in the road, etc.

Out of those problems, the ones regarded as the hardest are the perception problems, i.e. finding the lanes and drivable paths. That is fair to say, since engineers have known controls well enough for quite some time now. If machine learning/computer vision people can come up with good estimates of the vehicle’s state and surroundings, controls people are competent enough to know exactly what to do next. This helps explain why most startups in SDC pitch themselves as localization or perception companies.

That said, the way we discussed SDC in this post will only help solve the simplified highway problem. The full story will require dynamic path planning and handling other vehicles and people in the streets. But let’s leave further discussions to future posts.

Acknowledgments

Thanks to Sam Khalandovsky for suggestions to the draft.
