Advanced Lane Finding

Ed Voas
Aug 7, 2017 · 13 min read

I recently completed the Udacity Self-Driving Car program’s fourth project, where we draw lane lines over a video that is provided to us.

I won’t lie, this was the toughest project of the course so far for me. I’ll get into why in a bit, but I thought I’d first give you my TL;DR takeaways:

  • Computer Vision is a Dark Art
  • Jupyter Notebooks have a purpose, but they shouldn’t be used for your real code
  • Linters are awesome

When I first started this project, I thought it would be fairly straightforward. I was wrong. There are two areas where you really need to do a lot of thinking.

  1. How the heck do we process the image to reduce it to the lane lines for detection?
  2. How do we fall back and deal with bad frames?

But let’s start at the beginning. First, the goals of this project were to:

  • calibrate for camera distortion
  • clean up/process the image so we can isolate the lines we care about
  • fit a polynomial over the detected lines
  • use said poly to determine the curvature of the road and the offset of the vehicle in the lane
  • draw an overlay of the lane, along with the info above, as you see in the picture at the top of this post.

Calibration

The first thing we had to do was calibrate the camera. Every camera lens I am aware of has some level of distortion, but in particular dash cams are pretty distorted due to the wide field-of-view they capture. In order to ensure that we are computing the road curvature properly, we need to undo this distortion by calibrating against pictures of a chessboard pattern we take with the camera.

There are basically two stages to calibration. First, we use a function in OpenCV called findChessboardCorners. We put all the calibration images through this function, giving it our expected number of inner corners. For each image where it succeeds, we store the corner points it returns along with a matching set of idealized object points. With these in hand, we finally call calibrateCamera in OpenCV and get the camera matrix and distortion coefficients we will need later to call cv2.undistort and straighten images.
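
In rough sketch form (assuming a 9x6 inner-corner chessboard and a camera_cal/ folder of calibration shots, which may differ from your setup), the calibration loop looks something like this:

import glob
import cv2
import numpy as np

nx, ny = 9, 6  # assumed inner-corner count; check your own chessboard images
objp = np.zeros((nx * ny, 3), np.float32)
objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)  # the idealized corner grid

objpoints, imgpoints = [], []
for fname in glob.glob('camera_cal/calibration*.jpg'):  # hypothetical path
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (nx, ny), None)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

# One calibration over all the images; mtx and dist get reused for every frame.
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)
undistorted = cv2.undistort(img, mtx, dist, None, mtx)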

Creating a Binary Image

The next step is where a lot of the dark magic comes in. In order to do detection, we want to create an image which consists of only the lane lines. This image is a black and white image, where the white parts are the lines.

The big problem here is how to do it. We were shown several techniques: Sobel filtering using absolute value, magnitude and direction, as well as color thresholding.

The problem here is that once you get something that seems to work, you’ll encounter a part of the video where it just falls apart. Either it will be in shadow, or the road surface is really light and contrast drops sharply. Then you’ll fix that and some other thing will stop working. It was a perpetual game of whack-a-mole.

To experiment, I used a Jupyter (iPython) notebook. Eventually, I tired of blindly changing values and re-running the code cell and started to use iPython Widgets. These are super helpful for this sort of thing where you are trying to adjust values and see the result. I highly recommend them. I ended up setting up some sliders to play with the values.

I tried so many permutations I lost count. At one point I thought I had a solution in using Sobel to detect lines coupled with color thresholding by AND-ing them. Brilliant! Only it didn’t work. I’d encounter yet another place where it fell apart.

One of the issues with Sobel detection is that it works on a grayscale image. If you convert an image of a light road to grayscale, the yellow line basically disappears, so Sobel picks up nothing. That meant I had to use color thresholding for yellow lines and OR that into my resulting image.

Turns out even that is hard. I ended up using a combination of RGB thresholding and some HSV thresholding. In the end this detection worked quite well. Of course, there’s more to life than yellow lines.

For white lines, I mostly relied on RGB thresholding, but I also tried the luminance and red channels. The L and R channels work really well, but when the road gets light they can really blow out, so in the end I went with a blend of my yellow-line magic above, a simple white-line threshold in HSV, and the red-channel threshold. It worked well enough for the main project video and the challenge video. I did not try the harder challenge video with that version of my code.
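
As a rough sketch of that kind of blend (the HSV ranges and red-channel cutoff below are illustrative placeholders, not the exact values I ended up shipping):

import cv2
import numpy as np

def binary_lanes(img):
    # img is a BGR road frame. All thresholds here are placeholders to tune.
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    yellow = cv2.inRange(hsv, (15, 80, 120), (35, 255, 255))  # yellow paint
    white = cv2.inRange(hsv, (0, 0, 200), (255, 40, 255))     # bright, low saturation
    red = (img[:, :, 2] > 210).astype(np.uint8) * 255         # red channel in BGR
    combined = yellow | white | red
    return (combined > 0).astype(np.uint8)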

Go To Warp!

Next up is warping the image. Basically, we want to take the image and project it so that it looks like a top-down view of the road. It’s ultimately the same trick that cars today use for a 360 view.

This helps us determine the lane curvature later.

In my code, I actually do the warp before I do the color thresholding. I think you can do it either way, however.

We compute the transform matrix with cv2.getPerspectiveTransform() and apply it with cv2.warpPerspective(). The source is a trapezoid covering the lane, and the result is the rectangle you see above on the right.
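
A minimal version of the warp looks like this. The source and destination points are made up for illustration; the real trapezoid is hand-tuned for your camera and mount.

import cv2
import numpy as np

img = cv2.imread('test.jpg')  # any undistorted 1280x720 road frame (path is hypothetical)

# Illustrative points only; the real ones are picked by eye from a straight-road frame.
src = np.float32([[580, 460], [700, 460], [1090, 720], [200, 720]])
dst = np.float32([[300, 0], [980, 0], [980, 720], [300, 720]])

M = cv2.getPerspectiveTransform(src, dst)
Minv = cv2.getPerspectiveTransform(dst, src)  # used later to unwarp the overlay
warped = cv2.warpPerspective(img, M, (1280, 720), flags=cv2.INTER_LINEAR)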

Detection

Next, we actually do our line detection. I ended up using the first method given to us in class, where we gather the indices of nonzero pixels (via numpy’s nonzero) in our prepared binary image. I tried the variant that uses convolutions, but that didn’t seem to work quite as well for me. At one point, I also had a very complicated scanning system in place, but in the end I decided to go back to the first version they gave us just for simplicity.

The method starts by taking a histogram of the lower half of the image to find where the lane-line pixels are concentrated. We use those peaks as starting points and scan upward with a sliding-window technique, shifting each window left or right based on the mean position of the pixels it detects. This works well for most simple highway driving, but on twisty roads it can easily fall apart, which is why I ended up devising my complicated scanner. As mentioned, though, in the end it was more important to get the main project video working and satisfy the requirements of the project, so I reverted to this simpler method.
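
In sketch form, the sliding-window search looks roughly like this (the window count, margin, and minimum-pixel values are the usual classroom defaults, not necessarily what you should use):

import numpy as np

def sliding_window_fit(binary_warped, nwindows=9, margin=100, minpix=50):
    # Histogram of the lower half locates a starting x for each line.
    histogram = np.sum(binary_warped[binary_warped.shape[0] // 2:, :], axis=0)
    midpoint = histogram.shape[0] // 2
    leftx_cur = np.argmax(histogram[:midpoint])
    rightx_cur = np.argmax(histogram[midpoint:]) + midpoint

    nonzeroy, nonzerox = binary_warped.nonzero()
    window_height = binary_warped.shape[0] // nwindows
    left_inds, right_inds = [], []

    for w in range(nwindows):
        y_low = binary_warped.shape[0] - (w + 1) * window_height
        y_high = binary_warped.shape[0] - w * window_height
        good_left = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                     (nonzerox >= leftx_cur - margin) & (nonzerox < leftx_cur + margin)).nonzero()[0]
        good_right = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                      (nonzerox >= rightx_cur - margin) & (nonzerox < rightx_cur + margin)).nonzero()[0]
        left_inds.append(good_left)
        right_inds.append(good_right)
        # Re-center the next window on the mean x of the pixels we just found.
        if len(good_left) > minpix:
            leftx_cur = int(nonzerox[good_left].mean())
        if len(good_right) > minpix:
            rightx_cur = int(nonzerox[good_right].mean())

    left_inds = np.concatenate(left_inds)
    right_inds = np.concatenate(right_inds)
    # Fit x as a function of y, since the lines are close to vertical.
    left_fit = np.polyfit(nonzeroy[left_inds], nonzerox[left_inds], 2)
    right_fit = np.polyfit(nonzeroy[right_inds], nonzerox[right_inds], 2)
    return left_fit, right_fit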

After detection you end up with an image like the one above. You can see the rectangles where we searched, as well as the pixels we decided were part of the left and right lines (in red and blue, respectively). As long as your image is not very noisy, this works well, which is why the thresholding/image prep is so important.

You might also notice a second line drawn over the right lane line. I draw the currently detected line in yellow, as well as the average of the last 5 successful detections. That averaged line is what we use to draw the overlay on the video, so we get a bit of smoothing to help out when detection goes a little sideways (literally, at times). I settled on 5 as the memory size because anything more starts to look visibly laggy.
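
The smoothing itself is nothing fancy. Something like the sketch below captures the idea: keep a short deque of accepted fits and average the coefficients. (This is a simplified stand-in for the Line object in my code, not its real interface.)

from collections import deque
import numpy as np

class Line:
    # Keeps the last few accepted polynomial fits and exposes their average.
    def __init__(self, memory=5):
        self.recent_fits = deque(maxlen=memory)

    def add_fit(self, fit):
        self.recent_fits.append(fit)

    @property
    def best_fit(self):
        return np.mean(self.recent_fits, axis=0)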

The Complicated Variant

I thought I’d go into my more complicated method of scanning for lines, just to give you ideas for your own work. Keep in mind that I abandoned this, but I think it has some good ideas that I might want to work back in someday.

To start, it used smaller windows than in the image above; I think they were half as wide. I would start at the center of the bottom and scan left or right until I found something. If I didn’t, I’d move up and start scanning from the center again. This is necessary when you have dashed lines and there’s nothing at the bottom of the image. I should point out that in this system I did not use a histogram to find the lines, because in the harder challenge video the lines curved so much that it became less and less possible to rely on the histogram.

Once I latched onto a line, I would compute the polynomial fit right away and then use that fit to control how I moved. I would also start with the simplest polynomial the points allowed (a line, if possible) and had a method to choose the best fit: try something more complex and see if the fit improved substantially, and if not, keep the simpler one. The problem was that occasionally the pixels would cause the fit to shoot off in some crazy direction. So I added the notion of looking at where (if at all) the line intersected the rectangle I was scanning in, and I would follow either the line or the direction of that intersection, whichever produced more pixels. Then I’d refit the line and continue like this.
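
In sketch form, that degree selection boils down to comparing fit residuals and only accepting a more complex polynomial if it pays for itself. The improvement factor below is an arbitrary placeholder, not a value from my code.

import numpy as np

def choose_fit(ys, xs, max_degree=2, improvement=0.5):
    # Start with a line; move to a higher degree only if the sum of squared
    # residuals drops by a large enough factor.
    chosen, chosen_res = None, np.inf
    for deg in range(1, max_degree + 1):
        coeffs, residuals, *_ = np.polyfit(ys, xs, deg, full=True)
        res = residuals[0] if len(residuals) else 0.0
        if chosen is None or res < chosen_res * improvement:
            chosen, chosen_res = coeffs, res
        else:
            break
    return chosen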

This model actually followed the really curvy stuff in the harder challenge video quite well, but it was not perfect, and I had some bugs that kept causing infinite loops. That, plus my inability at the time to get really clean lane lines (especially on that harder video), made me rethink the complexity of what I was doing. I went back to square one, focused on getting clean lanes on the simpler videos, and returned to the simpler sliding-window technique in order to finish this project.

But I still think there’s something here that I might want to revisit someday in my copious spare time.

How To Know When It Goes Wrong

This all works great when you have nice clean lines. But that is not the case much of the time.

There are many reasons detection can go awry. You might not have enough pixels for a line to even decide to try a polynomial fit. Or it might give you some really wacky polynomial fit due to either the line being too short (maybe the frame has the end of a dashed line at the bottom and nothing above it), or too noisy (in which case the fit might get pulled in the wrong direction). How can you successfully detect these situations and combat them? I tried a lot of different methods to try to figure out when things went wrong.

The one I stuck with most of the time was using the derivative of the fitted line at a certain y coordinate. In the end I just used y = 360, the middle of the 720-pixel-tall frame. If the line diverged by more than a certain threshold, I’d declare it invalid.

I also tried using two derivatives at the same time, but I didn’t feel it helped any more. I next tried using physical offsets at the bottom and top of the frame to see if the line moved more than some threshold, but that didn’t really work all the time. It was also not a great test: if the car is going around a twisty bend, the top of the line will move pretty quickly in the direction of the curve.

I also tried about 3–4 other variants and techniques, but as mentioned, in the end I just stuck with derivatives.
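
One plausible way to implement that kind of derivative check, here written as a comparison of the new fit’s slope at mid-frame against the last accepted fit (the threshold is a placeholder, and the exact comparison in my code differed):

def slope_at(fit, y):
    # dx/dy of x = A*y^2 + B*y + C
    return 2 * fit[0] * y + fit[1]

def looks_sane(new_fit, last_fit, y=360, max_delta=0.5):
    # Reject the new fit if its slope at y=360 diverges too far from the
    # last accepted fit. max_delta is an illustrative threshold, not mine.
    if last_fit is None:
        return True
    return abs(slope_at(new_fit, y) - slope_at(last_fit, y)) < max_delta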

And once we figure out that the line can’t be detected, we just throw it out and end up rendering the last best line we had. We do this only for a certain amount of time, and then we just flat out give up.
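
The bookkeeping for that can be as simple as a counter of consecutive misses. Building on the Line sketch above (the max_misses limit here is arbitrary, not the value I used):

def fit_for_frame(line, new_fit, is_valid, max_misses=10):
    if is_valid:
        line.misses = 0
        line.add_fit(new_fit)
    else:
        line.misses = getattr(line, 'misses', 0) + 1
    if line.misses > max_misses:
        return None              # too many bad frames in a row: give up
    return line.best_fit         # otherwise fall back to the recent average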

Determining Lane Position and Curvature

The last part of this project was using all of the above to determine the current lane position (where the car is relative to the lines) and the curvature of the road.

Here’s where having a top-down view is super helpful. At this point, we already have a polynomial fit of the line and can determine the curvature of the line with the technique found in this tutorial.

We also assume the center of the car is at the center of the image. So measuring the distance from the center to each of the lines is easy enough.

But we are measuring everything in pixels and not in real-world units. So it was important to estimate the pixels-to-meters ratio. In my project, I computed it to be:

ym_per_pix = 3 / 88  # meters per pixel in y dimension
xm_per_pix = 3.7 / 630  # meters per pixel in x dimension

And I used that to scale the values I was working with in order to create the numbers you see on the image at the beginning of this page. I showed the left and right curves, but I have thought about showing the curve from the center of the car instead, which I think is more useful. Maybe someday I’ll fix that. I should also make it say left/right instead of positive/negative numbers to be clearer.
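
For reference, the standard radius-of-curvature formula for a second-order fit x = Ay² + By + C, evaluated in meters by refitting in world space, looks roughly like this (one of a couple of equivalent ways to apply those pixel-to-meter ratios):

import numpy as np

ym_per_pix = 3 / 88     # meters per pixel in y dimension
xm_per_pix = 3.7 / 630  # meters per pixel in x dimension

def curvature_m(ys, xs, y_eval=719):
    # Radius of curvature in meters at the bottom of a 720-pixel-tall frame.
    fit = np.polyfit(ys * ym_per_pix, xs * xm_per_pix, 2)
    y = y_eval * ym_per_pix
    return ((1 + (2 * fit[0] * y + fit[1]) ** 2) ** 1.5) / abs(2 * fit[0])

def center_offset_m(left_x_bottom, right_x_bottom, frame_width=1280):
    # Signed distance of the camera (assumed at image center) from lane center.
    lane_center = (left_x_bottom + right_x_bottom) / 2
    return (frame_width / 2 - lane_center) * xm_per_pix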

Use the Right Environment

I wanted to touch on the points I made at the beginning of all of this about using Jupyter Notebooks for what they are good at, and nothing more. That ties into the linting comment as well.

I started all of this work in a notebook. It’s a great way to write chunks of code and make sure that things behave as you step through what you need to do. It’s also a great place to do experimentation. For example, on the color thresholding, it’s a good way to see results immediately. Change the values, re-run the cell and see what happened.

But even that wasn’t fast enough or good enough for me, so I found iPython Widgets (ipywidgets) and used the interact function. It lets you create some widgets and have a function called each time their values change. In my case, I used sliders to change the thresholding values. I cannot tell you what a time-saver this is. Wow.
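
If you haven’t used it, interact looks roughly like this. The widget names, ranges, and the test_img variable are assumptions for illustration; you’d load the frame in an earlier cell.

import cv2
import matplotlib.pyplot as plt
from ipywidgets import interact, IntSlider

test_img = cv2.imread('frame96.jpg')  # hypothetical frame loaded for tuning

@interact(lo=IntSlider(min=0, max=255, value=120),
          hi=IntSlider(min=0, max=255, value=255))
def preview_threshold(lo, hi):
    # Re-threshold and redisplay every time a slider moves.
    gray = cv2.cvtColor(test_img, cv2.COLOR_BGR2GRAY)
    binary = ((gray >= lo) & (gray <= hi)).astype('uint8')
    plt.imshow(binary, cmap='gray')
    plt.show()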

But over time I was writing the main detection code there too, and I kept changing it, refactoring, or renaming things, and that’s where the notebook environment failed me.

Let’s say you define some variable as a global. Now you make it a local in some function, and let’s say you also rename it a bit. You might start to see some weird behavior, and then you spend a while figuring out that you are still using the old global name someplace in your new function, and the kernel still remembers that global. This happened a lot.

So eventually I gave up and just used a separate Python file. That way it’s loaded fresh each time, and you find those kinds of errors right away. If you add a linter on top of that, you should be very well off. If you are using Python 3.6, you can also take advantage of type annotations to make your code a lot more type-safe. Once I was in the Sublime Text editor, I could use the Anaconda package and its linter there, and wow, much better. I highly recommend this path: even if you don’t lint, start with your own Python file and don’t get sucked into doing it all in the notebook. It will let you keep all your hair and save you a lot of time.

My Code

The code for this is available on my GitHub.

I ended up creating an object called, surprisingly, “Processor”. It is the main driver of everything in the app. It in turn uses a couple of objects, Line and LineDetector. Line is our persistent store where we keep our latest and best fit lines, as well as some debugging information we use when we want to see what we are detecting, as I show above. LineDetector just uses the lines and does the raw detection for a frame.
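
In skeleton form (the method layout here is illustrative, not the real API), the relationship between the three looks something like this:

class Line:
    # Persistent per-line state: recent fits, best fit, and debug info.
    ...

class LineDetector:
    # Raw per-frame detection, reading from and writing into Line objects.
    ...

class Processor:
    # Drives calibration, warping, thresholding, and detection for each frame.
    def __init__(self):
        self.left, self.right = Line(), Line()
        self.detector = LineDetector()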

I actually want to refactor this, as I’m not happy with the way it is right now. I really want to make the lines be owned by the detector, and make the detector a long-lived object which simply uses the Line objects as storage. But honestly, I’m not sure when I’d be able to get to that.

You’ll also see some commented-out code I was playing around with, and I’ve currently turned off the code that tries fitting against the last known line first before falling back to the sliding-window technique.

I also have debug modes that I was using, and I would always recommend such options. I was able to process just a single image, and I was also able to render just the ‘binary’ version of the video to see how my color thresholding was working. Lastly, I had a full-debug mode that would dump the original frame as well as my detection frame, and at the same time append to a CSV file so I could graph the polynomial coefficients or derivatives if desired. I output the original frame so that if there was a problem with frame 96, I could feed it through alone and play with settings in my notebook or through the Python script.

Takeaways

I think the biggest takeaway for me on this project was to jump to real Python as quickly as possible next time. Too much time was lost in the Jupyter Notebook. At the same time, it’s fantastic to use a notebook for things like the color thresholding with help from iPython Widgets. So I learned some useful things for next time, and I plan to use this knowledge well.

I hope this page helped you in some way! See you again after Project 5 😀
