Detecting lanes

Subhash Gopalakrishnan
5 min read · Jan 3, 2017


This project involves tracking the lane the car is driving in by detecting and extrapolating lane markers. To simulate lane detection, we process driving footage frame by frame and mark our predictions of the lane on the video. The input frame and the transformed frame should look like this:

Before and after

Colorspace mapping

Since we are only trying to detect linear features and don’t want to be distracted by colour, it makes sense to shift to a colour space that is unidimensional. Grayscale is an obvious choice, but it has a bias towards white. For example, a yellow lane marker might have a colour value of `RGB(255,255,0)`. When converted to grayscale, we get a weighted average of the RGB values, which is still less than that of white (`255`). Converting to HSV gives us an alternative, since it separates hue, saturation and value and allows us to focus on the last dimension. To demonstrate, here is an actual image converted to HSV and grayscale. You can see that even with yellow lines, we get clearer edges with HSV than with grayscale.

Original — Grayscale — HSV
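
As a quick check, here is a minimal sketch with OpenCV comparing how a pure yellow pixel fares in grayscale versus the HSV value channel:

import cv2
import numpy as np

yellow = np.uint8([[[0, 255, 255]]])  # a lone yellow pixel, in BGR order
gray = cv2.cvtColor(yellow, cv2.COLOR_BGR2GRAY)
hsv = cv2.cvtColor(yellow, cv2.COLOR_BGR2HSV)
print(gray[0, 0])    # 226: the weighted RGB average is darker than white
print(hsv[0, 0, 2])  # 255: the value channel treats yellow as bright as white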

Gaussian blur

Another aspect that can obscure real features is noise. A bright mark on the road or pavement could be mistaken for part of a lane. To prevent such noise from affecting our algorithm, we smooth or blur the image by subjecting each pixel to a Gaussian kernel that gives maximal weight to the pixel itself and decreasing weights to its neighbours.

Before and after blurring
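
With OpenCV this is a single call, applied here to the single-channel image from the previous step. The 5×5 kernel size is an illustrative choice, not necessarily the project’s tuned value:

import cv2

# A 5x5 Gaussian kernel; larger kernels smooth more aggressively
blurred = cv2.GaussianBlur(value_channel, (5, 5), 0)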

Canny edge detection

Canny edge detection is a popular algorithm for finding the edges or borders of features in an image. It works by calculating gradients (differences in value between adjacent pixels) and selecting only the locally maximal pixels as edge pixels, so the resultant edges are always “thin”. The algorithm is parameterised by two threshold values: any edge with an intensity gradient below the low threshold is rejected, and any edge with a gradient above the high threshold is selected. Edges with intermediate values are selected only if they are connected to “strong” edges (those above the high threshold). Small patches of pixel noise are automatically rejected, so the algorithm favours long lines.
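
In OpenCV this is again one call; the 50/150 thresholds below are illustrative (a high-to-low ratio of 2:1 to 3:1 is commonly recommended):

import cv2

# Reject gradients below 50, accept above 150, and keep in-between
# pixels only when they connect to strong edges
edges = cv2.Canny(blurred, 50, 150)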

Region masking

We can focus on our region of interest by masking out the uninteresting portions. From the camera’s perspective, we only need to clip a trapezium from the bottom of the frame to focus on the lane lines. The following code calculates the required vertices based on the size of the image.

import numpy as np
maxy, maxx = image.shape[:2]   # frame height and width
offx = maxx // 14              # inset from the left and right edges
marx = offx // 4               # half-width of the trapezium's top edge
offy = int(0.55 * maxy)        # height of the trapezium's top edge
vertices = np.array([[(offx, maxy), (maxx - offx, maxy),
                      (maxx // 2 + marx, offy), (maxx // 2 - marx, offy)]],
                    dtype=np.int32)
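
Applying the mask is then a matter of painting the trapezium onto a blank image and intersecting it with the edge image; a minimal sketch with OpenCV (the variable names are illustrative):

import cv2

mask = np.zeros_like(edges)                  # blank image, same size as the edges
cv2.fillPoly(mask, vertices, 255)            # paint the trapezium white
masked_edges = cv2.bitwise_and(edges, mask)  # keep edges only inside the region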

Hough transform

What remains is to discover lines in the edge pixels. Before attempting this, we need to rethink a point in terms of all the lines that can possibly run through it. Two points will then each have their own set of possible lines, with one common line that runs through both. If we could plot the line-possibilities of these two points, both points would “vote” for the line that passes through them both.

In order to plot the line-possibilities, we choose “Hough space”, a plane whose axes are the line parameters: rho (the distance of the normal from the origin to the line) and theta (the angle the normal makes with the x-axis). Plotting a point then means plotting all the possible lines through it, which traces a sinusoidal curve in the Hough space. When two points are collinear, their sine curves intersect in the Hough space. Therefore, the problem of finding a line through points transforms into a problem of finding intersections in the Hough space.
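
In this parameterisation, a line satisfies rho = x·cos(theta) + y·sin(theta), so a fixed point (x, y) traces a sinusoid as theta varies. A small sketch, using two hypothetical collinear points, shows their curves meeting at the shared line:

import numpy as np

thetas = np.deg2rad(np.arange(180))
# Sinusoids for two collinear points, (1, 1) and (2, 2), on the line y = x
rho1 = 1 * np.cos(thetas) + 1 * np.sin(thetas)
rho2 = 2 * np.cos(thetas) + 2 * np.sin(thetas)
# The curves intersect where both points vote for the same line:
# theta = 135 degrees and rho = 0, i.e. the line y = x through the origin
closest = np.argmin(np.abs(rho1 - rho2))
print(np.rad2deg(thetas[closest]), rho1[closest])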

As parameters to the algorithm, we provide the resolution of the search through the space (step sizes for rho and theta) and the minimum number of “votes” (points on the line) that qualify a line to be chosen. The probabilistic variant of the algorithm examines only a random subset of the edge pixels, which reduces the search space. By providing a maximum line gap and a minimum line length, we can tune the algorithm towards longer lines. As a result, we get the end points of the detected lines.
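
With OpenCV’s probabilistic implementation, these parameters map directly onto `cv2.HoughLinesP`; the values below are illustrative rather than the project’s tuned settings:

import cv2
import numpy as np

lines = cv2.HoughLinesP(masked_edges,
                        rho=1,              # accumulator resolution in pixels
                        theta=np.pi / 180,  # accumulator resolution in radians
                        threshold=20,       # minimum votes to accept a line
                        minLineLength=20,   # reject segments shorter than this
                        maxLineGap=100)     # bridge gaps smaller than this
# Each entry holds the end points of one detected segment: [[x1, y1, x2, y2]]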

Consolidation

  • We know that the lane lines are tilted in a certain way, so we can reject outliers with very small or very large slopes.
  • We segregate the lines into line-groups based on their parameters (slope, intercept), with a small tolerance for the comparison.
  • Next, we partition the line-groups into positive-slope and negative-slope sets to represent the right and left lanes.
  • From each partition, we pick the line-group with the maximum cumulative length. We end up with one group of lines for the left lane and another for the right lane.
  • We take their weighted averages to return the left and right lines (a simplified sketch follows this list).
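
Here is that sketch, assuming the segment array from `cv2.HoughLinesP` above. The slope bounds are illustrative, and the line-grouping step is collapsed into a single length-weighted average per side:

import numpy as np

def consolidate(lines):
    left, right = [], []
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        if x1 == x2:
            continue                          # ignore vertical segments
        slope = (y2 - y1) / (x2 - x1)
        if not (0.4 < abs(slope) < 2.0):      # reject outlier slopes
            continue
        length = np.hypot(x2 - x1, y2 - y1)
        # In image coordinates y grows downward, so the left lane slopes negative
        (left if slope < 0 else right).append((slope, y1 - slope * x1, length))

    def weighted_average(group):
        if not group:
            return None
        slopes, intercepts, lengths = map(np.array, zip(*group))
        weights = lengths / lengths.sum()     # longer lines count for more
        return (slopes * weights).sum(), (intercepts * weights).sum()

    return weighted_average(left), weighted_average(right)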

Extrapolation

Using the line parameters, we extrapolate lines from the bottom of the frame to near the horizon.
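
A sketch, assuming a (slope, intercept) pair from the consolidation step and the maxy/offy values from the region mask:

def extrapolate(slope, intercept, maxy, offy):
    # Solve x = (y - intercept) / slope at the bottom of the frame (y = maxy)
    # and near the horizon (y = offy)
    x_bottom = int((maxy - intercept) / slope)
    x_top = int((offy - intercept) / slope)
    return (x_bottom, maxy), (x_top, offy)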

Bonus: Finding lanes on “real” roads :-)

Considerations

Linear regression:

This was considered and rejected in favour of weighted averages, since I wanted to factor in the line lengths.

Temporal smoothing:

The resultant video is jittery, but I was hesitant to smoothen between frames because, to me, it seems to be a premature optimization. My instinct is to preserve “reality” until the stage where decisions have to be made. For example, it seems more prudent to smoothen the steering action later in the pipeline.

Absent lanes:

There are a couple of frames in the challenge video where my code does not find any right lane. I am guessing that we could reuse predictions from previous frames or extrapolate a lane based on the slope relationship between the two lanes. In both cases, it would require global information across frames, so I didn’t venture into fixing this.

source: https://github.com/subhash/CarND-LaneLines-P1
