Vehicle detection

You can find my code here. It might be more up to date than the article ;).

Intro

This is the fifth article in the self driving cars series. If you want to know why I’m sharing this and more about my journey, please read this.

Goal

For each image of the given video, we are going to identify cars and draw a bounding box around them.

Vehicle detection final result

Algorithm: Step by step

This algorithm is executed step by step on each frame:

1: HOG feature extraction for learning

I first computed the HOG of many car and non-car images.

HOG, or Histogram of Oriented Gradients, is a technique that maps the locations of the gradients (changes in color or luminosity) in a picture, as well as their orientations.

car and non car image
car and non car HOG
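As a rough sketch, here is how HOG features can be computed with scikit-image's `hog` function (the random patch is a stand-in for a real training image, and the parameters match the final settings described later in the post):

```python
import numpy as np
from skimage.feature import hog

# Hypothetical 64*64 grayscale patch standing in for a training image.
patch = np.random.rand(64, 64)

# 8 orientations, 8*8 pixels per cell, 2*2 cells per block.
features = hog(patch,
               orientations=8,
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2),
               feature_vector=True)

# 64/8 = 8 cells per side -> 7*7 block positions, each 2*2 cells * 8 bins.
print(features.shape)  # (1568,)
```

The flattened vector is what gets concatenated with the other features below.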

2: Spatial binning

We can perform spatial binning on an image and still retain enough information to help in finding vehicles.

This allows us to filter out some of the noise and only retain the most important features of the car, while making the algorithm run faster as the resulting images are smaller.

Spatial binning example
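A minimal sketch of spatial binning (the helper name is illustrative; `cv2.resize` is the usual tool, but plain strided slicing keeps the example dependency-free):

```python
import numpy as np

def bin_spatial(img, size=(32, 32)):
    # Downsample by strided indexing (a nearest-neighbour resize),
    # then flatten into a 1-D feature vector.
    step_y = img.shape[0] // size[0]
    step_x = img.shape[1] // size[1]
    return img[::step_y, ::step_x].ravel()

img = np.zeros((64, 64, 3), dtype=np.uint8)  # stand-in for a 64*64 RGB patch
features = bin_spatial(img)
print(features.shape)  # (3072,) -> 32*32*3 values instead of 64*64*3
```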

3: Color channel histograms

This gives us the pixel intensity distribution of each image, allowing us to use common car colors as a feature for recognition.

color histogram example
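A quick sketch of the per-channel histograms using `np.histogram` (the helper name and bin count are illustrative):

```python
import numpy as np

def color_hist(img, nbins=32, bins_range=(0, 256)):
    # One histogram per channel, concatenated into a single feature vector.
    hists = [np.histogram(img[:, :, ch], bins=nbins, range=bins_range)[0]
             for ch in range(3)]
    return np.concatenate(hists)

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
features = color_hist(img)
print(features.shape)  # (96,) -> 32 bins * 3 channels
```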

4: Training a classifier

With these 3 features (HOG, spatial bins, color histograms) we train a classifier to separate cars from the rest.

I did the training using a pretty standard method:

  • I first standardised the computed features using a standard scaler.
  • Then I split my data into training and test sets (80/20), randomising them in the process.
  • I finally trained a standard SVC with default settings.

I also stored the results of the computed features, scaler and SVC to pickle files to avoid having to recompute them every time.
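The three training steps above can be sketched with scikit-learn (the feature matrix here is random stand-in data, and I use `LinearSVC` as one reasonable reading of "standard SVC with default settings"):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Hypothetical feature matrix: one row per image, columns being the
# concatenated HOG + spatial-bin + color-histogram features.
rng = np.random.default_rng(0)
X = rng.random((200, 100))
y = np.hstack([np.ones(100), np.zeros(100)])  # 1 = car, 0 = non-car

# 1) Standardise the computed features.
X_scaled = StandardScaler().fit_transform(X)

# 2) 80/20 split; train_test_split shuffles by default.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42)

# 3) Train a linear SVC with default settings.
clf = LinearSVC(random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

In the real pipeline the fitted scaler and classifier would then be dumped to pickle files, as mentioned above.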

5: Final choice of parameters

Using the accuracy of the classifier, I tweaked the parameters to obtain a better result.

For the HOG, changing the number of orientations, pixels per cell, and cells per block never gave significantly better results. So I mainly tweaked the color space and channels used, in particular RGB (red or all channels), HSV (every combination) and YCrCb (all channels).

YCrCb on all channels gave the best results, with the original HOG settings: 8 orientations, 8*8 pixels per cell and 2*2 cells per block.

6: Sliding window

I then used a sliding window approach to search for cars in the image.

I restricted the windows to the lower half of the image. I also added different window sizes, ranging from 64*64 to 128*128.

Based on the size of the window, I further restricted their locations. For example, 64*64 windows are only computed on the top half of the lower half of the image, since cars near the bottom of the image are close to the camera and thus appear bigger.

sliding window

Finally, I made my windows overlap a lot. We’ll see why in the next step.
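A minimal sketch of the window generation (function name and exact overlap values are illustrative, not the post's code):

```python
def slide_window(img_width, y_start, y_stop, window=64, overlap=0.75):
    # Generate (x1, y1, x2, y2) boxes covering the search band, with the
    # given fractional overlap between neighbouring windows.
    step = int(window * (1 - overlap))
    boxes = []
    for y in range(y_start, y_stop - window + 1, step):
        for x in range(0, img_width - window + 1, step):
            boxes.append((x, y, x + window, y + window))
    return boxes

# 1280*720 frame: small windows only near the horizon, larger ones
# allowed all the way down to the bottom of the frame.
small = slide_window(1280, y_start=360, y_stop=540, window=64)
large = slide_window(1280, y_start=360, y_stop=720, window=128, overlap=0.5)
```

Each box is then cropped out, resized to the training patch size, and fed to the classifier.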

7: Heatmap

Since the classifier is not 100% accurate, we need to handle false positives.

I used a simple heatmap to do that. The overlapping windows allowed me to set a higher threshold on my heatmap, which gave significantly better results: more true positives and fewer false positives.

By saving the detection results over the last five frames using a circular buffer and applying a threshold on the resulting heatmap, I get a result like this:

Final heatmap
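The heatmap idea can be sketched like this (the detection boxes are made-up examples; `scipy.ndimage.label` finds the connected blobs that become bounding boxes):

```python
import numpy as np
from scipy.ndimage import label

heat = np.zeros((720, 1280))

# Hypothetical window hits (x1, y1, x2, y2) from the sliding-window search:
# two overlapping detections on a real car, one isolated false positive.
detections = [(800, 400, 864, 464), (810, 410, 874, 474), (100, 500, 164, 564)]
for x1, y1, x2, y2 in detections:
    heat[y1:y2, x1:x2] += 1

# Keep only pixels covered by at least 2 windows, then label the blobs;
# each remaining blob becomes one bounding box.
heat[heat < 2] = 0
blobs, n_cars = label(heat)
print(n_cars)  # 1 -> the overlapping pair survives, the lone window is dropped
```

For the five-frame smoothing, the per-frame heatmaps can be kept in a circular buffer (e.g. `collections.deque(maxlen=5)`) and summed before thresholding.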

Here is the full video:

Thoughts

Obviously, what we did here is a naive implementation.

The pipeline is likely to fail with:

  • Worse weather/lighting conditions.
  • Less contrast between the cars and the road in general.
  • Roads going up/down hills.

We used classical computer vision techniques and added an SVC classifier in this project. It’s really interesting to understand and get familiar with these concepts.

But again, I’d be very interested to see how deep learning would tackle this problem.

Next time, we’ll use Kalman filters to track cars in motion using radar and lidar data, all coded in C++ for real-time execution!

Stay tuned.

Thanks to Eric Sauvegrain for his feedback on this post.