AISaturdaysOgbomosho Week 12: Autonomous driving car detection

Lautech DataScience
4 min read · Jul 25, 2018


This is a review of what we learnt today, 21st July, 2018, using Andrew Ng’s Deep Learning specialization course. To accelerate our learning, we watch the videos during the week and walk through the code while solving the programming assignments at weekends.

This week’s task was learning about object detection and building bounding boxes using the very powerful YOLO model.

YOLO (“you only look once”) is a popular algorithm that achieves high accuracy while also running in real-time. This algorithm “only looks once” at the image in the sense that it requires only one forward propagation pass through the network to make predictions. After non-max suppression, it then outputs recognized objects together with the bounding boxes.

Problem statement: Using a dataset of pictures (spanning 80 different classes) collected by a camera mounted on the front of a car, which photographs the road ahead every few seconds while driving, we walk through how YOLO is built up and then apply it to car detection.

Model details

  • The input is a batch of images of shape (m, 608, 608, 3)
  • The output is a list of bounding boxes along with the recognized classes

If the center/midpoint of an object falls into a grid cell, that grid cell is responsible for detecting that object.

Each grid cell predicts 5 anchor boxes. In total, the model predicts 19x19x5 = 1805 boxes just by looking once at the image (one forward pass through the network)!
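The box count follows directly from the grid geometry; as a quick sanity check (plain Python, not code from the assignment):

```python
# Sanity check of the box count described above.
grid_size = 19        # the 608x608 input is divided into a 19x19 grid
anchors_per_cell = 5  # each grid cell predicts 5 anchor boxes

total_boxes = grid_size * grid_size * anchors_per_cell
print(total_boxes)  # 1805
```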

Different colors denote different classes. We plotted only the boxes to which the model assigned a high probability, but this is still too many boxes. We reduce them in two steps: first by filtering, then by non-max suppression.

  1. Filtering: discard any box whose best class score falls below a chosen threshold.

  2. Non-max suppression (NMS): even after filtering by thresholding over the class scores, we still end up with overlapping boxes. NMS uses an important function called “Intersection over Union” (IoU) to keep only the highest-scoring box among heavily overlapping ones.
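A minimal NumPy sketch of these two pruning steps; the (x1, y1, x2, y2) corner format and the threshold value are assumptions here, not the assignment’s exact code:

```python
import numpy as np

def iou(box1, box2):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    xi1, yi1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    xi2, yi2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (area1 + area2 - inter)

def filter_by_score(boxes, scores, threshold=0.6):
    """Step 1: keep only boxes whose class score clears the threshold."""
    keep = scores >= threshold
    return boxes[keep], scores[keep]

# Two heavily overlapping boxes have a high IoU, so NMS would drop
# the lower-scoring one of the pair.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143
```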

Testing the pretrained YOLO model on images

Training a YOLO model from scratch is computationally expensive and requires a fairly large dataset of labelled bounding boxes covering a large range of target classes (the last time I checked, the cheapest GPU costs more than my current tuition 😝).

We create a session to run the graph and load an existing pretrained Keras YOLO model stored in “yolo.h5”:

from keras import backend as K
from keras.models import load_model

sess = K.get_session()
yolo_model = load_model("model_data/yolo.h5")

We test the model by running the session in a for loop over all the images. Here’s the result:

Lessons learnt

  • YOLO is a state-of-the-art object detection model that is both fast and accurate.
  • It runs an input image through a CNN which outputs a 19x19x5x85 dimensional volume (for a 19x19 grid, 5 anchor boxes per cell and a task with 80 classes).
  • Training a YOLO model from scratch, or fine-tuning it with a custom dataset, can be difficult without a GPU.
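The 85 numbers per box in that volume break down as 5 box parameters plus the 80 class probabilities; a quick check of the arithmetic:

```python
# Each predicted box is encoded by 85 numbers: confidence p_c,
# box coordinates (b_x, b_y, b_h, b_w), and 80 class probabilities.
num_classes = 80
box_params = 5
per_box = box_params + num_classes
print(per_box)  # 85

# Full output volume for a 19x19 grid with 5 anchors per cell:
output_shape = (19, 19, 5, per_box)
print(output_shape)  # (19, 19, 5, 85)
```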

REFERENCES

Thanks to Redmon et al. for their wonderful papers, Andrew Ng for his awesome course content and drive.ai for providing the dataset.

AISaturdayOgbomosho wouldn’t have happened without fellow ambassadors Temiloluwa Ruth Afape, Adegoke Toluwani and Mhiz Adeola Lawal, and our renowned partner, Intel.

Thanks to our ambassador Daniel Ajisafe for the write-up.

A big thanks to Nurture.AI for this amazing opportunity.

Follow us on Twitter.


Lautech DataScience

A community of data scientists and AI practitioners in LAUTECH.