Self-Driving Car with YOLOv5 and Roboflow

Herambh Dakshinamoorthy · Published in Geek Culture · 6 min read · Oct 19, 2021

Have you ever been intrigued by self-driving cars and the algorithms behind them? This post lets you dive deeper using the Udacity Self-Driving Car Dataset provided by Roboflow.

Photo by Roberto Nickson on Unsplash

YOLOv5

YOLOv5 inferencing live on video with COCO weights

YOLO (You Only Look Once) is a family of object detection architectures and models pre-trained on the COCO dataset. YOLOv5 🚀 represents Ultralytics' open-source research into future vision AI methods.

YOLOv5 is lightning fast. It provides state-of-the-art object detection, with reported speed benchmarks of up to 1666 FPS on COCO. YOLOv5 is also remarkably accurate. All this, combined with the fact that YOLOv5 is much smaller in size, makes it one of the best architectures for self-driving cars, if not the best.

YOLO divides an image into a grid system, and each grid cell detects objects within itself. This allows the model to run very fast with limited computational resources.

To understand how YOLOv5 improves performance and how its architecture is organized, let us go through the following high-level object detection architecture:

YOLOv5s model

Steps

We will walk through the steps required to train YOLOv5 on Udacity Self Driving Car Dataset.

You can refer to this Google Colab notebook and follow along.

To train our detector, we take the following steps:

  • Prepare the dataset
  • Install YOLOv5 dependencies
  • Download the self-driving car object detection dataset from Roboflow
  • Set up the YAML files for training
  • Train the model
  • Evaluate performance
  • Visualize the training data
  • Run inference on test images

Preparing the dataset

We will use the Udacity Self-Driving Car Dataset, which is available among Roboflow's public object detection datasets.

The dataset contains 97,942 labels across 11 classes and 15,000 images. There are 1,720 null examples (images with no objects on the road). The 11 classes include cars, trucks, pedestrians, signals, and bicyclists.

You can also add your own annotations to images using Roboflow's handy annotation tools. Roboflow also lets you streamline the data preprocessing and augmentation steps and perform the train/validation/test split however you choose!

These are the settings I have applied to my dataset.

Install YOLOv5 dependencies

We first clone the YOLOv5 repository and install its dependencies. This sets up our programming environment to run object detection training and inference commands.

Google Colab offers a free GPU runtime, which lets us accelerate our training. Make sure to enable it (Runtime → Change runtime type → Hardware accelerator → GPU) before beginning, and import the required dependencies.
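In Colab, these setup cells typically look something like this:

```python
# Clone the YOLOv5 repository and install its dependencies (Colab shell magics).
!git clone https://github.com/ultralytics/yolov5
%cd yolov5
!pip install -r requirements.txt

# Verify that the GPU runtime is active.
import torch
print(f"torch {torch.__version__}, "
      f"device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}")
```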

Download Dataset from Roboflow

To download the dataset from Roboflow, create a free account and add the dataset to your workspace as a project. After annotating and applying the desired preprocessing and augmentation steps, export it in the YOLOv5 PyTorch format and copy the generated code snippet. I have resized all the images to 416 × 416 pixels and added 5% noise as augmentation so that the model learns to generalize better.

The noise I have applied makes the images look as if it were raining, equipping my model for cases where object detection must work in such conditions.
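The generated snippet looks roughly like the following (the API key, workspace, project slug, and version number below are placeholders; copy the exact snippet Roboflow gives you):

```python
# Download the dataset export via the Roboflow Python package.
!pip install roboflow
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")                 # placeholder key
project = rf.workspace().project("self-driving-car")  # hypothetical project slug
dataset = project.version(1).download("yolov5")       # YOLOv5 PyTorch format
print(dataset.location)                               # local path to the export
```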

Set up the YAML files for training

To train a YOLOv5 model, we need two YAML files.

The first YAML specifies:

  • where our training and validation data are
  • the number of classes that we want to detect
  • the names corresponding to those classes

Our YAML looks like this:
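Assuming the standard Roboflow export layout, it looks something like this (your exported paths and class names may differ slightly):

```yaml
# data.yaml, generated by the Roboflow export; adjust paths to your download.
train: ../train/images
val: ../valid/images

nc: 11
names: ['biker', 'car', 'pedestrian', 'trafficLight', 'trafficLight-Green',
        'trafficLight-GreenLeft', 'trafficLight-Red', 'trafficLight-RedLeft',
        'trafficLight-Yellow', 'trafficLight-YellowLeft', 'truck']
```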

The second YAML specifies the whole model configuration. You can modify the network architecture here if you wish, but you must set the correct number of classes (nc) for our dataset.
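One simple way to do this, assuming we start from the stock yolov5s.yaml (the file name custom_yolov5s.yaml is my own choice):

```python
# Copy the YOLOv5s model config and patch the class count for our 11 classes.
import yaml

with open('models/yolov5s.yaml') as f:
    cfg = yaml.safe_load(f)

cfg['nc'] = 11  # number of classes in the Udacity dataset

with open('models/custom_yolov5s.yaml', 'w') as f:
    yaml.dump(cfg, f)
```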

Training the model

We can play around with the following hyperparameters to obtain the best results from our model (a sample training command follows the list):

  • img: define input image size
  • batch: determine batch size
  • epochs: define the number of training epochs
  • data: set the path to YAML file
  • cfg: specify model configuration
  • weights: specify a custom path to weights
  • name: result names
  • nosave: only save the final checkpoint
  • cache: cache images for faster training
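For instance, a training command could look like this (the epoch count and batch size here are illustrative rather than tuned values, and dataset.location comes from the Roboflow download step above):

```python
# Train YOLOv5 on the Udacity dataset (run as a Colab cell).
!python train.py --img 416 --batch 16 --epochs 100 \
    --data {dataset.location}/data.yaml \
    --cfg models/custom_yolov5s.yaml \
    --weights '' --name yolov5s_results --cache
```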

Evaluate performance

Having completed the training, we are now ready to evaluate our model's performance. We can visualize the performance using TensorBoard.

Training Metrics
Training Losses
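In Colab, the standard TensorBoard magics work, since YOLOv5 writes its logs to runs/train by default:

```python
# Launch TensorBoard pointed at the YOLOv5 training logs.
%load_ext tensorboard
%tensorboard --logdir runs/train
```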

Visualize the training data

During training, the YOLOv5 training pipeline creates batches of training data with augmentations. We can visualize the training data ground truth as well as the augmented training data.

Our training data ground truth
Training data with augmentations
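These mosaics are saved into the run directory automatically. A quick way to view them in the notebook (the exact filenames vary slightly across YOLOv5 versions; train_batch0.jpg and val_batch0_labels.jpg are typical):

```python
# Show the ground-truth batch and an augmented training batch saved by YOLOv5.
from IPython.display import Image, display

display(Image(filename='runs/train/yolov5s_results/val_batch0_labels.jpg', width=900))
display(Image(filename='runs/train/yolov5s_results/train_batch0.jpg', width=900))
```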

Running Inference on Test Images

We now take the trained model and run inference on test images, using the best weights saved during training.
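For example (the confidence threshold is illustrative, and the source path assumes your Roboflow export includes a test split):

```python
# Run detection on the test split with the best checkpoint.
!python detect.py --weights runs/train/yolov5s_results/weights/best.pt \
    --img 416 --conf 0.4 --source {dataset.location}/test/images
```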

Let’s visualize our results:

Predictions on test images
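detect.py saves its annotated outputs to runs/detect/exp by default (the folder name increments on repeated runs), so we can display a few like this:

```python
# Display the first few annotated test images.
import glob
from IPython.display import Image, display

for img_path in glob.glob('runs/detect/exp/*.jpg')[:3]:
    display(Image(filename=img_path))
```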

Conclusion

A world filled with self-driving cars seems like just a dream right now. But with further developments in this field, I'm sure it will soon be a possibility and not just a daydream.

The illustration in this post is merely one instance of object detection, a fundamental building block. The real challenge lies in the 3-D perception of these objects and in how the car is programmed to respond to them. Factors in the surroundings, such as weather and below-par infrastructure, also influence the performance of these cars.

Including such courses as part of the curriculum will spark interest and support further research and development in this field. This will undoubtedly accelerate the process and make autonomous vehicles a reality quite soon. I hope this post has sparked some interest among you all!
