YOLOv4: Speed & Accuracy

Susant Achary
May 23 · 5 min read

YOLO (You Only Look Once), but sharper!

YOLOv4

Object detection has matured over the last few years, and ever since R-CNN was released the competition has remained cut-throat. YOLOv4 again claims state-of-the-art (SOTA) accuracy while maintaining a high processing frame rate. It achieves 43.5% AP (65.7% AP₅₀) on the MS COCO dataset at approximately 65 FPS inference speed on a Tesla V100, as per the graph below. In object detection, high accuracy and precision are only some of the things we want. We also want the model to run smoothly on edge devices such as the Raspberry Pi, Jetson Nano, and Intel boards. Processing streaming real-time video on such low-power, low-cost hardware is both important and challenging, and it is increasingly needed in robotics, business, and much more. (The code is shared at the end, with a video walk-through.)

YOLOv4 is twice as fast as EfficientDet with comparable performance.
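To make the AP and AP₅₀ figures above a little more concrete, here is a minimal sketch of the overlap test behind them. AP₅₀ counts a detection as correct when its Intersection over Union (IoU) with a ground-truth box is at least 0.5, while the COCO-style AP averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The function names and the example boxes are made up for illustration; this is not the official COCO evaluation code, which also handles confidence ranking and precision/recall curves.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95
COCO_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Return the IoU thresholds at which `pred` would count as a match."""
    score = iou(pred, truth)
    return [t for t in COCO_THRESHOLDS if score >= t]


# Hypothetical boxes: intersection 90*90 = 8100, union 10000, so IoU = 0.81
truth = (0, 0, 100, 100)
pred = (10, 10, 100, 100)
print(iou(pred, truth))
print(thresholds_passed(pred, truth))
```

A box like this counts as a hit for AP₅₀ but fails the strictest thresholds, which is why AP₅₀ is always at least as high as the averaged AP.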

The YOLOv4 release lists three authors: Alexey Bochkovskiy (the Russian developer who built the Windows version of YOLO), Chien-Yao Wang, and Hong-Yuan Mark Liao. (Unfortunately, Joseph Redmon, the creator of YOLO, announced that he was no longer pursuing computer vision because of the negative impact of his work.)

https://arxiv.org/abs/2004.10934

According to the authors:

Compared with the previous YOLOv3, YOLOv4 has the following advantages:

It is an efficient and powerful object detection model that enables anyone with a 1080 Ti or 2080 Ti GPU to train a super fast and accurate object detector.

The influence of state-of-the-art “Bag-of-Freebies” and “Bag-of-Specials” object detection methods during detector training has been verified.

The modified state-of-the-art methods, including CBN (Cross-iteration batch normalization), PAN (Path aggregation network), etc., are now more efficient and suitable for single GPU training.

Pluggable Architecture

Bag of Freebies (BoF) & Bag of Specials (BoS)

Improvements can be made in the training process (such as data augmentation, handling class imbalance, the cost function, soft labeling, etc.) to increase accuracy. These improvements have no impact on inference speed and are called the “bag of freebies”. Then there is the “bag of specials”, which increases inference time slightly in return for a good gain in performance. These improvements include enlarging the receptive field, the use of attention, feature integration such as skip connections and FPN, and post-processing such as non-maximum suppression. In this article, we will discuss how the feature extractor and the neck are designed, as well as all these BoF and BoS goodies.
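As a rough illustration of the two bags, the sketch below pairs one training-time trick (label smoothing, a common form of soft labeling) with one inference-time step (greedy non-maximum suppression). Both are textbook versions written for this article, with illustrative names and default values (`epsilon=0.1`, `iou_threshold=0.5`) that are assumptions, not the settings or code used by the YOLOv4 authors.

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Freebie: soften a one-hot target; only the training loss sees this."""
    k = len(one_hot)
    return [y * (1.0 - epsilon) + epsilon / k for y in one_hot]


def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Special: greedy NMS, run at inference time after the network.

    Visits boxes from highest to lowest score and keeps a box only if it
    does not overlap an already kept box by `iou_threshold` or more.
    Returns the indices of the kept boxes.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes overlap heavily.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
print(smooth_labels([0, 1, 0]))
```

Label smoothing costs nothing once the model is deployed, whereas NMS adds a small amount of work to every frame, which is the trade-off that separates a freebie from a special.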

Methodology for Fast Operating Speed of Neural Networks in Production and Optimization for Parallel Computation:

Selection of BoF and BoS in General

To improve object detection training, a typical CNN usually uses the following:

Mosaic Data Augmentation
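Mosaic augmentation stitches four training images into a single one, so the network sees objects at unusual positions and in mixed contexts. Below is a deliberately simplified sketch of the idea using plain nested lists as grayscale images. The function name, the crop strategy (top-left crop of each source image), and the range of the split point are assumptions made for illustration; a real implementation would also rescale the images and remap their bounding boxes.

```python
import random


def mosaic(images, size, rng=random):
    """Combine four images into one `size` x `size` mosaic.

    `images` is a list of four 2D lists (rows of pixel values), each at
    least `size` x `size`. A random split point divides the canvas into
    four quadrants, and each quadrant is filled from one image.
    """
    if len(images) != 4:
        raise ValueError("mosaic needs exactly four images")
    cx = rng.randint(size // 4, 3 * size // 4)  # split column
    cy = rng.randint(size // 4, 3 * size // 4)  # split row
    canvas = [[0] * size for _ in range(size)]
    # (row_start, row_end, col_start, col_end) of each quadrant
    regions = [
        (0, cy, 0, cx),        # top-left
        (0, cy, cx, size),     # top-right
        (cy, size, 0, cx),     # bottom-left
        (cy, size, cx, size),  # bottom-right
    ]
    for img, (r0, r1, c0, c1) in zip(images, regions):
        for r in range(r0, r1):
            for c in range(c0, c1):
                canvas[r][c] = img[r - r0][c - c0]
    return canvas, (cx, cy)


# Four constant "images" so that each quadrant is easy to recognise.
images = [[[k] * 8 for _ in range(8)] for k in (1, 2, 3, 4)]
canvas, split = mosaic(images, 8, random.Random(0))
for row in canvas:
    print(row)
```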

Details of YOLOv4

Spatial Pyramid Pooling layer
PAN (Path aggregation network): https://arxiv.org/abs/1803.01534
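For the Spatial Pyramid Pooling layer, the idea as used in YOLO-style detectors is to max-pool the same feature map with several window sizes at stride 1, keep the spatial size unchanged, and stack the results as extra channels, which enlarges the receptive field cheaply. The sketch below shows this on a single-channel feature map in plain Python. The kernel sizes 5, 9 and 13 follow my reading of the paper and should be treated as an assumption, as should the function names; a real model would do this on multi-channel tensors in a deep learning framework.

```python
def max_pool_same(fmap, k):
    """k x k max pooling with stride 1; output has the same size as input.

    Windows are clipped at the border, which is equivalent to padding
    the map with minus infinity.
    """
    h, w = len(fmap), len(fmap[0])
    p = k // 2
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                fmap[rr][cc]
                for rr in range(max(0, r - p), min(h, r + p + 1))
                for cc in range(max(0, c - p), min(w, c + p + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(fmap, kernel_sizes=(5, 9, 13)):
    """Return the input map plus one pooled map per kernel size.

    The returned list plays the role of the channel-wise concatenation.
    """
    return [fmap] + [max_pool_same(fmap, k) for k in kernel_sizes]


# A 13 x 13 map with a single activation in the centre.
fmap = [[0] * 13 for _ in range(13)]
fmap[6][6] = 1
channels = spp_block(fmap)
print(len(channels), "channels of size", len(channels[0]), "x", len(channels[0][0]))
```

With larger windows, the single activation in the centre spreads over more of the map, which is how the block mixes local and near-global context.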

YOLOv4 uses:

Curious to dive deeper into each of the above hyperparameters? Please read through https://medium.com/@jonathan_hui/yolov4-c9901eaa8e61 (you will love it).

Comparison of YOLOv4 on Different NVIDIA GPU Architectures (Maxwell, Pascal, Volta)

Final Thoughts from the Authors:

A state-of-the-art detector that is faster (FPS) and more accurate (MS COCO AP50…95 and AP50) than all available alternative detectors. The detector described can be trained and used on a conventional GPU with 8–16 GB of VRAM, which makes its broad use possible.

Code and Walk-through (fork the code):

https://www.youtube.com/watch?v=mKAEGSxwOAY (credit to the video's author for the code and video)

Keep learning!


Susant Achary

Written by

ML Engineer | Researcher | ROS practitioner | Xploring Econometrics || https://github.com/SSusantAchary?tab=repositories

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com
