Tutorial: Real-time YOLOv3 on a Laptop Using Sparse Quantization

Neural Magic · Published in Deep Sparse · May 25, 2021

*YOLOv3 on a Laptop Example*

Sparsifying YOLOv3 (or any other model) means removing redundant information from the network using algorithms such as pruning and quantization. Sparsification yields many benefits in deployment environments, including faster inference and smaller file sizes. Unfortunately, many practitioners have not realized these benefits because of the complicated process and the number of hyperparameters involved.
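To make the two techniques concrete, here is a minimal, self-contained sketch (not Neural Magic's implementation) of unstructured magnitude pruning followed by symmetric int8 quantization on a single weight matrix; the function names and the 85% sparsity target are illustrative choices:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_int8(weights: np.ndarray):
    """Symmetric linear quantization: map floats to int8 with a single scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

w_pruned = magnitude_prune(w, sparsity=0.85)   # ~85% of entries become zero
q, scale = quantize_int8(w_pruned)             # 4x smaller storage than float32

print(f"achieved sparsity: {np.mean(w_pruned == 0):.2f}")
```

A sparse, quantized matrix like `q` is what lets a sparsity-aware runtime skip zero multiplications and use cheaper integer arithmetic, which is where the inference speedups come from.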

To simplify the process, Neural Magic’s ML team created recipes that encode the hyperparameters and instructions needed to produce highly accurate pruned and pruned-quantized YOLOv3 models. These recipes let anyone plug in their own data and leverage SparseML’s recipe-driven approach on top of Ultralytics’ robust training pipelines.
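For a sense of what such a recipe looks like, here is a small sketch in SparseML's YAML recipe format; the specific epochs and sparsity values are illustrative, not the ones used in the tutorial:

```yaml
modifiers:
  # Train normally for the first few epochs before pruning begins.
  - !EpochRangeModifier
    start_epoch: 0.0
    end_epoch: 50.0

  # Gradually prune all prunable layers to a target sparsity.
  - !GMPruningModifier
    start_epoch: 5.0
    end_epoch: 40.0
    init_sparsity: 0.05
    final_sparsity: 0.85
    update_frequency: 1.0
    params: __ALL_PRUNABLE__
```

The point of the format is that everything a run needs (schedules, sparsity targets, which layers to touch) lives in one declarative file, so the same training script can produce dense, pruned, or pruned-quantized models just by swapping recipes.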

The examples in this tutorial are all performed on the VOC dataset, and the results are publicly available through a Weights & Biases project.

See the complete tutorial on GitHub.

Originally published at https://neuralmagic.com on May 25, 2021.
