Tutorial: Real-time YOLOv3 on a Laptop Using Sparse Quantization

Neural Magic · Published in Deep Sparse · May 25, 2021

*YOLOv3 on a Laptop Example*

Sparsifying YOLOv3 (or any other model) means removing redundant information from the network using algorithms such as pruning and quantization. Sparsification yields many benefits in deployment environments, including faster inference and smaller file sizes. Unfortunately, many practitioners have not realized these benefits because of the complicated process and the number of hyperparameters involved.
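To make the two techniques concrete, here is a minimal, self-contained sketch (not Neural Magic's implementation) of unstructured magnitude pruning followed by symmetric int8 quantization on a single weight matrix; the function names and the 85% sparsity target are illustrative choices:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_int8(weights: np.ndarray):
    """Symmetric linear quantization: map floats to int8 with a single scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

w_pruned = magnitude_prune(w, sparsity=0.85)   # ~85% of entries become zero
q, scale = quantize_int8(w_pruned)             # 4x smaller storage than float32

print(f"achieved sparsity: {np.mean(w_pruned == 0):.2f}")
```

A sparse, quantized matrix like `q` is what lets a sparsity-aware runtime skip zero multiplications and use cheaper integer arithmetic, which is where the inference speedups come from.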

To simplify the process, Neural Magic’s ML team created recipes that encode the hyperparameters and instructions needed to produce highly accurate pruned and pruned-quantized YOLOv3 models. These recipes let anyone plug in their own data and leverage SparseML’s recipe-driven approach on top of Ultralytics’ robust training pipelines.
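For a sense of what such a recipe looks like, here is a small sketch in SparseML's YAML recipe format; the specific epochs and sparsity values are illustrative, not the ones used in the tutorial:

```yaml
modifiers:
  # Train normally for the first few epochs before pruning begins.
  - !EpochRangeModifier
    start_epoch: 0.0
    end_epoch: 50.0

  # Gradually prune all prunable layers to a target sparsity.
  - !GMPruningModifier
    start_epoch: 5.0
    end_epoch: 40.0
    init_sparsity: 0.05
    final_sparsity: 0.85
    update_frequency: 1.0
    params: __ALL_PRUNABLE__
```

The point of the format is that everything a run needs (schedules, sparsity targets, which layers to touch) lives in one declarative file, so the same training script can produce dense, pruned, or pruned-quantized models just by swapping recipes.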

The examples in this tutorial are all performed on the VOC dataset, and the results are publicly available through a Weights & Biases project.

See the complete tutorial on GitHub.

Originally published at https://neuralmagic.com on May 25, 2021.
