The most insightful stories about Model Compression

Model Compression

Topic

Recommended stories

Haseeb Ullah Khan Shinwari

Model Compression: Shrinking Deep Learning Models Without Sacrificing Performance

Jul 15

Prajot Kuvalekar

A Brief Quantization Tutorial on Pytorch with Code

In this tutorial, I will be explaining how to proceed with post-training static quantization, and in my upcoming blogs, I will be…

Jan 24

Anh Tuan

Introduction to Quantization

In this post, I’ll introduce an overview of neural network quantization, one method for reducing the size of deep learning models.

May 21

David Williams in Data Science at Microsoft

Model compression and optimization: Why think bigger when you can think smaller?

Models don’t need to take so long to run

Sep 28, 2021

Sayantanmanna

Run Powerful Open-Source Large Language Model with Just a Single 4GB GPU!

In the era of large language models, efficiency in model performance and computational resource usage is crucial. Large Language Models…

May 9

David Dale in Towards Data Science

How to adapt a multilingual T5 model for a single language

Load embeddings only for the tokens from your language to reduce model size

May 4, 2021

Anish Hilary

Understanding Eigenvalues and Eigenvectors: Enabling Deep Neural Network Compression

Eigenvalues and Eigenvectors are fundamental concepts in linear algebra, widely utilized across diverse domains. In the context of deep…

Apr 18

Georgian in Georgian Impact Blog

Compressing Wav2vec 2.0

By Zilun Peng and Akshay Budhkar

Mar 24, 2021