Haseeb Ullah Khan Shinwari, "Model Compression: Shrinking Deep Learning Models Without Sacrificing Performance", Jul 15.
Prajot Kuvalekar, "A Brief Quantization Tutorial on Pytorch with Code", Jan 24. In this tutorial, I will be explaining how to proceed with post-training static quantization, and in my upcoming blogs, I will be…
Anh Tuan, "Introduction to Quantization", May 21. In this post, I’ll introduce an overview of neural network quantization, one method for reducing the size of deep learning models.
David Williams in Data Science at Microsoft, "Model compression and optimization: Why think bigger when you can think smaller?", Sep 28, 2021. Models don’t need to take so long to run.
Sayantanmanna, "Run Powerful Open-Source Large Language Model with Just a Single 4GB GPU!", May 9. In the era of large language models, efficiency in model performance and computational resource usage is crucial. Large Language Models…
David Dale in Towards Data Science, "How to adapt a multilingual T5 model for a single language", May 4, 2021. Load embeddings only for the tokens from your language to reduce model size.
Anish Hilary, "Understanding Eigenvalues and Eigenvectors: Enabling Deep Neural Network Compression", Apr 18. Eigenvalues and Eigenvectors are fundamental concepts in linear algebra, widely utilized across diverse domains. In the context of deep…