Esperanto Technologies: Quantization and Mixed-Mode Techniques for Small Language Models. Written by Adrien Sade, Eric Soriano. Dec 6
Ingrid Stevens: Quantization of LLMs with llama.cpp. Understanding and Implementing n-bit Quantization Techniques for Efficient Inference in LLMs. Mar 15
Nithin Devanand: What is Quantization in LLM. Large Language Models comes in all kinds of flavors. It has become larger in terms of number of parameters, more capable in learning and… Mar 16
Ahmed Salhin, in Sage Ai: Stop Losing Accuracy after LLM Quantization! How a Dequantize-First Approach Saves Your QLoRA Models after Merging. Dec 3
Pierre Lienhart, in Towards Data Science: The AQLM Quantization Algorithm, Explained. In this blog post, we cover the AQLM quantization algorithm which sets a new state-of-the-art for compressing LLMs down to 2 bits! Mar 13
LM Po: Understanding Quantization for LLMs. As large language models (LLMs) continue to grow in size and complexity, the need for efficient deployment and inference becomes… Jul 23
Dhruv Matani, in Towards Data Science: Tensor Quantization: The Untold Story. A close look at the implementation details of quantization in machine learning frameworks. Sep 8, 2023