Kevin François in neoxia · LLm infini-attention with linear complexity · Introducing Google’s Infini-attention to increase LLM attention windows and reduce quadratic complexity · 12 min read · Apr 26, 2024
Kevin François in neoxia · Deep dive in embeddings · Full explanation of word and image embedding with the explanation of reference models such as Bert and Clip · 15 min read · Mar 21, 2024
Kevin François in neoxia · Mixtral 8x7B explained · This review aims to expound upon the MoE framework and explore the added benefits that Mixtral 8x7B brings to these specific fie… · 7 min read · Mar 21, 2024
Kevin François in neoxia · Proxy-Tuning: A Breakthrough in Customizing Large Language Models · Fine-tune LLM through next token probability · 4 min read · Mar 18, 2024
Kevin François in neoxia · The Era of 1-bit LLMs · Introduction: Deep dive in LLM quantization · 7 min read · Mar 13, 2024