Stanislav Fedotov in Nebius
How transformers, RNNs and SSMs are more alike than you think
By uncovering surprising links between seemingly unrelated LLM architectures, a way might be paved for effective idea exchange and boosting…
Sep 6
Stanislav Fedotov in Nebius
Mixtures of Experts and scaling laws
Mixture of Experts (MoE) has become popular as an efficiency-boosting architectural component for LLMs. In this blog post, we’ll explore…
Aug 13
Stanislav Fedotov in Nebius
Fundamentals of LoRA and low-rank fine-tuning
In the next installment of our series of deep technical articles on AI research, let’s switch our attention to the famous LoRA, a low-rank…
Jun 17
Stanislav Fedotov in Nebius
Transformer alternatives in 2024
With this article, we are starting a new category on our blog, the one dedicated to AI research. Expect these posts to be very technical…
Apr 4