Samuel Flender in Towards Data Science

- User Action Sequence Modeling: From Attention to Transformers and Beyond. The quest to LLM-ify recommender systems. (Jul 15)
- LoRA: Revolutionizing Large Language Model Adaptation without Fine-Tuning. Exploiting the low-rank nature of weight updates during fine-tuning yields an orders-of-magnitude reduction in learnable parameters. (Apr 23)
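The parameter savings the LoRA entry refers to can be sketched with a quick back-of-the-envelope calculation; the layer dimensions and rank below are illustrative assumptions, not figures from the article:

```python
# Minimal sketch of LoRA's parameter savings (dimensions are hypothetical).
# Instead of learning a full weight update dW of shape (d, k), LoRA learns
# a low-rank factorization dW = B @ A with B: (d, r) and A: (r, k),
# where the rank r is much smaller than d and k.

d, k, r = 4096, 4096, 8  # hypothetical layer size and LoRA rank

full_update_params = d * k     # parameters if we trained dW directly
lora_params = r * (d + k)      # parameters if we train only B and A

print(full_update_params)                 # 16777216
print(lora_params)                        # 65536
print(full_update_params / lora_params)   # 256.0
```

With these (assumed) dimensions, the trainable-parameter count drops by a factor of 256, i.e. more than two orders of magnitude, which is the effect the article's subtitle describes.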
- Demystifying Mixtral of Experts. Mistral AI’s open-source Mixtral 8x7B model made a lot of waves — here’s what’s under the hood. (Mar 17)
- The Rise of Sparse Mixtures of Experts: Switch Transformers. A deep dive into the technology that paved the way for the most capable LLMs in the industry today. (Feb 15)
- Pushing the Limits of the Two-Tower Model. Where the assumptions behind the two-tower model architecture break — and how to go beyond. (Dec 10, 2023)
- Towards Understanding the Mixtures of Experts Model. New research reveals what happens under the hood when we train MoE models. (Nov 14, 2023)
- The Rise of Two-Tower Models in Recommender Systems. A deep dive into the latest technology used to debias ranking models. (Oct 29, 2023)
- The Multi-Task Optimization Controversy. Do we need special algorithms to train models on multiple tasks at the same time? (Sep 29, 2023)
- Machine Learning with Expert Models: A Primer. How a decades-old idea enables training outrageously large neural networks today. (Sep 5, 2023)
- Multi-Task Learning in Recommender Systems: A Primer. The science and engineering behind algorithms that try to do it all. (Jul 25, 2023)