Machine Learning in Fraud Detection: A Primer
Balancing automation, accuracy, and customer experience in an ever-evolving adversarial landscape
Published in TDS Archive, Nov 12, 2024

User Action Sequence Modeling: From Attention to Transformers and Beyond
The quest to LLM-ify recommender systems
Published in TDS Archive, Jul 15, 2024

LoRA: Revolutionizing Large Language Model Adaptation without Fine-Tuning
Exploiting the low-rank nature of weight updates during fine-tuning yields an orders-of-magnitude reduction in learnable parameters
Published in TDS Archive, Apr 23, 2024

Demystifying Mixtral of Experts
Mistral AI's open-source Mixtral 8x7B model made a lot of waves; here's what's under the hood
Published in TDS Archive, Mar 17, 2024

The Rise of Sparse Mixtures of Experts: Switch Transformers
A deep dive into the technology that paved the way for the most capable LLMs in the industry today
Published in TDS Archive, Feb 15, 2024

Pushing the Limits of the Two-Tower Model
Where the assumptions behind the two-tower model architecture break, and how to go beyond
Published in TDS Archive, Dec 10, 2023

Towards Understanding the Mixtures of Experts Model
New research reveals what happens under the hood when we train MoE models
Published in TDS Archive, Nov 14, 2023

The Rise of Two-Tower Models in Recommender Systems
A deep dive into the latest technology used to debias ranking models
Published in TDS Archive, Oct 29, 2023

The Multi-Task Optimization Controversy
Do we need special algorithms to train models on multiple tasks at the same time?
Published in TDS Archive, Sep 29, 2023

Machine Learning with Expert Models: A Primer
How a decades-old idea enables training outrageously large neural networks today
Published in TDS Archive, Sep 5, 2023