Venkata Dikshit | LoRA: The Underrated Key to Enterprise AI Efficiency | Enterprise AI: Progress and Roadblocks | Sep 21
Venkata Dikshit | LoRA: The Underrated Key to Enterprise AI Efficiency | Enterprise AI: Progress and Roadblocks | Sep 19
Venkata Dikshit | Llama3 is Awesome | Llama3 is crushing it, and guess what? It’s open source. The technical report they dropped has everything you need: every trick and… | Aug 14
Venkata Dikshit | Emergent Abilities in Large Language Models | Language models, particularly those in the GPT family, have experienced a fascinating evolution over the past six years. This progression… | Jun 9, 2023
Venkata Dikshit | From GPT to GPT-4: Tracing the Transformative Journey of GPT | As an ML practitioner, I have witnessed the evolution of GPT over the past five years. GPT marked an intriguing paradigm shift when it was… | Apr 23, 2023
Venkata Dikshit in Analytics Vidhya | GPT-3: Whats/Hows/Where | When I first heard about GPT-3, my first impression was that it must be GPT-2 + more compute + more data. This isn’t a bad expectation… | Jul 28, 2020
Venkata Dikshit in ETHER Labs | [Part-2] Which Attention(architecture) do you need? | Overview of recent advances in Transformer architectures for NLP tasks | Aug 13, 2019
Venkata Dikshit in ETHER Labs | [Part-1] Which Attention(architecture) do you need? | Overview of recent advances in Transformer architectures for NLP tasks | Aug 5, 2019
Venkata Dikshit in ETHER Labs | BERT for unsupervised text tasks | This post discusses how we use BERT and similar self-attention architectures to address various text crunching tasks at Ether Labs. | Jul 18, 2019