Venkata Dikshit · Emergent Abilities in Large Language Models · Language models, particularly those in the GPT family, have experienced a fascinating evolution over the past six years. This progression… · 6 min read · Jun 9, 2023
Venkata Dikshit · From GPT to GPT-4: Tracing the Transformative Journey of GPT · As an ML practitioner, I have witnessed the evolution of GPT over the past five years. GPT marked an intriguing paradigm shift when it was… · 6 min read · Apr 23, 2023
Venkata Dikshit in Analytics Vidhya · GPT-3: Whats/Hows/Where · When I first heard about GPT-3, my first impression was that it must be GPT-2 + more compute + more data. This isn’t a bad expectation… · 7 min read · Jul 28, 2020
Venkata Dikshit in ETHER Labs · [Part-2] Which Attention(architecture) do you need? · Overview of recent advances in Transformer architectures for NLP tasks · 7 min read · Aug 13, 2019
Venkata Dikshit in ETHER Labs · [Part-1] Which Attention(architecture) do you need? · Overview of recent advances in Transformer architectures for NLP tasks · 6 min read · Aug 5, 2019
Venkata Dikshit in ETHER Labs · BERT for unsupervised text tasks · This post discusses how we use BERT and similar self-attention architectures to address various text crunching tasks at Ether Labs. · 6 min read · Jul 18, 2019