Mahtab, "Sparse Transformers Explained | Part 1": Capturing long-range dependencies in texts/audio/images requires a larger context length. Sparse Transformers¹ reduces the computation… (6 min read · 1 day ago)
Vyacheslav Efimov in Towards Data Science, "Large Language Models, GPT-3: Language Models are Few-Shot Learners": Efficiently scaling GPT from large to titanic magnitudes within the meta-learning framework (8 min read · Feb 16, 2024)
Mark Riedl, "A Very Gentle Introduction to Large Language Models without the Hype" (38 min read · Apr 14, 2023)
David Shapiro, "A Pro’s Guide to Finetuning LLMs": Large language models (LLMs) like GPT-3 and Llama have shown immense promise for natural language generation. With sufficient data and… (12 min read · Sep 23, 2023)
Pradeep Menon, "Introduction to Large Language Models and the Transformer Architecture": ChatGPT is making waves worldwide, attracting over 1 million users in record time. As a CTO for startups, I discuss this revolutionary… (7 min read · Mar 9, 2023)
Duncan Anderson in Barnacle Labs, "Beyond Data Hoovering: The Nuanced Reality of Training Large Language Models (LLMs)": Training Large Language Models (LLMs) is an evolving science. In this post I set out to shed some light on what’s involved. (20 min read · Jul 19, 2023)
Mick Vleeshouwer, "How to create a private ChatGPT with your own data": Learn the architecture and data requirements needed to create your own Q&A engine with ChatGPT/LLMs. (9 min read · Mar 27, 2023)