The Rapid Evolution of LLMs — Where we are and What’s next for AI

Soki D
4 min read · Oct 9, 2023

In recent years, the AI landscape has been defined by the swift development of large language models (LLMs). From GPT-3 to PaLM and beyond, LLMs have exploded in size and capabilities at a staggering pace. In this post, I’ll provide an overview of the major players in the LLM space, key milestones, and insights on where AI may be headed next.

Frontrunners in the LLM Race

OpenAI undoubtedly spearheaded the recent LLM boom with GPT-3 in 2020, a 175-billion-parameter model that awed the world with its eerily human-like writing abilities. Although its weights were never released and access was limited to a commercial API, GPT-3 demonstrated AI’s potential at scale, sparking increased investment and competition in natural language processing.

Close on their heels is Anthropic, an AI safety startup, with its Claude models. Anthropic has not disclosed Claude’s parameter count, so it cannot be compared with GPT-3 on size; its emphasis is on helpful, harmless conversational behavior. Meanwhile, platforms like Cohere offer LLMs to developers through an API, and Character.ai brings them to consumers as chat characters.

In 2022, Google introduced PaLM, a model with 540 billion parameters. PaLM points to LLMs’ future capabilities, displaying improved reasoning and common sense, and later variants such as PaLM-E add multimodal input. As models grow more capable, steering their behavior gets harder, which motivates alignment techniques like Anthropic’s Constitutional AI approach.

Funding Fuels LLM Growth

These leaps in model scale have been fueled by massive investments. In 2021 alone, over $1.7 billion of private funding flooded into AI startups across the US and Europe. OpenAI raised $1 billion from Microsoft in 2019, followed by $879 million from other backers. Anthropic, Cohere, and AI21 Labs have each received hundreds of millions in funding.

With AI now a strategic priority for tech giants like Google, Microsoft, Meta, and Baidu, the LLM arms race shows no signs of slowing down. For instance, Google’s Pathways system underpins PaLM, and its Switch Transformer reached 1.6 trillion parameters in 2021, far surpassing GPT-3 in raw count, although as a sparse mixture-of-experts model it activates only a fraction of those parameters for each token. As computing power increases and more data becomes available, models are likely to keep growing in size and capability.

Key Use Cases Emerge

Thus far, LLMs have proven adept at textual tasks like content generation, classification, summarization, and translation. GPT-3 showed strong performance on natural language tasks in zero-shot and few-shot settings, that is, given only an instruction or a handful of examples in the prompt, with no task-specific fine-tuning. Its API let developers build new applications such as content-creation tools.
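To make the difference between prompting with and without examples concrete, here is a minimal sketch that only builds the prompt text and does not call any model. The helper name `build_prompt`, the "Review:"/"Sentiment:" layout, and the sample reviews are all assumptions made for illustration, not a format any particular API requires.

```python
def build_prompt(instruction, query, examples=()):
    """Assemble a text prompt: zero-shot with no examples, few-shot otherwise."""
    lines = [instruction, ""]
    # Each worked example shows the model the input/output pattern to follow.
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final entry is left open for the model to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


INSTRUCTION = "Classify the sentiment of each review as Positive or Negative."
EXAMPLES = [
    ("The battery lasts all day.", "Positive"),
    ("It broke after a week.", "Negative"),
]

zero_shot = build_prompt(INSTRUCTION, "Setup was quick and painless.")
few_shot = build_prompt(INSTRUCTION, "Setup was quick and painless.", EXAMPLES)
print(few_shot)
```

The zero-shot prompt contains only the instruction and the open query, while the few-shot prompt adds two labeled examples before it. In both cases the model's weights stay unchanged; only the prompt differs.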

Tools like Cohere’s API, GPT-3, and GitHub Copilot help users generate marketing copy, articles, and code more efficiently, while image models such as DALL-E 2 do the same for visuals. LLMs can also augment human capabilities in areas like drug discovery and computer programming.

However, concerns around bias, safety, and misuse of LLMs remain. The technology still lacks robust understanding of the world and its dynamics. Achieving advanced cognition while constraining potential harms will be critical as LLMs progress.

Use Cases Across Different Industries

Content Creation:

  • Tools like Jasper, Rytr, and Copy.ai allow you to generate marketing copy, emails, social media posts, and other content by integrating with LLMs like GPT-3. Useful for marketing teams and agencies.
  • Assistants like Anthropic’s Claude and Writemore help draft longer-form content like blog posts, articles, and speeches, and text-to-speech tools like Murf can voice the result. Valuable for writers, founders, and public speakers.
  • OpenAI’s Codex, the model behind GitHub Copilot, translates natural language into code. Developers describe a function in plain language and get a suggested implementation to review.
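As an illustration of describing a function in plain language, the snippet below was written by hand and is not actual Copilot output. The comment plays the role of the developer's description, and the body is the kind of completion a code assistant might suggest; the function name `is_palindrome` is hypothetical.

```python
# Description a developer might type:
# "Return True if the text reads the same forwards and backwards,
#  ignoring case and anything that is not a letter or digit."
def is_palindrome(text: str) -> bool:
    # Keep only letters and digits, lowercased, then compare with the reverse.
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome("A man, a plan, a canal: Panama"))
print(is_palindrome("Copilot"))
```

Suggestions like this still need review and testing, since an assistant can produce code that looks right but misses an edge case.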

Customer Service:

  • Chatbots built on models like Anthropic’s Claude, along with support platforms like Ada and Intercom, leverage LLM abilities for more natural conversations that can handle customer questions and tasks. Helpful for support teams.
  • Tools like PolyAI and Cognigy integrate with LLMs to build intelligent voicebots for phone customer service. Useful for call centers and voice assistance.

Data Analytics:

  • LLM-powered apps like extract.ai and ZeroDash rapidly analyze and extract insights from documents, surveys, earnings calls, legal contracts, and more. Saves analysts time.
  • Tools like Anthropic’s Claude and VARIYL offer a conversational interface to query data more intuitively using natural language. Valuable to data scientists.

Education:

  • Apps like Anthropic’s Claude, QuillBot, and Rytr can help produce customized learning content, practice questions, and feedback for students. Assists teachers.
  • Intelligent tutoring systems, such as those from Carnegie Learning, tailor teaching to individual students’ needs, and LLMs could make their feedback more conversational.

As we can see, LLMs are enabling AI applications across diverse industries by powering better language understanding and generation. The capabilities of these models are only starting to be explored.

The Road Ahead

LLMs have rapidly advanced from narrow, single-task models to general-purpose systems in just a few years. While model size provided initial breakthroughs, scaling alone may bring diminishing returns without new architectures. Future progress will likely require improved training techniques, multimodal capabilities, causal reasoning, and transfer learning.

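Initiatives like Anthropic’s Constitutional AI, which trains a model to critique and revise its own outputs against a written set of principles, aim to make models more transparent, safe, and beneficial to humans. Techniques like weight pruning (removing low-importance weights) and knowledge distillation (training a small model to imitate a large one) can also shrink large models for deployment. In coming years, I expect LLMs will power conversational agents, analytics tools, and perhaps even robotics applications.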
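The two compression ideas mentioned above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not how any production system does it: `magnitude_prune` uses simple unstructured magnitude pruning on a flat list of weights, and `distillation_loss` computes only the KL divergence between temperature-softened teacher and student outputs, leaving out the hard-label term and temperature scaling that full training recipes usually add. All function names and numbers are made up for the example.

```python
import math


def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices of the k weights with the smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def softmax(logits, temperature=1.0):
    """Softmax with temperature; a higher temperature gives softer probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


pruned = magnitude_prune([0.8, -0.05, 0.3, 0.01, -0.6, 0.02], sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
print(distillation_loss([4.0, 1.0, 0.5], [3.0, 1.5, 0.2]))
```

Pruning keeps the three largest weights and zeroes the rest, while the distillation loss falls to zero only when the student's softened outputs match the teacher's.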

The LLM field is still nascent but developing swiftly. Sustained investment and rigorous research are propelling rapid iterations that take us closer to beneficial and broadly capable AI. There are challenges around ethics, safety, and access that must be considered as LLMs grow more advanced. Overall, it is an exciting time that highlights the transformative potential of AI done right.
