Emergent Abilities in LLMs: Unpredictable Abilities in Large Language Models

Deepak Babu P R
3 min read · Jun 5, 2023

Emergent abilities refer to the capabilities that arise spontaneously from the complex interactions of simpler components. They are properties that can’t be predicted solely based on the individual parts, but only become apparent when these parts start to interact as a whole. This fascinating concept has its roots in various fields such as biology, physics, sociology, and more recently, artificial intelligence (AI).

Defining Emergent Abilities

The term “emergent” is derived from the Latin word “emergere,” meaning “to rise out” or “to come forth.” Emergent abilities, therefore, are those capabilities that ‘come forth’ or ‘rise out’ from a system. They are not explicitly programmed or designed into the system but instead arise organically from the interactions of the system’s components.

To better understand this, imagine a flock of birds. Each bird follows simple rules: stay close to your neighbors, avoid collisions, and move in the same direction. Yet, when these individual behaviors combine, they create a mesmerizing, dynamic pattern known as murmuration. This collective behavior is an emergent property of the system, as it cannot be predicted or explained by examining a single bird’s actions.
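The three rules above can be sketched in a few lines of Python. This is only an illustrative toy in the spirit of classic "boids" flocking simulations, not a model of real starlings: the function names `step` and `polarization`, the tuple layout `(x, y, vx, vy)`, and every numeric weight are assumptions made up for this example.

```python
import math

def step(birds, radius=5.0, min_dist=1.0,
         cohesion=0.01, alignment=0.1, separation=0.05):
    """Advance every bird one tick. Each bird is a tuple (x, y, vx, vy)
    and reacts only to neighbours within `radius` (all weights are made up)."""
    updated = []
    for i, (x, y, vx, vy) in enumerate(birds):
        near = [b for j, b in enumerate(birds)
                if j != i and math.hypot(b[0] - x, b[1] - y) <= radius]
        ax = ay = 0.0
        if near:
            n = len(near)
            # Rule 1: stay close to your neighbours (steer toward their centre).
            ax += cohesion * (sum(b[0] for b in near) / n - x)
            ay += cohesion * (sum(b[1] for b in near) / n - y)
            # Rule 2: move in the same direction (match their average velocity).
            ax += alignment * (sum(b[2] for b in near) / n - vx)
            ay += alignment * (sum(b[3] for b in near) / n - vy)
            # Rule 3: avoid collisions (push away from birds that are too close).
            for bx, by, _, _ in near:
                if math.hypot(bx - x, by - y) < min_dist:
                    ax += separation * (x - bx)
                    ay += separation * (y - by)
        vx, vy = vx + ax, vy + ay
        updated.append((x + vx, y + vy, vx, vy))
    return updated

def polarization(birds):
    """Length of the average heading: 1.0 means all birds fly the same way,
    values near 0 mean the headings cancel each other out."""
    ux = uy = 0.0
    for _, _, vx, vy in birds:
        speed = math.hypot(vx, vy)
        if speed > 0:
            ux += vx / speed
            uy += vy / speed
    return math.hypot(ux, uy) / len(birds)
```

Nothing in `step` mentions a flock; each bird only looks at its neighbours. Calling `step` repeatedly and tracking `polarization` is one way to watch a group-level pattern appear that no single rule describes.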

In the context of large language models (LLMs), emergent abilities refer to the unexpected behaviors that models exhibit when they interact with their environment or when they are trained on large amounts of data. For instance, GPT-3, a language model developed by OpenAI, has shown an impressive ability to generate human-like text that was not explicitly programmed into it. This ability is ‘emergent’ from the model’s training on a diverse range of internet text. Reasoning, low-resource language translation, and math problem solving are all considered emergent properties of LLMs. Jason Wei maintains an excellent blog listing 137 emergent abilities in LLMs. Mathematical abilities have been found to be essentially non-existent in models of up to 13B parameters; similarly, reasoning abilities are known to be present in models with 100B+ parameters. If you are comfortable reading scientific papers, I highly recommend giving this paper a read (from the same author, Jason Wei of Google).

How do emergent abilities like reasoning arise?

This is an important question, because answering it would be a step change for deploying LLMs in production-grade systems at scale. Today, reasoning abilities are seen in models with 100B+ parameters, which works for most offline scenarios that are not sensitive to latency. The billion-dollar question is: “Can we get emergent abilities like reasoning in much smaller models, such as 7B or 13B models, that are more practical to deploy in online, latency-sensitive use cases?”

Emergent abilities are believed to come from scaling, i.e. bigger datasets and more parameters. However, a recent line of research finds empirical evidence that coding tasks are linked to LLMs’ reasoning ability. Yao Fu argues that two aspects of coding contribute. First, coding involves decomposing complex problems into objects (as in OOP), which roughly maps to high-level problem solving. Second, coding requires keeping track of state, for example matching opening and closing braces, which encourages attending to longer context windows. Together, these give the model the ability to reason about the world. As of this writing, most of the research and developer community has thought of coding LLMs as separate from general-purpose LLMs. If this insight holds true, we could be building smaller LLMs with reasoning abilities similar to those of the GPT-3s and GPT-4s of the world.
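To make the “keeping track of state” point concrete, here is a minimal sketch of what matching braces involves when done explicitly. The function name `braces_balanced` is made up for this example, and the check is deliberately simplistic (it does not skip brackets inside string literals or comments). It is not a description of how an LLM does this internally; it only shows that the bracket closing a block may depend on a character seen arbitrarily far back in the text.

```python
def braces_balanced(source):
    """Return True if every (, [ and { in `source` is closed in the right order."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []  # the "state": every bracket opened so far and not yet closed
    for ch in source:
        if ch in "([{":
            stack.append(ch)
        elif ch in pairs:
            # A closing bracket must match the most recently opened one.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # anything left on the stack was never closed
```

A model trained to predict code has to get the closing bracket right without an explicit stack, which is one intuition for why such training might reward attention over long contexts.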

Nevertheless, studying the origins of emergent abilities is an active area of research in NLP and LLMs. Unlocking this insight would have major repercussions for how we adopt LLMs in production-grade systems (to borrow an analogy from physics, much like the importance of achieving superconductivity at room temperature).

I would like to conclude this post by saying that, while reasoning is not the sole emergent ability, it is believed to be one of the key abilities that will unlock AGI in LLMs. Feel free to share your thoughts and comments on this topic.

Deepak Babu P R

Principal Scientist | ML/AI, NLP, IR and speech | love travelling, reading, trekking and photography. https://prdeepakbabu.github.io/