Emergent capabilities in AI

Ratiomachina
May 4, 2023

Introduction

In the dazzling world of technological marvels, Generative AI (GAI) has been stealing the spotlight, captivating our imaginations and sparking heated debates on its potential dangers to society and institutions. While these concerns are undoubtedly valid, there’s a hidden, possibly more ominous risk lurking beneath the surface that we should not ignore.

To truly appreciate the gravity of this under-discussed threat, let’s first dive into some high-level concepts that lay the groundwork for our exploration. Prepare yourself to embark on a journey into the intriguing realm of emergent capabilities and performance spikes, as we investigate the unforeseen consequences of ever-expanding and increasingly complex Large Language Models (LLMs).

Emergent behaviour

The enigmatic phenomenon of emergent behavior has long captivated scientists and researchers, as complex patterns or behaviors astonishingly materialize from the simple interactions between individual components, all without a central governing force. These unexpected occurrences aren’t pre-programmed or planned but instead arise from the collective actions of the independent elements. For instance, observe the mesmerizing flight patterns of birds in a flock, effortlessly changing direction in perfect harmony, all thanks to a few basic rules — stick close to neighbors and avoid collisions — without any single bird taking the lead.

Imagine the potential chaos as a programming bot, initially designed for a specific task, suddenly acquires the ability to access the internet and hack systems upon scaling up its parameters or computational power. The emergence of such unforeseen capabilities could be perilous. Moreover, some capabilities might surface entirely without warning, giving rise to qualitatively distinct functionalities even when models aren’t explicitly trained to possess them — a crucial stepping stone towards achieving general intelligence.

The fascinating world of Large Language Models (LLMs), or more specifically the class of foundation models, has demonstrated a striking relationship between scale and performance. In many instances, this relationship is so consistent that it follows a predictable, lawful pattern — a scaling law. More often than not, these scaling laws foretell the continuous growth of certain capabilities as models expand in size.
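
To make the idea of a scaling law concrete, the canonical form reported in the literature (for example Kaplan et al., 2020) expresses test loss as a smooth power law in parameter count. The sketch below is purely illustrative; the constants are rough placeholder values in the spirit of that work, not measurements of any particular model family.

```python
def power_law_loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Illustrative scaling law: loss falls smoothly as a power law in
    parameter count. n_c and alpha are rough placeholder constants in the
    spirit of Kaplan et al. (2020), not fitted values for any real model."""
    return (n_c / n_params) ** alpha

# Smooth, predictable improvement as scale grows by orders of magnitude.
for n in [1e8, 1e9, 1e10, 1e11, 1e12]:
    print(f"{n:.0e} parameters -> predicted loss {power_law_loss(n):.3f}")
```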

From "Predictability and Surprise in Large Generative Models" (arXiv:2202.07785v2, 3 Oct 2022):

Large generative models have a paradoxical combination of high predictability — model loss improves in relation to resources expended on training and tends to correlate loosely with improved performance on many tasks — and high unpredictability — specific model capabilities, inputs, and outputs can’t be predicted ahead of time. The former drives rapid development of such models while the latter makes it difficult to anticipate the consequences of their development and deployment.

While it’s true that emergent abilities often surface at a specific scale, it’s crucial to recognize that model size isn’t the sole key to unlocking these hidden talents. As we delve deeper into the fascinating realm of training large language models, we may witness a paradigm shift, whereby smaller models, armed with innovative architectures, superior-quality data, or enhanced training methodologies, unlock capabilities previously thought to be exclusive to their larger counterparts.

In this ever-evolving landscape, it’s not only about reaching for the stars and expanding model size but also about refining the very essence of these models through groundbreaking advancements and strategic optimization. It’s this magical interplay of scale and innovation that continues to propel the field of AI to uncharted territory, unveiling emergent abilities and remarkable performance enhancements that defy expectations.

Examples

Arithmetic:

For arithmetic, GPT-3 displays a sharp capability transition somewhere between 6B parameters and 175B parameters, depending on the operation and the number of digits.

Brown et al., "Language Models are Few-Shot Learners", in Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33, Curran Associates, Inc., 1877–1901. https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf

Notice that the performance spike is abrupt and not a gradual process. Intuitively, abrupt scaling of a specific capability can co-exist with smooth general scaling for the same reason that daily weather is less predictable than seasonal averages: individual data points can vary much more than broad averages.

While large language models like GPT-3 were trained to understand and generate text, they were not explicitly taught mathematical operations. Nevertheless, these models have demonstrated the ability to perform arithmetic, such as three-digit addition, showcasing an emergent capability.
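
As a concrete illustration, one way to probe this capability is to query a model with randomly sampled three-digit addition problems and score exact-match accuracy. The sketch below assumes a hypothetical generate(prompt) function wrapping whichever LLM API is being tested; it is not the evaluation code used in the GPT-3 paper.

```python
import random
import re

def generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM completion call (e.g., an API
    client); replace with a real model before running the probe."""
    raise NotImplementedError

def three_digit_addition_accuracy(n_trials: int = 100) -> float:
    """Score exact-match accuracy on randomly sampled 3-digit additions."""
    correct = 0
    for _ in range(n_trials):
        a, b = random.randint(100, 999), random.randint(100, 999)
        prompt = f"Q: What is {a} plus {b}?\nA:"
        reply = generate(prompt)
        match = re.search(r"-?\d+", reply)  # first integer in the reply
        if match and int(match.group()) == a + b:
            correct += 1
    return correct / n_trials
```

Running this probe across model sizes is what produces the kind of abrupt accuracy jump described above.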

Analogical Reasoning

A particularly intriguing emergent capability is the potential of these models to solve new problems zero-shot, without explicit training. Human reasoning often relies on analogy, and Webb et al.'s paper, "Emergent Analogical Reasoning in Large Language Models" (2023), revealed that GPT-3 demonstrated a remarkable ability to induce abstract patterns, equalling or even exceeding human skills in most situations. These findings suggest that GPT-3 and similar large language models possess an innate capacity to discover zero-shot solutions to a wide variety of analogy-based challenges.

The researchers evaluated GPT’s performance using four distinct task domains, each targeting a different facet of analogical reasoning: 1) text-based matrix reasoning problems, 2) letter-string analogies, 3) four-term verbal analogies, and 4) story analogies. They compared the model’s performance to human behavior in each domain, analyzing both the overall results and error patterns.
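
To give a flavor of one of these domains, a letter-string analogy asks the model to extend an abstract transformation from one sequence to another. The prompt below is a hypothetical example in the style of that task, not an item taken from Webb et al.'s test materials.

```python
# A hypothetical letter-string analogy prompt in the style of the task
# domain described above (not drawn from Webb et al.'s test set).
prompt = (
    "Let's solve a puzzle.\n"
    "If a b c d changes to a b c e, then\n"
    "i j k l changes to ?"
)
# Intended abstract rule: the final element is replaced by its successor
# in the alphabet, so the expected completion is "i j k m".
expected_answer = "i j k m"
```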

Figure from Webb et al., "Emergent Analogical Reasoning in Large Language Models" (2023). The probability of random success in the two generative tasks (matrix reasoning and letter-string analogies) is nearly zero, owing to the vast array of potential generative responses. Black error bars display the standard error of the mean for the average performance among participants. Individual dots signify the accuracy of each participant. Gray error bars denote the 95% binomial confidence intervals for the average performance across multiple problems.

Grokking

Grokking is a phenomenon by which emergent capabilities may result from scaling the number of optimization steps rather than model size.

Power et al., "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" (2022), show that in some situations neural networks learn by "grokking" a pattern in the data, improving generalization performance from random chance level to perfect generalization, and that this improvement can happen well past the point of overfitting.

The dataset they considered is a simple binary operation table, where the neural network is trained on a subset of equations of the form a ∘ b = c, with ∘ a binary operation on a and b that has no internal structure.

An example is a small binary operation table with some entries left blank; the reader is invited to guess which elements are missing.
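
A minimal way to reproduce this setup is to build the full table for one simple operation, for instance addition modulo a small prime (one choice in the spirit of the operations studied by Power et al.), and hold out a fraction of the cells for validation. The sketch below is illustrative only; the exact operations, split fractions, and hyperparameters differ in the original paper.

```python
import random

def binary_op_table(p: int = 97):
    """All equations a ∘ b = c for ∘ = addition modulo a prime p.
    Each example is the triple (a, b, (a + b) % p)."""
    return [(a, b, (a + b) % p) for a in range(p) for b in range(p)]

def train_val_split(table, train_frac: float = 0.5, seed: int = 0):
    """Hold out part of the table: the network only ever sees the training
    cells and must fill in the rest (the 'missing' entries)."""
    rng = random.Random(seed)
    shuffled = table[:]
    rng.shuffle(shuffled)
    cut = int(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

table = binary_op_table()
train_set, val_set = train_val_split(table)
```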

Long after the point of overfitting, validation accuracy occasionally rises suddenly from chance level to near-perfect generalization, a phenomenon known as ‘grokking.’ Neural networks were able to generalize and fill in the empty spaces within various binary operation tables.

Figure from Power et al. (2022): the red curves show training accuracy and the green ones show validation accuracy.

In these experiments, the networks undergo a phase transition, characterized by a sudden shift toward generalization and a sharp increase in validation accuracy. This occurs well after the initial phase of memorizing and overfitting to the training data.

Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. — Center for Research on Foundation Models (CRFM) Stanford Institute for Human-Centered Artificial Intelligence (HAI) Stanford University

Managing Risks

The mesmerizing dance between smooth, general capability scaling and the abrupt emergence of specific abilities can give rise to unforeseen safety concerns, often discovered only after a model has been developed and set loose in the wild. To navigate this complex terrain, we must develop a robust framework that strikes a balance between innovation and risk mitigation.

  1. Enter the Model Governance and Risk Framework, which distinguishes between “soft governance” for low-risk applications and “hard governance” for high-risk scenarios. By establishing clear roles and responsibilities throughout the entire AI lifecycle, with transparent reporting to the highest level of decision-making, we can draw inspiration from the Three Lines of Defense framework used in banking to create scalable oversight and ensure AI remains safely in check.
  2. Model Red Teaming calls for a proactive approach to uncovering potential pitfalls before deployment, by thoroughly examining the input and output spaces of AI models. Solutions may include static benchmarks, such as adversarial datasets probing weaknesses in computer vision systems, or risk-based human interaction and evaluation during system runtime. Organizations can further bolster their defenses by embracing initiatives like “bug bounty” programs or implementing automated red-teaming methods alongside manual exploration.
  3. A wealth of tools is available for model evaluation, with a particular focus on comprehensive evaluations or searches for novel capabilities beyond fixed datasets measuring known abilities. For Large Language Models (LLMs), consider harnessing the power of EleutherAI’s ‘Language Model Evaluation Harness’, the BIG-bench benchmark, or HuggingFace’s ‘BERTology’, among others.

Ultimately, a systematic empirical study of abrupt jumps in capabilities across research and commercial tasks for large models will shed light on the frequency and timing of these occurrences, enabling us to better understand, predict, and manage the fascinating world of emergent AI behavior.
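
As a rough illustration of what such a study might look like in practice, the sketch below sweeps a family of checkpoints of increasing scale with a capability probe and flags score increases that look abrupt rather than smooth. The checkpoint list, the evaluate_capability hook, and the jump threshold are all hypothetical placeholders, not a prescribed methodology.

```python
# Generic sketch for surfacing abrupt capability jumps across model scales.
# `evaluate_capability` is a hypothetical hook: plug in any probe
# (arithmetic, analogies, a red-team suite, or a benchmark task).
from typing import Callable, Dict, List, Tuple

def find_abrupt_jumps(
    checkpoints: List[Tuple[str, float]],          # (name, parameter count)
    evaluate_capability: Callable[[str], float],   # returns accuracy in [0, 1]
    jump_threshold: float = 0.3,                   # arbitrary illustrative value
) -> List[Dict]:
    """Flag consecutive scale steps where accuracy rises by more than
    jump_threshold, i.e. candidate emergent transitions."""
    scores = [(name, n, evaluate_capability(name)) for name, n in checkpoints]
    jumps = []
    for (prev_name, prev_n, prev_acc), (name, n, acc) in zip(scores, scores[1:]):
        if acc - prev_acc > jump_threshold:
            jumps.append({"from": prev_name, "to": name,
                          "params": (prev_n, n), "delta": acc - prev_acc})
    return jumps
```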

Conclusion

As we venture into the future, a rapidly growing number of players will delve into the realm of ever-larger models, fueled by the tantalizing prospects of deploying these AI powerhouses. This surge in development is bound to usher in a new era of emergent capabilities, further amplified by the fusion of Large Language Models, Agent-Based Modeling, and Cooperative AI systems.

We now find ourselves on the precipice of a unique and extraordinary challenge — the emergence of unforeseen risks stemming from the very emergent capabilities that make these colossal models so captivating. It is a paradox we must grapple with as we strive to harness the potential of AI while simultaneously navigating its intricate web of hazards.

This uncharted territory begs us to contemplate a multitude of compelling questions: How can we effectively measure and mitigate these emergent risks? What new methodologies and frameworks must be developed to ensure AI remains a force for good, rather than an unwieldy beast? How can we strike the delicate balance between innovation and safety as we push the boundaries of AI’s capabilities?

The answers to these questions will determine the trajectory of our technological evolution and our ability to responsibly wield the power of AI. It is a journey that requires our collective wisdom, foresight, and dedication — one that will shape the course of human history and redefine our relationship with the digital world. And so, we must rise to the challenge, spurred by the understanding that unraveling the mysteries of emergent capabilities and their accompanying risks is not merely an intellectual endeavor but an imperative for the sustainable advancement of society.


Ratiomachina

AI Philosopher. My day job is advising clients on safe responsible adoption of emerging technologies such as AI.