Stanford Smallville — LLMs and Group Behavior Simulations

Alexandre Robicquet
7 min read · Aug 12, 2023


Another perfect read for the weekend from a collaboration between Stanford and Google: “Generative Agents: Interactive Simulacra of Human Behavior.”

I couldn’t resist writing about this one. If you know me and my work from the past ten years, you know that AI and human behavior, all at once, is right up my alley.

How could I miss this…

What is the setup?

Here is a list of the key elements of the experimental setup for evaluating the generative agents:

  • Sandbox Environment: A sprite-based sandbox game environment called Smallville, with areas and objects such as houses, stores, and cafes.
  • Agents: 25 unique agents with textual descriptions, sprite avatars, ability to move and interact.
  • Agent Architecture: Implemented using ChatGPT, with memory stream, retrieval, planning, and reflection components.
  • Agent-World Interactions: Agents can move around, interact with objects, and communicate with other agents.
  • User Interactions: Users communicate with agents ONLY via natural language, by taking on personas; they can also modify object states. Appearances, in other words, are not part of the equation “yet”.
  • Simulation Configuration: Seeded agent memories and relationships, ran for 2 simulated days.
  • Evaluation Methods: Controlled interviews to assess individual behaviors, and open-ended simulation to observe emergent social dynamics.
  • Measurements: Information diffusion, relationship formation, and coordination for a party; assessed via interviews.
  • Conditions Compared: The full architecture versus ablations of the memory, planning, and reflection components, plus a human crowdworker condition.
  • Human Evaluators: 100 participants on Prolific evaluated agent believability from replays.
  • Analysis: TrueSkill ranking, statistical tests on rankings, inductive qualitative coding of agent responses.

This covers the critical elements of the sandbox world, generative agent capabilities, interaction mechanisms, simulation configuration, evaluation procedures, and analysis methods used to assess the agents’ behaviors.

The setup enabled rigorous testing of the architecture’s impact on believable agent behaviors.
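The paper ranks the conditions above with TrueSkill. As a dependency-free illustration of the same idea, here is a simplified Elo-style rating computed from per-evaluator rankings. The condition names mirror the paper’s ablations, but the rankings below are fabricated purely for the sketch, and Elo is a stand-in for TrueSkill, not the paper’s actual method.

```python
# Simplified Elo-style stand-in for the paper's TrueSkill analysis.
# Each evaluator ranking is decomposed into pairwise "wins" for the
# earlier-ranked condition. The example rankings are invented.

K = 32  # Elo update step

def expected(r_a, r_b):
    """Expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update_from_ranking(ratings, ranking):
    """Treat every ordered pair in a ranking as a win for the earlier item."""
    for i, winner in enumerate(ranking):
        for loser in ranking[i + 1:]:
            e = expected(ratings[winner], ratings[loser])
            ratings[winner] += K * (1 - e)
            ratings[loser] -= K * (1 - e)

conditions = ["full architecture", "no reflection",
              "no reflection, no planning", "no memory", "human crowdworker"]
ratings = {c: 1000.0 for c in conditions}

# Fabricated evaluator rankings, best condition first.
for ranking in [
    ["full architecture", "human crowdworker", "no reflection",
     "no reflection, no planning", "no memory"],
    ["full architecture", "no reflection", "human crowdworker",
     "no memory", "no reflection, no planning"],
]:
    update_from_ranking(ratings, ranking)

leaderboard = sorted(conditions, key=ratings.get, reverse=True)
print(leaderboard)
```

Because the pairwise updates are zero-sum, the ratings stay centered on 1000; a condition that wins most of its comparisons, like the full architecture in these made-up rankings, floats to the top.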

Ok, and what happened?

In the bustling metropolis of Smallville (okay, it’s a tiny town, but let’s not split hairs), 25 quirky AI agents went about their business over a couple of frenetic simulated days.

Relationships blossomed in the most unexpected of ways. Take Sam Moore, who became fast friends with Latoya Williams during a casual stroll in the park.

Later on, ever the gentleman, Sam recalled their chat and asked about Latoya’s budding photography escapades.

And oh, the town gossip! Like when Tom Moreno caught wind of Sam’s audacious plans to run for mayor while grabbing some kale chips at the local grocery store, and wasn’t shy about expressing how he felt about him.

Note from the paper: “We note that the conversational style of these agents can feel overly formal, likely a result of instruction tuning in the underlying models. We expect that the writing style will be better controllable in future language models.”

Some agents even played party planner: Maria Lopez, upon hearing about Isabella Rodriguez’s grand Valentine’s Day bash at the local cafe, promptly roped in her buddy Klaus Mueller to the shindig.

While just a snapshot, these shenanigans showcase the wild potential of generative agent tech to craft convincingly human-like AI tales, all from a sprinkle of initial input. Gotta love Smallville!

So… are we living in a simulation?

Midjourney — “/imagine a stark visual of a sleek, interconnected, and promising AI utopia on the left. To the right, showcase a contrasting scene of potential AI missteps: fragmented, shadowy nodes, and data breaches represented by cracks and fissures.” — Variations (Strong)

Ah, you got me: the provocative question. (No, it isn’t the real point of this article, but let’s get it out of the way now.)

The idea of full simulation and the study of simulated realities has been a topic of interest for researchers, philosophers, and scientists for a long time.

One notable work in the context of simulation theory is Nick Bostrom’s 2003 paper, “Are You Living in a Computer Simulation?”

This seminal paper introduced the idea that future civilizations might possess the computational prowess to run detailed simulations of their ancestors, presenting arguments for the likelihood of our current experience being a part of such a simulation.

Bostrom’s work mainly focused on the “philosophical” implications and the probability aspects of living in a simulated reality, grounding it in a trilemma of three possible scenarios.

This new paper on “Generative Agents: Interactive Simulacra of Human Behavior” takes a different approach.

Instead of focusing on simulations' philosophical or probabilistic aspects, it dives into the technical aspects, aiming to create AI agents that can convincingly mimic human behavior in interactive environments.

The key distinction is in the application: Bostrom’s work leans towards the larger scope of existence and reality, while the newer paper seeks practical applications, like creating more believable NPCs in video games or simulating human interactions for various purposes.

While Bostrom questioned the nature of our reality, this paper seeks to craft a more nuanced and interactive simulated reality using LLMs.

TLDR: Maybe we are in a simulation, but this paper is certainly not proof of it. Of course the agents don’t know they are in a simulation. They’re language models.

And this is what is really cool.

The Technology Behind a “Real” Life

The study aimed to craft generative agents that simulate human behavior in interactive environments. Influenced by long-term memories and experiences, these agents surpass traditional AI models that produce one-off responses. Key takeaways from the architecture include:

  • Memory Stream: Records experiences in a natural language format.
  • Retrieval Mechanism: Surfaces pertinent memories when required.
  • Planning/Reflection Modules: Synthesize memories into higher-level inferences that steer behavior.
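The three components above can be sketched in a few dozen lines. This is a toy illustration, not the paper’s code: names like `MemoryStream` are mine, the word-overlap relevance measure stands in for the embedding similarity the paper uses, and the weighting of the retrieval score is simplified.

```python
# Toy sketch of a generative agent's memory stream and retrieval scoring.
# The paper scores memories by recency + importance + relevance; here,
# relevance is approximated by word overlap instead of embeddings.
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str            # natural-language record of an observation
    importance: float    # 1-10 score (the paper asks the LLM to rate this)
    created: float = field(default_factory=time.time)

class MemoryStream:
    def __init__(self, decay: float = 0.995):
        self.memories: list[Memory] = []
        self.decay = decay  # exponential recency decay, as in the paper

    def record(self, text: str, importance: float) -> None:
        self.memories.append(Memory(text, importance))

    def retrieve(self, query: str, k: int = 3) -> list[Memory]:
        """Rank memories by recency + importance + relevance; return top k."""
        now = time.time()
        q = set(query.lower().split())
        def score(m: Memory) -> float:
            recency = self.decay ** (now - m.created)
            relevance = len(q & set(m.text.lower().split())) / max(len(q), 1)
            return recency + m.importance / 10 + relevance
        return sorted(self.memories, key=score, reverse=True)[:k]

stream = MemoryStream()
stream.record("Latoya Williams mentioned her photography project", importance=6)
stream.record("Bought kale chips at the grocery store", importance=2)
top = stream.retrieve("what is Latoya working on photography", k=1)
print(top[0].text)
```

In the real architecture, the retrieved memories are then fed back into the LLM prompt, and a reflection step periodically asks the model to distill recent memories into higher-level statements that are themselves stored in the stream.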

Interestingly, these agents were tested in a Sims-style sandbox world, where they established relationships, disseminated information, and coordinated activities based purely on initial user input.

Some reflective insights:

  • The seamless integration of a natural-language memory stream with retrieval and reflection mechanisms marks a significant evolution in AI interactivity.
  • Observing the complex social dynamics born out of AI’s advancements is genuinely fascinating. While the foundation of predicting collective behaviors, such as crowd dynamics, has existed through principles like fluid mechanics, introducing unique, self-adjusting behaviors brings a fresh perspective.
  • Beyond gaming, the vast potential applications and ethical considerations highlight a future filled with promise and complexity.

How does this help our world?

Well, this could actually lead to breakthroughs in numerous industries or applications.

In industries that rely heavily on team dynamics, like the corporate world, finance, or even filmmaking, these agents can help in predicting team interactions or audience responses.

What is most exciting is not just the ability of these agents to mimic behaviors but their potential to model the intricacies and emergent phenomena of group interactions.

By understanding the nuances of such interplays, we’re taking a monumental step towards creating tools and solutions that are tailored to the real world’s complexity.
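One of the simplest emergent group phenomena the paper actually measures is information diffusion, such as news of the Valentine’s Day party spreading agent to agent. Here is a toy, non-LLM simulation of that dynamic; the friendship graph, sharing probability, and round count are all invented for illustration.

```python
# Toy information-diffusion simulation: news of a party spreading through
# a small, randomly generated social network. All parameters are made up.
import random

random.seed(0)

AGENTS = 25            # matches Smallville's 25 agents
CHANCE_TO_SHARE = 0.5  # probability a conversation passes the news along

# Random "friendship" graph: each agent knows four others.
friends = {a: random.sample([b for b in range(AGENTS) if b != a], 4)
           for a in range(AGENTS)}

knows = {a: False for a in range(AGENTS)}
knows[0] = True  # agent 0 plans the party

for _ in range(10):  # ten rounds of conversations
    for a in range(AGENTS):
        if knows[a]:
            listener = random.choice(friends[a])
            if random.random() < CHANCE_TO_SHARE:
                knows[listener] = True

spread = sum(knows.values())
print(f"After 10 rounds, {spread}/{AGENTS} agents have heard about the party")
```

The generative-agent version replaces the coin flip with actual LLM-driven conversations, which is what makes the diffusion pattern emergent rather than hand-coded; a sketch like this is only useful as a baseline for what “spreading” means.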

Psychology and Behavioral Science: This advancement could revolutionize therapy and behavioral research. For example, simulated environments populated with these AI agents can recreate complex human interactions and social contexts. Researchers and therapists can then analyze or intervene in these simulated scenarios to gain insight into human psychology and behavior.

Security and Law Enforcement: Crime prediction and understanding escalation processes are intricate because they often involve multiple actors with varied motivations. Generative agents can simulate diverse scenarios, from crowd behaviors to potential reactions from individual lawbreakers or groups. This would enable more effective policing strategies and responsive crime prevention.

Urban Planning & Infrastructure: As cities grow, the dynamics between residents, transportation nodes, utilities, and public spaces become increasingly complex. Generative agents could simulate the day-to-day actions of a diverse populace, allowing for better design and planning of urban spaces.

Financial Markets: Markets are driven by the collective behaviors of individual traders, institutional investors, and regulatory bodies. Simulating these interactions can give insights into potential market reactions to different economic scenarios or policy changes.

Disaster Response: When emergencies strike, effective response requires coordination among various agencies, volunteers, and affected populations. Generative agents would help model and predict potential challenges, optimizing response strategies.

Ethical and Practical Concerns

However, with innovation comes responsibility. Creating AI agents that believe in their own existence within a simulated environment would present a slew of ethical dilemmas. That is not the case here, since we are talking about LLMs, but how would we address the potential emotional distress of an AI realizing its nature?

Moreover, ensuring these agents don’t reinforce stereotypes or harbor biases is challenging. Without careful consideration and refinement, these systems might perpetuate harmful behavior.

A Reality Check

Perhaps most provocatively, these agents had no inkling that they were within a simulation. This raises profound philosophical questions for us. If AI can be unaware of its true nature, what does it say about our understanding of reality? Might we, too, be participants in some advanced civilization’s grand simulation?

In the end, while “Generative Agents: Interactive Simulacra of Human Behavior” offers a fascinating leap in AI capabilities, it also invites us to introspect — about technology, ethics, and the very nature of existence.

So — Do Androids Dream?
