Simulating human-like social behavior with ChatGPT

A short review of “Generative Agents: Interactive Simulacra of Human Behavior”

Aray Karjauv
6 min read · Apr 13, 2023
Source: Park, Joon Sung, et al. “Generative Agents: Interactive Simulacra of Human Behavior.” (2023).

Researchers at Stanford and Google have created “generative agents” based on ChatGPT (gpt-3.5-turbo) that mimic human behavior in a Sims-inspired sandbox. The outcome of their work is a social interaction simulator in which agents interact with one another and with the environment, performing various tasks purely through natural language.

This experiment is incredibly exciting, and I thoroughly enjoyed reading the paper. We have witnessed impressive applications of Large Language Models (LLMs) before, but once again, I was amazed at what they can achieve.

To see for yourself, you can take a look at a pre-computed replay of the simulation that accompanies the paper.

The simulator is powered by ChatGPT, which serves as its engine. As a result, each agent is described in natural language, including their profession, interests, and relationships with other agents. For example:

John Lin is a pharmacy shopkeeper at the Willow Market and Pharmacy who loves to help people. He is always looking for ways to make the process of getting medication easier for his customers; John Lin is living with his wife, Mei Lin, who is a college professor, and son, Eddy Lin, who is a student studying music theory; John Lin loves his family very much; John Lin has known the old couple next-door, Sam Moore and Jennifer Moore, for a few years; John Lin thinks Sam Moore is a kind and nice man; John Lin knows his neighbor, Yuriko Yamamoto, well; John Lin knows of his neighbors, Tamara Taylor and Carmen Ortiz, but has not met them before; John Lin and Tom Moreno are colleagues at The Willows Market and Pharmacy; John Lin and Tom Moreno are friends and like to discuss local politics together; John Lin knows the Moreno family somewhat well — the husband Tom Moreno and the wife Jane Moreno.

To ensure that the actions of the agents are consistent over time, the authors extended ChatGPT with three external components. These components enable agents to establish a daily routine, respond to new events, and modify their plans if needed.

Memory stream

The first component, called the memory stream, comprises a long-term memory module and an information retrieval system. The long-term memory module serves as a database that stores natural-language records of the agent’s past experiences and of the state of the environment, along with a timestamp for each event. In addition, the authors employ a language model to extract an embedding for each record, which is saved for later retrieval.
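To make this concrete, here is a minimal sketch of what such a record might look like. The field names are my own illustration rather than the paper’s code; the importance score is explained just below.

```python
from dataclasses import dataclass

@dataclass
class MemoryRecord:
    description: str        # natural-language observation, e.g. "John Lin is brewing coffee"
    created_at: float       # sandbox game time when the event was perceived
    last_accessed: float    # sandbox game time of the most recent retrieval
    importance: float       # LLM-assigned score (see the metrics below)
    embedding: list[float]  # embedding of `description`, stored for retrieval
```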

As the memory stream may contain observations that are not relevant to the current situation of the agent, the authors propose three metrics to select the most relevant records.

  • Recency scores recent memory objects higher, so that events from a moment ago are more likely to remain in the agent’s attention. It is simply an exponential decay with a factor of 0.99, applied to each record per sandbox game hour.
  • Importance: The authors prompt ChatGPT to assign an importance score between 1 and 10 to each record in long-term memory. For example, it may give a 2 for “cleaning up the room” and an 8 for “asking your crush out on a date.”
  • Relevance scores memories related to the agent’s current situation. The authors embed the agent’s current state and compute the cosine similarity between this query embedding and each memory’s embedding.

The scores are then normalized and combined, and the highest-ranked memories are included in the prompt that determines the agent’s actions at each step of the action loop.
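A minimal sketch of how retrieval might combine the three scores, assuming the `MemoryRecord` above and a pre-computed query embedding; the min-max normalization and equal weighting here are my reading of the idea, and the paper’s exact implementation may differ:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def normalize(xs: list[float]) -> list[float]:
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

def retrieve(memories, query_embedding, now_hours, k=5):
    # Exponential decay per sandbox game hour since the record was last accessed.
    recency = [0.99 ** (now_hours - m.last_accessed) for m in memories]
    importance = [m.importance for m in memories]
    relevance = [cosine(query_embedding, m.embedding) for m in memories]
    # Normalize each metric to [0, 1] and sum with equal weight.
    scores = [r + i + v for r, i, v in
              zip(normalize(recency), normalize(importance), normalize(relevance))]
    top = sorted(range(len(memories)), key=lambda i: scores[i], reverse=True)[:k]
    return [memories[i] for i in top]
```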

Reflection

Reflection is the second type of memory, which captures the agent’s abstract thoughts. I find this component particularly fascinating because it mimics the way the human brain operates.

Similar to how we consolidate important information from the day into long-term memory while we sleep, ChatGPT is asked to distill the most important events each agent has recently collected. Specifically, this happens several times per game day, whenever the sum of the importance scores of the latest events perceived by an agent exceeds a certain threshold.

When that threshold is crossed, the authors fetch the last 100 records from the memory stream and prompt ChatGPT with “Given only the information above, what are 3 most salient high-level questions we can answer about the subjects in the statements?”. The generated questions are then used as queries to retrieve relevant memories. ChatGPT is then asked to extract insights from these memories, citing particular records as evidence; the insights are stored as reflections in the memory stream, including pointers to the cited memory objects.
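A sketch of this reflection step under the assumptions above; `llm()` and `embed()` stand in for calls to the chat and embedding models, and the threshold value, insight prompt, and agent helper methods are illustrative, not the paper’s exact code:

```python
REFLECTION_THRESHOLD = 150  # illustrative value

def maybe_reflect(agent, now_hours):
    # Trigger only when recently perceived events are collectively important enough.
    recent = agent.events_since_last_reflection()
    if sum(m.importance for m in recent) < REFLECTION_THRESHOLD:
        return
    latest = agent.memory_stream[-100:]
    statements = "\n".join(m.description for m in latest)
    questions = llm(
        statements + "\nGiven only the information above, what are 3 most salient "
        "high-level questions we can answer about the subjects in the statements?"
    )
    for question in questions.splitlines():
        # Use each generated question as a retrieval query over the memory stream.
        evidence = retrieve(agent.memory_stream, embed(question), now_hours)
        insight = llm(
            "\n".join(m.description for m in evidence)
            + "\nWhat high-level insight can you infer? Cite the statements you used."
        )
        # Reflections re-enter the memory stream with pointers to their evidence.
        agent.add_reflection(insight, cites=evidence)
```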

Planning

The third component is planning, which translates records from the memory stream and the current state of the environment into a high-level plan of action. First, ChatGPT is asked to create a rough plan for the day, using the agent’s previous experience and current state as prompt context; it is then recursively asked to add detail, decomposing the plan into finer-grained actions for more realistic behavior.

Each plan entry consists of a location, a starting time, and a duration (e.g., “for 180 minutes from 9am, February 12th, 2023, at Oak Hill College Dorm: Klaus Mueller’s room: desk, read and take notes for research paper”). These plans are also recorded in long-term memory, which, along with observations and reflections, allows the agent to decide how to behave.
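A sketch of this top-down planning pass, again with hypothetical `llm()` and parsing helpers; the prompts paraphrase the idea rather than quote the paper:

```python
from dataclasses import dataclass

@dataclass
class PlanEntry:
    location: str  # e.g. "Oak Hill College Dorm: Klaus Mueller's room: desk"
    start: str     # e.g. "9am, February 12th, 2023"
    minutes: int   # e.g. 180
    action: str    # e.g. "read and take notes for research paper"

def plan_day(agent):
    # Pass 1: a rough outline of the day from the agent's summary description.
    outline = llm(agent.summary + "\nOutline today's plan in broad strokes.")
    entries = []
    # Pass 2: recursively decompose each chunk into short, concrete actions.
    for chunk in outline.splitlines():
        detailed = llm("Decompose into 5-15 minute actions: " + chunk)
        entries.extend(parse_plan_entries(detailed))  # hypothetical parser -> PlanEntry
    agent.store_plans(entries)  # plans also live in long-term memory
    return entries
```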

Source: Park, Joon Sung, et al. (2023)

At each step in the action loop, agents receive information from the environment, which is stored in their memory stream. Next, the authors generate a prompt with the current observation as context and ask ChatGPT to decide what the agent should do next. For example, an agent can modify its plan in response to an unexpected observation, such as finding the bathroom occupied when it had intended to use it.
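Putting the pieces together, one step of the action loop might look like this; `world.observe()`, `should_react()`, and the agent methods are hypothetical stand-ins for the paper’s perception, reaction, and re-planning machinery:

```python
def step(agent, world, now_hours):
    # Perceive: new events enter the memory stream.
    for event in world.observe(agent):
        agent.remember(event, now_hours)
        # Retrieve relevant context and ask whether the agent should react,
        # e.g. the bathroom it planned to use turns out to be occupied.
        context = retrieve(agent.memory_stream, embed(event.description), now_hours)
        decision = llm(
            "\n".join(m.description for m in context)
            + "\nObservation: " + event.description
            + "\nShould the agent react, and if so, how?"
        )
        if should_react(decision):
            agent.replan_from(now_hours, decision)
    # Otherwise, keep executing the current plan entry.
    return agent.current_action(now_hours)
```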

Environment

Source: Park, Joon Sung, et al. (2023)

Smallville is a sandbox world with different areas and objects, inspired by The Sims, in which every element carries a text label. The agents move around this world, just like in a video game, and interact with it by taking actions. Through the memory stream, they remember the parts of the world they have seen, in the state they last saw them. The agents also interact with each other and can hold conversations in natural language. At each time step, they output a statement describing their current action, which is then translated into movements that affect the virtual world.
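The paper represents the world as a tree of labeled areas and objects, which each agent mirrors with its own partial, possibly stale copy. A toy version of the idea (all names illustrative):

```python
# A fragment of the world as nested, text-labeled nodes; leaves carry object state.
world = {
    "Willow Market and Pharmacy": {
        "store counter": {"cash register": "idle"},
        "pharmacy": {"medicine cabinet": "closed"},
    },
    "Lin family's house": {
        "kitchen": {"coffee machine": "brewing coffee", "stove": "off"},
    },
}

def describe(tree, indent=0):
    """Render a (sub)tree into the natural-language lines fed to the prompt."""
    lines = []
    for name, sub in tree.items():
        if isinstance(sub, dict):
            lines.append(" " * indent + name)
            lines.extend(describe(sub, indent + 2))
        else:
            lines.append(" " * indent + f"{name} is {sub}")
    return lines

print("\n".join(describe(world)))
```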

Source: Park, Joon Sung, et al. (2023)

The proposed architecture allows the agents to simulate believable human behavior, and more. The agents respond to new situations and adjust their plans accordingly. They even managed to organize a Valentine’s Day party: spreading invitations over two days, making new friends, asking each other to attend, and coordinating to arrive at the event together at the appropriate time. All of this was accomplished without explicit instruction from the researchers.

The ability of the agents to interact and adapt to new situations in a believable way is impressive. The Valentine’s Day party scenario is a great example of the potential of this technology beyond what was initially intended. As this technology continues to evolve and advance, it will be exciting to see how it can be further developed and integrated into various industries and fields. The possibilities are endless, and we can expect to see more impressive applications of LLMs like ChatGPT in the future.
