Ideas on improving state-of-the-art AI-Agent workforce architecture

Davide Wiest
The Modern Scientist
5 min read · Jan 27, 2024

AI-Agents have become increasingly interesting and competent in recent months. Take ChatDev, for example: Its agent workforce can create programs like Flappy Bird or an MsPaint clone, including project documentation.

However, there is room for improvement. I’ll walk you through 8 insights from brainstorming a different architecture. By the end of this article, you’ll understand this diagram in detail:

Agent-Based Architecture Diagram

The following ideas are only conceptual. An implementation can, and probably should, look different.

Provide direct feedback with a decoupled test-application layer

Testing work directly reduces problems later on. Direct feedback makes the problem-solution relationship clearer, which makes extracting valuable insights easier. This enables the system as a whole to learn faster. Testing can be:

  • Running tests on a piece of code
  • Reviewing the output with a second agent
  • Sending the output to an API, which does the testing
  • Awaiting approval from a human
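
The testing options above could sit behind one shared interface, so the agent producing the work does not need to know how it is checked. Here is a minimal sketch under that assumption. All names (`Feedback`, `run_checks`, `syntax_check`, `length_review`) are hypothetical and only stand in for real test runners, reviewer agents, APIs, or human approval.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Feedback:
    source: str    # which checker produced this feedback
    passed: bool
    detail: str = ""


# A checker is any callable that takes the work output and returns Feedback.
Checker = Callable[[str], Feedback]


def syntax_check(output: str) -> Feedback:
    """Stand-in for 'running tests on a piece of code': does it compile?"""
    try:
        compile(output, "<agent-output>", "exec")
        return Feedback("syntax_check", True)
    except SyntaxError as err:
        return Feedback("syntax_check", False, str(err))


def length_review(output: str) -> Feedback:
    """Stand-in for a second reviewing agent (hypothetical rule)."""
    ok = len(output.strip()) > 0
    return Feedback("length_review", ok, "" if ok else "empty output")


def run_checks(output: str, checkers: List[Checker]) -> List[Feedback]:
    """Run every checker and collect the feedback in one place."""
    return [check(output) for check in checkers]


results = run_checks("print('hello')", [syntax_check, length_review])
failed = run_checks("def broken(:", [syntax_check])
```

Because every checker has the same signature, swapping a unit-test run for a human approval step would not change the code that produces the output.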

Separation of concerns: Agent as a Manager-Worker pair

Sub-Agent Part of Architecture Diagram

Every agent consists of a manager and a worker. The manager’s role is to analyze the task or problem and determine the best approach to solve it. The worker can fully focus on creating the solution. This reduces the quantity of information each subagent is exposed to, increasing effectiveness within their specific role.

Chat-Interface Interaction Diagram

What would this look like in a chat interface? The worker subagent communicates with the manager. The manager, in turn, responds to the feedback it receives on the worker’s output. The feedback could also include part or all of the worker’s output, which would be useful for debugging, but in most cases this weakens the decoupling.

While this somewhat already exists, it requires two agents working together. One on its own is less autonomous. We’ll see later why this is important.
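
One way to picture the manager-worker pair is as a single object that loops between the two roles. The sketch below assumes each subagent is just a function from prompt to text (`Model` is a hypothetical stand-in for an LLM call), and that the test layer is reduced to a boolean `check`. The toy functions exist only to make the loop runnable.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
Model = Callable[[str], str]


@dataclass
class Agent:
    manager: Model                # analyzes the task and decides on an approach
    worker: Model                 # only sees the instruction, produces the solution
    check: Callable[[str], bool]  # feedback from the test layer
    max_rounds: int = 3
    history: List[str] = field(default_factory=list)

    def solve(self, task: str) -> str:
        feedback = "none yet"
        output = ""
        for _ in range(self.max_rounds):
            # The manager sees the task and the feedback, not the worker's output.
            instruction = self.manager(f"task: {task}\nfeedback: {feedback}")
            output = self.worker(instruction)
            self.history.append(instruction)
            if self.check(output):
                break
            feedback = "last attempt failed the check"
        return output


def toy_manager(prompt: str) -> str:
    return "answer in upper case" if "failed" in prompt else "answer"


def toy_worker(instruction: str) -> str:
    return "DONE" if "upper" in instruction else "done"


agent = Agent(toy_manager, toy_worker, check=str.isupper)
result = agent.solve("say done")
```

The first attempt fails the check, the manager adjusts its instruction, and the second attempt passes, so `agent.history` ends up holding two instructions.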

Public workforce knowledge system (“Workforce Database”)

Workforce Database Part of Diagram

This is the place where tasks, objectives, data from outside, and helpful knowledge live. It’s also the primary means of communication between agents. Each entry is labeled with metadata: author, type of information, possibly recipients, and, for tasks, importance, urgency, etc. Additionally, each entry should be structured and atomic. This can be done by defining templates.
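
To make the idea of templated, metadata-labeled entries concrete, here is a small sketch. The field names and the two templates are assumptions chosen for illustration; the article does not prescribe a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical templates: the fields an entry of each type must carry.
TEMPLATES: Dict[str, List[str]] = {
    "task": ["title", "importance", "urgency"],
    "knowledge": ["title", "body"],
}


@dataclass
class Entry:
    author: str
    kind: str                      # type of information, e.g. "task"
    content: Dict[str, object]
    recipients: List[str] = field(default_factory=list)


def validate(entry: Entry) -> List[str]:
    """Return the template fields missing from the entry (empty list = valid)."""
    required = TEMPLATES.get(entry.kind)
    if required is None:
        return [f"unknown kind: {entry.kind}"]
    return [name for name in required if name not in entry.content]


task = Entry("cto", "task", {"title": "Add login", "importance": 3, "urgency": 2})
note = Entry("dev-1", "knowledge", {"title": "Use UTC timestamps"})
```

Here `validate(task)` returns an empty list, while `validate(note)` reports the missing `body` field, so incomplete entries can be rejected before they reach other agents.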

Every agent’s chat-history will be evaluated, and info that’s relevant to one agent is copied into its memory. This has two benefits: Knowledge is permanent and global. If the information management is optimal, no mistake is made twice.

Side note — Tasks need special treatment: Most tasks depend on other tasks. Each task should only be completed once. An example of a “workforce database” is Notion.
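
The two task constraints in the side note (dependencies and "completed only once") can be sketched as a small task board. This is an in-memory illustration with hypothetical names. A real shared database would need an atomic claim operation, which this single-process sketch does not provide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Task:
    name: str
    depends_on: Set[str] = field(default_factory=set)
    status: str = "open"          # "open" -> "claimed" -> "done"
    claimed_by: Optional[str] = None


class TaskBoard:
    def __init__(self, tasks: List[Task]) -> None:
        self.tasks: Dict[str, Task] = {t.name: t for t in tasks}

    def ready(self) -> List[str]:
        """Open tasks whose dependencies are all done."""
        done = {n for n, t in self.tasks.items() if t.status == "done"}
        return [n for n, t in self.tasks.items()
                if t.status == "open" and t.depends_on <= done]

    def claim(self, name: str, agent: str) -> bool:
        """Only the first agent to claim a ready task gets it."""
        if name not in self.ready():
            return False
        task = self.tasks[name]
        task.status, task.claimed_by = "claimed", agent
        return True

    def complete(self, name: str, agent: str) -> bool:
        """Only the agent holding the claim can mark the task as done."""
        task = self.tasks[name]
        if task.status != "claimed" or task.claimed_by != agent:
            return False
        task.status = "done"
        return True


board = TaskBoard([Task("design"), Task("build", {"design"})])
```

Initially only `design` is ready. Once one agent has claimed and completed it, `build` becomes ready, and a second claim on `design` is refused.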

Introducing agent-specific memory and separating it from the workforce knowledge system

Agent-Memory Part of Diagram

Agent-specific memory contains info related to the agent’s specific task and objectives. It can help the manager subagent understand the problem and possible solutions.

It consists mostly of short-lived and very specific information, which is why it’s better not to include it in the workforce database by default. Entries come from either the agent’s output or the workforce database.

After a task has been completed, the chat history can be passed to a “review” agent that extracts potentially reusable pieces of information and puts them into the workforce database:

Review-Process Diagram

Information Managers sorting knowledge and tasks by relevance

Information managers (IMs) are subagents too. They aren’t necessarily large language models.

The simplest approach would be to make a query based on an entry’s metadata. This will be sufficient for tasks, as their importance and urgency are mandatory metadata.
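
A metadata query of this kind needs no language model at all. The sketch below assumes entries are plain dictionaries with hypothetical keys (`type`, `importance`, `urgency`, `recipients`) and that higher numbers mean more important or more urgent.

```python
from typing import Any, Dict, List

Entry = Dict[str, Any]


def visible_to(entry: Entry, role: str) -> bool:
    """Entries without explicit recipients are visible to every role."""
    recipients = entry.get("recipients")
    return not recipients or role in recipients


def next_tasks(entries: List[Entry], role: str, limit: int = 2) -> List[Entry]:
    """Filter by metadata, then rank by importance and urgency (highest first)."""
    tasks = [e for e in entries if e["type"] == "task" and visible_to(e, role)]
    tasks.sort(key=lambda e: (e["importance"], e["urgency"]), reverse=True)
    return tasks[:limit]


entries = [
    {"type": "task", "title": "Fix crash", "importance": 3, "urgency": 3},
    {"type": "task", "title": "Write docs", "importance": 1, "urgency": 2,
     "recipients": ["writer"]},
    {"type": "task", "title": "Refactor", "importance": 2, "urgency": 1},
    {"type": "knowledge", "title": "Style guide"},
]
titles = [e["title"] for e in next_tasks(entries, "developer")]
```

For the `developer` role this yields "Fix crash" followed by "Refactor". The docs task is addressed to another role and the knowledge entry is not a task.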

The same approach can be used when picking entries to move from the workforce database to the agent’s memory. All agents with the same role can use the same algorithm.

The next option is to filter by embedding similarity, as RAG applications do. This will be sufficient for knowledge inside the agent’s memory.
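
Filtering by embedding similarity usually means ranking stored vectors by cosine similarity to a query vector. The sketch below uses hand-written three-dimensional toy vectors as an assumption. A real system would obtain the vectors from an embedding model, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float], memory: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the k memory keys whose vectors are most similar to the query."""
    ranked = sorted(memory, key=lambda key: cosine(query, memory[key]), reverse=True)
    return ranked[:k]


# Toy vectors only; these are not real embeddings.
memory = {
    "use pytest for unit tests": [0.9, 0.1, 0.0],
    "deploy on fridays is risky": [0.0, 0.2, 0.9],
    "mock network calls in tests": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
best = top_k(query, memory)
```

With these vectors, the two testing-related entries rank above the deployment note.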

Side note: How often an IM uses a piece of information determines if it can be discarded or not.
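
The usage-based discard rule could be as simple as a counter per entry. This is a sketch with an assumed threshold (`min_uses`); the article does not say how often is often enough.

```python
from typing import Dict, List


class MemoryUsage:
    """Tracks how often each memory entry is used (hypothetical helper)."""

    def __init__(self, min_uses: int = 1) -> None:
        self.min_uses = min_uses
        self.uses: Dict[str, int] = {}

    def register(self, key: str) -> None:
        self.uses.setdefault(key, 0)

    def record_use(self, key: str) -> None:
        self.uses[key] = self.uses.get(key, 0) + 1

    def discardable(self) -> List[str]:
        """Entries used fewer than min_uses times are candidates for removal."""
        return [key for key, count in self.uses.items() if count < self.min_uses]


usage = MemoryUsage(min_uses=1)
usage.register("api rate limit is 60/min")
usage.register("old meeting notes")
usage.record_use("api rate limit is 60/min")
```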

Independent agents make workflows asynchronous — Cooperation passively based on shared knowledge

By moving the coordination to the workforce database, and by grouping each manager with a worker, each agent is autonomous. This allows them to operate asynchronously and carry out their tasks without waiting for input or coordination from other agents, while still receiving it as early as possible.

As an example, a “CTO”-agent can inform all “Project manager”-agents about an important insight. This will be sent from the workforce database into their memories and can be used right away. A mechanism for preventing higher-level agents with less workload from drifting off and planning into the far future should be implemented.
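
The CTO example can be sketched with `asyncio`: agents keep working and pick up new insights whenever they appear, without blocking on each other. The in-memory `WorkforceDB` class, the role prefix convention, and the agent names are all assumptions for illustration.

```python
import asyncio
from typing import Dict, List


class WorkforceDB:
    """In-memory stand-in for the shared workforce database."""

    def __init__(self) -> None:
        self.inboxes: Dict[str, asyncio.Queue] = {}

    def register(self, agent: str) -> asyncio.Queue:
        self.inboxes[agent] = asyncio.Queue()
        return self.inboxes[agent]

    def broadcast(self, role_prefix: str, insight: str) -> None:
        """Copy an insight into the memory inbox of every agent with that role."""
        for name, inbox in self.inboxes.items():
            if name.startswith(role_prefix):
                inbox.put_nowait(insight)


async def project_manager(name: str, inbox: asyncio.Queue, log: List[str]) -> None:
    for step in range(3):
        # Pick up insights that have arrived; never block waiting for them.
        while not inbox.empty():
            log.append(f"{name} received: {inbox.get_nowait()}")
        log.append(f"{name} works on step {step}")
        await asyncio.sleep(0)  # yield so other agents can run


async def cto(db: WorkforceDB) -> None:
    await asyncio.sleep(0)
    db.broadcast("pm-", "prefer small releases")


async def main() -> List[str]:
    db = WorkforceDB()
    log: List[str] = []
    inboxes = {name: db.register(name) for name in ("pm-1", "pm-2")}
    await asyncio.gather(
        *(project_manager(n, q, log) for n, q in inboxes.items()),
        cto(db),
    )
    return log


log = asyncio.run(main())
```

Neither project manager waits for the CTO. Both complete all three steps and read the insight as soon as it is in their inbox.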

Everything that may need to change is changeable

This includes fundamental variables in the system that are traditionally 1) set by humans and 2) fixed at the start of the process. If the workforce could not adapt to real-world feedback, it would need to be renewed manually much sooner.

The architecture should prevent the agents from drifting off, but not by prohibiting change. Instead, changes are reviewed and checked for alignment with the most fundamental objective. This can be accomplished with dedicated agents too.

Workforce System Change Diagram

To make this possible, everything that may change is stored in some form of accessible database. Examples of things that change:

  • Templates of workforce database entries
  • Allocation of roles (How many agents do what)
  • Less fundamental objectives
  • Strategy
  • Specific technologies used

Things that don’t change:

  • The most fundamental objective
  • Which agents can alter the system
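
The split between changeable and fixed settings, combined with a review step, could look roughly like this. The key names, the `ChangeRequest` shape, and the toy reviewer are assumptions. In the architecture described here, the reviewer would be a dedicated agent.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Settings that must never change (key names are hypothetical).
IMMUTABLE = {"fundamental_objective", "system_editors"}


@dataclass
class ChangeRequest:
    author: str
    key: str
    new_value: object
    rationale: str


def apply_change(
    config: Dict[str, object],
    request: ChangeRequest,
    reviewer: Callable[[ChangeRequest, object], bool],
) -> bool:
    """Apply a change only if the key is mutable and the reviewer approves it."""
    if request.key in IMMUTABLE:
        return False
    if not reviewer(request, config["fundamental_objective"]):
        return False
    config[request.key] = request.new_value
    return True


def toy_reviewer(request: ChangeRequest, objective: object) -> bool:
    # Stand-in for a review agent: the rationale must mention the objective.
    return str(objective) in request.rationale


config: Dict[str, object] = {
    "fundamental_objective": "ship a working game",
    "system_editors": ["cto"],
    "strategy": "waterfall",
}
ok = apply_change(
    config,
    ChangeRequest("cto", "strategy", "iterative",
                  "iterating helps ship a working game sooner"),
    toy_reviewer,
)
blocked = apply_change(
    config,
    ChangeRequest("dev-1", "fundamental_objective", "write a novel", "more fun"),
    toy_reviewer,
)
```

The strategy change goes through because it is justified against the objective, while the attempt to replace the objective itself is rejected before any review happens.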

Permission System

Considering the previous point, a permission system is definitely required. Permissions should be bound to roles. Examples of permissions:

  • Select specific metadata options of an entry in the workforce database
  • Change templates
  • Change an ordinary agent’s role
  • Edit important workforce entries
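
Binding permissions to roles can be sketched as a lookup table plus a guard that runs before any sensitive operation. The role names and permission strings below are hypothetical and loosely mirror the examples in the list.

```python
from typing import Dict, List, Set

# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "worker": {"select_metadata"},
    "project_manager": {"select_metadata", "edit_important_entries"},
    "cto": {"select_metadata", "edit_important_entries",
            "change_templates", "change_agent_role"},
}


class PermissionDenied(Exception):
    pass


def require(role: str, permission: str) -> None:
    """Raise PermissionDenied unless the role holds the permission."""
    if permission not in ROLE_PERMISSIONS.get(role, set()):
        raise PermissionDenied(f"{role} may not {permission}")


def change_template(role: str, templates: Dict[str, List[str]],
                    kind: str, fields: List[str]) -> None:
    """Replace a workforce database template, if the role is allowed to."""
    require(role, "change_templates")
    templates[kind] = fields


templates: Dict[str, List[str]] = {}
change_template("cto", templates, "task", ["title", "importance", "urgency"])
```

Calling `change_template("worker", ...)` would raise `PermissionDenied`, and unknown roles get no permissions at all.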

More room for improvement: Some ideas

Here are some points this architecture does not handle. Most of them boil down to the nature of large language models (LLMs).

  • LLMs favor action over inaction
  • LLMs favor acceptance over denial
  • LLMs are not skeptical enough for quality critical thinking
  • Interaction with the real world is still very limited
