Is tool-calling all you need? Interaction patterns in multi-agent systems. Part III: Existing Agent Orchestration methods dissimilar to agents-as-tools.
The previous part of this series went over some agent orchestration methods that appear different to agents-as-tools, but are in fact equivalent. Today we’ll discuss two other approaches that are quite different.
The first one is that of LangGraph. The idea of LangGraph is quite simple and elegant: the user defines a directed graph in which each node is a possible state of the system, a classic state machine. The full state of a running graph consists of two things: first, the knowledge of which (exactly one) node is currently active, and second, an arbitrary Python data object that the nodes can modify as it passes through them. Each node contains the logic (implemented as a LangChain Runnable) that determines which of the outgoing edges of the current node will be taken; the node at the other end of that edge then becomes the active node.
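To make the state-machine idea concrete, here is a minimal sketch in plain Python. It deliberately does not use the LangGraph API: `run_graph`, the node functions, and the `edges` dictionary are all hypothetical names invented for illustration. Each node is a function that modifies the shared state object and returns the name of the next node (or None to stop), and exactly one node is active at any moment.

```python
from typing import Any, Callable, Dict, List, Optional, Set, Tuple

State = Dict[str, Any]
# A node mutates the shared state and returns the name of the next node,
# or None to stop the run. (Hypothetical convention, not LangGraph's.)
Node = Callable[[State], Optional[str]]


def run_graph(
    nodes: Dict[str, Node],
    edges: Dict[str, Set[str]],
    start: str,
    state: State,
    max_steps: int = 100,
) -> Tuple[State, List[str]]:
    """Run the state machine and return the final state plus the visited nodes."""
    active = start  # exactly one node is active at a time
    trace: List[str] = []
    for _ in range(max_steps):
        trace.append(active)
        next_node = nodes[active](state)
        if next_node is None:
            return state, trace
        # A node may only follow one of its declared outgoing edges.
        if next_node not in edges.get(active, set()):
            raise ValueError(f"{active!r} has no edge to {next_node!r}")
        active = next_node
    raise RuntimeError("max_steps exceeded; the graph may loop forever")


def draft(state: State) -> Optional[str]:
    state["attempts"] += 1
    state["text"] = f"draft v{state['attempts']}"
    return "review"


def review(state: State) -> Optional[str]:
    # Send the work back to "draft" until there have been two attempts.
    return "publish" if state["attempts"] >= 2 else "draft"


def publish(state: State) -> Optional[str]:
    state["published"] = True
    return None


nodes = {"draft": draft, "review": review, "publish": publish}
edges = {"draft": {"review"}, "review": {"draft", "publish"}}

final_state, trace = run_graph(nodes, edges, "draft", {"attempts": 0})
print(trace)
```

Note that the graph contains a cycle (review can send the work back to draft), which is exactly the kind of structure that a plain call stack does not express naturally.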
LangGraph currently positions itself as an advanced way to create the state machine of an individual agent, but since the graph nodes can just as easily be agents themselves, it can also be used for multi-agent orchestration. Graph nodes can also be LangGraph graphs in their own right, so arbitrarily complex constructions can be achieved. In particular, the multi-layer agents-as-tools pattern that we discussed in earlier posts of this series can be represented as such a graph, whereas the converse is not true: one can come up with sufficiently complex graphs that can’t be represented using the tool-calling pattern.
However, in a way the power of LangGraph is also a weakness. In our opinion, a perfect framework not only makes arbitrarily complex things possible, but also makes common tasks really easy. That is why, for example, motleycrew has added the output handler argument to the common AgentExecutor implementations, to make it easy to impose hard constraints on an agent’s output by supplying just one special tool to it. The same logic could of course be implemented using LangGraph, but that would require creativity and effort that could be better spent elsewhere.
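The general idea behind an output handler can be sketched in a few lines of plain Python. This is not motleycrew’s actual API: `InvalidOutput`, `require_sources`, `run_with_output_handler`, and `toy_agent` are hypothetical names, and the "agent" is a stand-in function rather than an LLM. The point is only that a single validating callable can reject a candidate answer and feed the reason back to the agent until the constraint is met.

```python
from typing import Callable, Optional


class InvalidOutput(Exception):
    """Raised by an output handler to reject a candidate final answer."""


def require_sources(answer: str) -> str:
    # Hard constraint (illustrative): the answer must contain a URL.
    if "http" not in answer:
        raise InvalidOutput("Answer must cite at least one URL.")
    return answer.strip()


def run_with_output_handler(
    propose: Callable[[Optional[str]], str],
    output_handler: Callable[[str], str],
    max_attempts: int = 3,
) -> str:
    """Ask the agent for answers until the output handler accepts one."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = propose(feedback)
        try:
            return output_handler(candidate)
        except InvalidOutput as exc:
            # The rejection reason is passed back to the agent for the next try.
            feedback = str(exc)
    raise RuntimeError("No acceptable output within max_attempts")


def toy_agent(feedback: Optional[str]) -> str:
    # Stand-in for an LLM agent: it only adds a source after being told to.
    if feedback is None:
        return "Graphs can model agent workflows."
    return "Graphs can model agent workflows (see https://example.com/docs). "


result = run_with_output_handler(toy_agent, require_sources)
print(result)
```

Expressing the same retry loop as an explicit graph is possible, but it would need at least a node for the agent, a node for the check, and a conditional edge back, which is more ceremony than the common case deserves.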
Another limitation of the state-machine formalism is that only one node is active at a time, which can limit parallelization. LangGraph can still be useful for cases where non-trivial state machines are really needed (and since all motleycrew agents and tools implement the LangChain Runnable interface, all of them can be used with LangGraph), but we wouldn’t recommend it as the default orchestration method.
The other really interesting orchestration method is that of MetaGPT. It implements what it calls a Standard Operating Procedure, which determines how some agents’ outputs relate to other agents’ inputs. This can appear superficially similar to the Sequential process in CrewAI, which we discussed in the previous piece of this series, but it is in fact substantially different and more powerful.
In the Sequential process, the information flows strictly down the dependency graph between the tasks (by the way, MetaGPT doesn’t seem to distinguish explicitly between tasks and agents nominated to do them), not dissimilar to how a call stack works. On the other hand, in MetaGPT the information can flow both ways, so for example a design document can be modified if problems arise with implementing it further down the line.
However, as far as we could tell, that Standard Operating Procedure was not formalized as an explicit graph, but was rather implicit in the relationships between the different agents.
Two other useful orchestration features of MetaGPT are a group chat and pub-sub functionality. Often, it’s not enough to exchange information merely through the call stack (such as an agent passing a request to a tool and getting a response back). It can also be useful to get information about other parts of the overall system (for example, in motleycrew’s multistep research agent, it’s good to know all the questions asked earlier, in order to best determine what question to ask next). For that purpose, MetaGPT defines a group chat that all agents can post to, and agents can either consume all of it or subscribe to receive only certain kinds of messages from certain senders (for example, agents writing code can subscribe to changes in design documents).
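The group chat with subscriptions described above can be sketched as a small publish-subscribe store. The following is an illustration in plain Python under our own assumptions, not MetaGPT’s actual classes: `Message`, `Subscription`, and `GroupChat` are hypothetical names, and filtering by message kind and sender is one simple way to model "certain kinds of messages from certain senders".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Message:
    sender: str
    kind: str
    content: str


@dataclass
class Subscription:
    callback: Callable[[Message], None]
    kinds: Optional[Set[str]] = None    # None means "any kind"
    senders: Optional[Set[str]] = None  # None means "any sender"

    def matches(self, message: Message) -> bool:
        kind_ok = self.kinds is None or message.kind in self.kinds
        sender_ok = self.senders is None or message.sender in self.senders
        return kind_ok and sender_ok


class GroupChat:
    """A shared message log that every agent can post to and read from."""

    def __init__(self) -> None:
        self.history: List[Message] = []
        self._subscriptions: List[Subscription] = []

    def subscribe(
        self,
        callback: Callable[[Message], None],
        kinds: Optional[Set[str]] = None,
        senders: Optional[Set[str]] = None,
    ) -> None:
        self._subscriptions.append(Subscription(callback, kinds, senders))

    def post(self, message: Message) -> None:
        # Every message is stored; only matching subscribers are notified.
        self.history.append(message)
        for subscription in self._subscriptions:
            if subscription.matches(message):
                subscription.callback(message)


chat = GroupChat()
coder_inbox: List[Message] = []
# The coding agent only cares about design documents from the architect.
chat.subscribe(coder_inbox.append, kinds={"design_doc"}, senders={"architect"})

chat.post(Message("architect", "design_doc", "v1: use a REST API"))
chat.post(Message("tester", "bug_report", "login fails on empty password"))
chat.post(Message("architect", "design_doc", "v2: switch to gRPC"))

print([m.content for m in coder_inbox])
```

An agent that wants to consume everything can simply read `chat.history`, while the subscription filter keeps the coder’s inbox limited to the design document updates.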
This is, in our opinion, the key missing piece for agent orchestration beyond agents-as-tools: a global memory store that agents can perform targeted reads from and writes to. We’ll discuss that idea in more detail in the next part of the series, and explain how we chose to implement it in motleycrew.