Asking deeper questions: using a knowledge graph for AI agent orchestration
Knowledge graphs have been making headlines lately, overwhelmingly as a way to improve the retrieval of relevant context to add to an LLM prompt; another popular narrative is the usefulness of LLMs for creating knowledge graphs in the first place, again to be used for information retrieval, with or without LLMs.
Here we will tell a different story, namely how to use a knowledge graph not just for retrieval, but for information exchange between agents, as well as for dynamic task creation. We will show how this can be useful for an in-depth exploration of a domain.
Answering complicated questions
Consider your classic Retrieval-Augmented Generation (RAG) pattern, the subject of countless tutorials over the last year or so: this assumes a large knowledge base sliced into little pieces and stored in a database. When you want to use an LLM to answer a question about something that wasn’t in its training data, you use some math (embedding similarity) to retrieve a handful of knowledge base pieces that look most similar to the question, and feed those to the LLM along with the question; the LLM then does its best to fish the answer to the question from those chunks of text, and usually calls it a day.
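For concreteness, here is a minimal sketch of that loop; the model names are placeholders, and a real system would use a vector database rather than an in-memory list of chunks.

```python
# A minimal sketch of the classic RAG loop described above, using the OpenAI
# Python client; model names are illustrative and the in-memory chunk list
# stands in for a real vector store.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

def rag_answer(question: str, chunks: list[str], top_k: int = 4) -> str:
    # Retrieve the chunks most similar to the question by cosine similarity...
    chunk_vecs, q_vec = embed(chunks), embed([question])[0]
    scores = chunk_vecs @ q_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
    context = "\n\n".join(chunks[i] for i in np.argsort(scores)[-top_k:])
    # ...then ask the LLM to answer using only that context.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Context:\n{context}\n\nQuestion: {question}"}])
    return resp.choices[0].message.content
```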
But what if that answer tells you that in order to answer the original question, you need to look for yet more data? Consider how a typical literature review progresses: you start with an article or two that look relevant and map out the concepts they describe and the other articles they refer to. You pull up those other articles and repeat the process until no important new concepts or articles come up. Another example is creating a feature matrix for solutions to a given problem: you search the web for the problem description, read the descriptions of the top handful of solutions to collate a list of the key features they mention, then re-read those descriptions (and maybe search some more) to draw up a feature matrix of the solutions you came across against the list of key features you just compiled.
A key feature of such problems is that you first have to use the retrieved information to build up a sort of a mental map of the domain, and then use that mental map to create the answer you were looking for. Classic RAG is not great at this, and neither are basic AI agents. Ask ChatGPT to draw you a feature matrix of, say, open source observability solutions, and the most you’re likely to get back is some half-digested rephrasing of two or three web pages that came up in its first web search.
How can we do better?
What would be the best way to store that mental map, so that it can be created on the fly and queried when needed? Knowledge graphs are a logical answer. They can contain entities of different types, with any properties you specify, and different types of relationships between those entities, again with their own properties. While the same data can in principle be represented in a relational database, thinking of it as a graph makes querying it much more natural. In the LLM age that matters even more than before, since query patterns that a human would find simple are likely to also be those that an LLM finds easy to generate.
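To make the querying point concrete, here is the same lookup, "find the unanswered sub-questions of a given question", written as a Cypher-style graph query and as the equivalent relational self-join; the schema, relationship, and property names below are invented for illustration.

```python
# The same lookup expressed against a hypothetical graph schema and a
# hypothetical relational schema; all names are invented for illustration.
graph_query = """
MATCH (parent:Question {id: $id})-[:HAS_SUBQUESTION]->(child:Question)
WHERE child.answer IS NULL
RETURN child
"""

relational_query = """
SELECT child.*
FROM questions AS child
JOIN question_edges AS e ON e.child_id = child.id
WHERE e.parent_id = :id AND child.answer IS NULL
"""
```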
An important requirement here is to populate that mental map dynamically, introducing new graph nodes (such as new key concepts) based on what has just been learned/retrieved. This is not how knowledge graphs were traditionally used: the common pattern was to assume the knowledge graph has “all” the relevant information about entities and their relationships, and just query it for those. In our opinion, this is the main reason knowledge graphs never really took off in the pre-LLM era: populating such a complete graph from unstructured information (e.g. from the web) was very hard with the techniques we had back then. With LLMs, this becomes easier, as they are great at extracting relationships from unstructured text; but still, generating a complete map of all the entities and their relationships for any non-toy domain is very slow and expensive.
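As a sketch of what LLM-based extraction can look like (the prompt wording, model name, and the networkx graph below are all stand-ins for whatever graph store you actually use), turning a retrieved text chunk into new nodes and edges takes only a few lines:

```python
# Sketch: extract (subject, relation, object) triples from a text chunk with an
# LLM and add them to a graph; reuses the OpenAI client from the sketch above.
import json
import networkx as nx

graph = nx.MultiDiGraph()

def add_chunk_to_graph(chunk: str) -> None:
    prompt = (
        "Extract factual (subject, relation, object) triples from the text below. "
        'Reply with only a JSON list of objects with keys "subject", "relation", '
        '"object".\n\n' + chunk
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}])
    # A robust version would validate the model's output instead of trusting it.
    for triple in json.loads(resp.choices[0].message.content):
        graph.add_edge(triple["subject"], triple["object"],
                       relation=triple["relation"])
```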
So what we need is a way to get an LLM, whether through a direct prompt or an agent wrapper, to recursively build a knowledge graph of the domain we care about, containing just the entities and relationships relevant to our question, on the fly.
MotleyCrew knowledge-graph driven orchestration
This is exactly what MotleyCrew's graph-driven orchestration achieves: each Task object has a method that allows it to be queried for task units (units of work that can be given to an agent). When that method is called, the Task object queries the knowledge graph to identify any task units that need doing, and returns them to the orchestrator along with the kind of agent each unit should be given to.
This pattern is very powerful, because in executing those task units, the agents can in turn modify the knowledge graph, thus creating further task units that will be executed later, by the same Task or a different one. A run is complete when no Task can find any more task units to execute.
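Purely as a schematic of the control flow just described (the class and method names below are illustrative, not MotleyCrew's actual API), the orchestration loop looks roughly like this:

```python
# Schematic of knowledge-graph-driven orchestration as described above; the
# class and method names are illustrative, not MotleyCrew's actual interface.
from typing import Any, Optional, Protocol

class Task(Protocol):
    def get_next_unit(self) -> Optional[Any]:
        """Query the knowledge graph and return a unit of work, or None."""

    def get_worker(self) -> Any:
        """Return the agent that should execute this task's units."""

def run(tasks: list[Task]) -> None:
    while True:
        dispatched = False
        for task in tasks:
            unit = task.get_next_unit()     # the task inspects the knowledge graph
            if unit is None:
                continue
            task.get_worker().invoke(unit)  # the agent may write to the graph,
            dispatched = True               # creating further units for later
        if not dispatched:                  # no task can find more units: done
            break
```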
Example: a simple recursive research agent
Here is a simple example of how that might work (the application was inspired by this blog post, which, however, didn't use knowledge graphs or dynamic agent dispatch): you start with the original question and do retrieval for it in the usual RAG way, storing the question as the first knowledge graph node and attaching the retrieved data to it.
But instead of trying to answer the question directly with the information you just retrieved, you ask the LLM: "Given the information retrieved just now, what other questions should I answer to get the best possible answer to the original question?" You then insert these questions into the graph as children of the original question; next, you ask the LLM to select the most relevant of the questions you haven't yet done retrieval for, and apply the same procedure to it.
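Continuing the illustrative sketch from above (the client, the graph, and the prompt wording are still assumptions, not MotleyCrew code), the question-expansion step might look like this:

```python
# Sketch of the expansion step: given a question and its retrieved context, ask
# the LLM for follow-up questions and attach them as children in the graph.
import networkx as nx

question_tree = nx.DiGraph()

def expand_question(question: str, context: str, max_new: int = 3) -> list[str]:
    prompt = (
        f"Original question: {question}\n\nInformation retrieved for it:\n{context}\n\n"
        f"What other questions (at most {max_new}) should I answer to get the best "
        "possible answer to the original question? Reply with one question per line."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}])
    children = [q.strip() for q in resp.choices[0].message.content.splitlines()
                if q.strip()]
    for child in children:
        question_tree.add_node(child, context=None, answer=None)
        question_tree.add_edge(question, child)  # child question of this one
    return children
```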
You thus build up a tree of questions, all created on the fly based on the information just retrieved, with a view to their relevance to the original question.
When you decide that the investigation has proceeded far enough (in the simplest case, by just setting a limit on the number of questions you do retrieval for), it's time for the second phase. In this phase you roll the tree back up: first you answer the questions furthest out in the tree (those whose children have no retrieved information); then you answer their parent questions, using not only their retrieved context but also the answers you have just generated for the child questions. This way you work your way back up the tree until you answer the original question, using the answers to all the child questions you did retrieval for along the way.
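Sticking with the same illustrative sketch, the rollup phase is essentially a bottom-up traversal of the question tree: each question with retrieved context is answered from that context plus its children's answers, so the answers propagate up to the root.

```python
# Sketch of the rollup phase: answer the question tree bottom-up, so each
# parent's answer can draw on its children's answers as well as its own context.
def answer_tree(question: str) -> str:
    children = [c for c in question_tree.successors(question)
                if question_tree.nodes[c].get("context")]  # only retrieved-for ones
    child_answers = [f"{c}\n{answer_tree(c)}" for c in children]
    context = question_tree.nodes[question].get("context") or ""
    if child_answers:
        context += "\n\nAnswers to sub-questions:\n" + "\n\n".join(child_answers)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Context:\n{context}\n\nQuestion: {question}"}])
    question_tree.nodes[question]["answer"] = resp.choices[0].message.content
    return question_tree.nodes[question]["answer"]

# final_answer = answer_tree(original_question)  # answers roll up to the root
```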
The example notebook shows how this is used to answer a question about the Mahabharata, a vast Indian epic.
Conclusion
While basic LLM calls, or simple agents wrapping them, can do many useful tasks, they are not sufficient to answer questions which first require acquiring a broader understanding of the relevant domain.
One way to deal with such questions is to dynamically generate a knowledge graph mapping out the relevant concepts and questions, and then use that to answer the original one. MotleyCrew provides a simple and natural way to do that.