Advanced RAG and the 3 types of Recursive Retrieval

Chia Jeng Yang
Published in WhyHow.AI · 8 min read · Jan 31, 2024

In this article, we dive into the different ways to use Recursive or Iterative Retrieval to help develop self-learning RAG systems that are able to learn over time, and to deeply explore unstructured data, either autonomously, or in directions that you point the knowledge graph to.

“In recursive retrieval, a query is applied across smaller chunks of the corpus, intermediate results are recursively fed into the next steps, and aggregation happens to combine the outputs.” — Anthony Alcaraz, CAIO @ Fribl

In other words, an LLM agent continuously scans for relevant sentences and paragraphs, retrieves them, combines them, and examines the resulting answer. If there are any references to other concepts, pages, or information that hint at the final answer, it retrieves information related to these second-order concepts as well.
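To make the loop concrete, here is a minimal sketch in Python of the recursion described above. It assumes a hypothetical `retrieve(query)` function that searches the corpus for chunks and a hypothetical `llm(prompt)` completion call; neither corresponds to a specific library API.

```python
from typing import Callable

def recursive_retrieve(
    query: str,
    retrieve: Callable[[str], list[str]],  # hypothetical chunk search over the corpus
    llm: Callable[[str], str],             # hypothetical LLM completion call
    max_hops: int = 3,
) -> str:
    """Apply the query, mine intermediate results for follow-up leads,
    and recurse until no new leads remain or the hop budget runs out."""
    gathered: list[str] = []
    frontier = [query]
    for _ in range(max_hops):
        if not frontier:
            break
        chunks = [c for q in frontier for c in retrieve(q)]
        gathered.extend(chunks)
        # Ask the LLM for second-order concepts or references hinted at in the new chunks.
        leads = llm(
            "List follow-up concepts or references worth retrieving next, "
            "one per line, or NONE:\n" + "\n".join(chunks)
        )
        frontier = [l.strip() for l in leads.splitlines()
                    if l.strip() and l.strip().upper() != "NONE"]
    # Aggregation step: combine everything gathered into a final answer.
    return llm(f"Answer '{query}' using only this context:\n" + "\n".join(gathered))
```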

The implications of Recursive or Iterative Retrieval are tremendous.

  • Firstly, it can help provide consistent and exhaustive retrieval of a core idea that can act as a memory base for future reference.
  • Secondly, we can use it to perform multi-hop retrieval across multiple documents.
  • Thirdly, we can now confidently tie increased unstructured data input to a tangible increase in LLM accuracy.

In a multi-hop retrieval context, the seed node and the relationships it needs to track are generated automatically by the LLM in response to a query; this is essentially how an LLM can use a knowledge graph as a tool to tackle multi-hop queries. We are extremely excited about the opportunity for knowledge graphs to be created iteratively and ‘just-in-time’ for a specific question, acting as a second brain within a RAG pipeline.

Expanding more broadly into how recursive retrieval can be done, there are three ways to think about it. Recursive retrieval can be classified into:

  1. Page-Based Recursive Retrieval
  2. Information-Centric Recursive Retrieval
  3. Concept-Centric Recursive Retrieval

In Page-Based Recursive Retrieval, we are looking for the next pages to perform subsequent retrieval on. In Information-Centric Recursive Retrieval, we are looking for the supporting concepts to perform subsequent retrieval on. In Concept-Centric Recursive Retrieval, we are simply looking to recursively add information to a single core concept.

Page-Based Recursive Retrieval

As the LLM is retrieving information and putting it into a knowledge graph, the LLM is instructed to track and dive deeper into the subsequent pages that are referenced. This is especially useful for highly technical manuals in manufacturing industries.

For example, if Page 1 cites references to Pages 10 and 20, the agent will explore Pages 10 and 20 and pull in relevant information to see if the question is answered. After pulling in Pages 10 and 20, it will also examine the contents of those pages to determine whether other referenced pages should be explored. The agent then follows a deterministic pathway, driven by page references, that allows it to take in more and more contextual information, examining a document or a book much like a human would.
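A minimal sketch of this page-following loop, assuming the pages are already keyed by page number and that an `is_answered(question, context)` helper (hypothetical, e.g. an LLM-backed check) decides when to stop:

```python
import re
from typing import Callable

def page_based_retrieval(
    pages: dict[int, str],                          # page number -> page text
    start_page: int,
    question: str,
    is_answered: Callable[[str, list[str]], bool],  # hypothetical LLM-backed check
) -> list[str]:
    """Follow explicit page references breadth-first, collecting context
    until the question is answered or no unvisited references remain."""
    visited: set[int] = set()
    queue = [start_page]
    context: list[str] = []
    while queue:
        page_no = queue.pop(0)
        if page_no in visited or page_no not in pages:
            continue
        visited.add(page_no)
        text = pages[page_no]
        context.append(text)
        if is_answered(question, context):
            break
        # Naive reference extraction: matches "Page 10", "page 20", etc.
        refs = [int(n) for n in re.findall(r"[Pp]age\s+(\d+)", text)]
        queue.extend(r for r in refs if r not in visited)
    return context
```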

For documents with clear structured references to other pages or exhibits, such as manuals or manufacturing documents, we can create RAG systems that are able to navigate a corpus of documents in seconds. If each page of the document only refers to one single topic, and not multiple topics, we can even save the relationships between pages into a document hierarchy that can be cached and referenced over time.

One limitation of this approach is if references to other page numbers are not present or not consistent. Consider a legal example — a document discussing liabilities that Client X is facing may acknowledge the presence of other liabilities discussed in other documents without referencing the document name or alternative pages. In this case, we would need a knowledge graph to be able to iteratively group concepts together across unstructured data.

An example of this recursive retrieval of pages approach was created by Hrishi of Greywing. In this demo, Hrishi uses the reference pages cited in each document as nodes for future navigation. One unique feature of Hrishi’s domain is that, because the questions pertain to a physical ship, the accuracy of answers at each intermediate step is far easier to measure: the relevant information for the ship is bound in the manual, and no outside context is required.

Information-Centric Recursive Retrieval

In this Harry Potter demo we made, we perform an information-retrieval process. We fix the seed node/concept (a main character) that acts as the basis for exploration. We then allow the LLM to discover the key relationships to track between the character and other concepts/entities, based on what we are looking to understand; this determines which relationships are tracked. If there are specific relationships we want to capture, we can manually amend them ourselves.

Armed with an opinion of the relationships we want to focus on, as well as the seed node to center our exploration around, the LLM then looks through each chapter and populates the knowledge graph accordingly. With each chapter processed, the iterative knowledge graph expands. The knowledge graph becomes directionally smarter over time, in the area of focus that has been defined, as more and more information is processed by the LLM.
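A rough sketch of this chapter-by-chapter expansion around a fixed seed node. The `llm` completion function and the `subject | relation | object` output format are assumptions for illustration, not the actual demo code:

```python
from typing import Callable

def grow_seed_graph(
    chapters: list[str],
    seed: str,
    focus: str,
    llm: Callable[[str], str],  # hypothetical LLM completion call
) -> list[tuple[str, str, str]]:
    """Expand a knowledge graph around a fixed seed node, one chapter at a time.
    Triples are parsed from a simple 'subject | relation | object' format."""
    graph: list[tuple[str, str, str]] = []
    for chapter in chapters:
        prompt = (
            f"Seed entity: {seed}\nFocus: {focus}\n"
            f"From the text below, list triples 'subject | relation | object' "
            f"that connect {seed} to other entities:\n{chapter}"
        )
        for line in llm(prompt).splitlines():
            parts = [p.strip() for p in line.split("|")]
            if len(parts) == 3:
                graph.append((parts[0], parts[1], parts[2]))
    return graph
```

A call like `grow_seed_graph(chapters, "Harry Potter", "relationships between characters", llm)` would return triples that can be appended to whatever graph store the pipeline uses, so later queries can reuse the structure.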

Let’s consider another legal use-case. If we want to focus on the liabilities that Client X may be facing, we set the seed node to be ‘Client X’. Then, as chunks are retrieved and run against Client X, the knowledge graph of liabilities relevant to Client X expands over time until the search is complete.

This is a recursive retrieval exercise for the purposes of looking up information relevant to a node. Look-Ups are a far easier technical process since we know exactly what we are looking for, and are simply ensuring that we do not miss any issues related to the referenced node.

The advantage of using an iterative knowledge graph over another data store is that, having done the work to structure the retrieved information for this particular question, the structured store of information can be re-used for any future related or adjacent queries and thus acts as a memory base for recursive or multi-hop retrieval.

We can employ the retrieval-of-information approach across time and across documents. Regarding retrieval over time, imagine your RAG system constantly receiving updates to the vector database, with new information flowing in that you want to append to a historically existing answer. Regarding retrieval across different documents, imagine that information related to the core concept of the question is spread across many different chunks and you want to make sure it is all retrieved and structured automatically. The iterative knowledge graph acts as a contextually aware filter that constantly adds information to the relevant concept.

Concept-Centric Recursive Retrieval

There is another approach where the recursive retrieval exercise is done for the purposes of retrieving concepts. Here, the top-level nodes and the n+2 nodes to focus on are decided by the LLM based on their potential relevance to the question. The question here could be “Which clients are affected by Client X’s actions and how?” Given the framing of the question, the LLM needs to look up what Client X’s actions are, then search for the second-order effects and who is impacted.

The graph is therefore traversed node by node, with each traversal made by making the relationship between concepts explicit.

Search is a harder process since the user typically does not know exactly what they are looking for and is relying on the LLM to probe each branch of the graph to discover whether it is getting closer to the correct answer.
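One way this search could be orchestrated is sketched below. `graph_expand(concept)` is a hypothetical call that returns (fact, neighbouring concept) pairs from the corpus or graph, and the relevance and sufficiency checks are delegated to the LLM; this is an illustration of the idea rather than a reference implementation:

```python
from typing import Callable

def concept_search(
    question: str,
    graph_expand: Callable[[str], list[tuple[str, str]]],  # hypothetical: concept -> (fact, neighbour) pairs
    llm: Callable[[str], str],                              # hypothetical LLM completion call
    max_depth: int = 4,
) -> str:
    """Let the LLM pick which concept branches to expand next and decide
    when enough has been gathered to answer the question."""
    facts: list[str] = []
    frontier = [llm(f"Name the single top-level concept to start from for: {question}")]
    for _ in range(max_depth):
        next_frontier: list[str] = []
        for concept in frontier:
            for fact, neighbour in graph_expand(concept):
                facts.append(fact)
                # Only follow branches the LLM judges relevant to the question.
                verdict = llm(
                    f"Question: {question}\nCandidate concept: {neighbour}\n"
                    "Answer RELEVANT or IRRELEVANT."
                )
                if verdict.strip().upper().startswith("RELEVANT"):
                    next_frontier.append(neighbour)
        # Stop once the LLM judges the gathered facts sufficient.
        done = llm(
            f"Question: {question}\nFacts so far:\n" + "\n".join(facts) +
            "\nIs this enough to answer the question? Answer YES or NO."
        )
        if done.strip().upper().startswith("YES") or not next_frontier:
            break
        frontier = next_frontier
    return llm(f"Answer '{question}' using:\n" + "\n".join(facts))
```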

The technical difficulties with this approach include:

  • Controlling the LLM to make sure that each node and n+1 node created is relevant to the question at hand. One can imagine that if the LLM decides to follow up on a node about unrelated information concerning Client X’s other activities, the resulting graph would not be relevant or helpful.
  • This can be tackled in a few ways. Firstly, good prompting and multi-agent construction can be used to instruct the LLM to continually reference the nature of the question and to look at various related concepts, such as Client X’s actions harming others, the person(s) who have been harmed, and how they have been harmed. Secondly, expert qualitative feedback on what constitutes a good answer can be given in the form of a prompt. For example, a one-shot prompt can be included that shows a model answer, demonstrating the dimensions and type of information that constitute a good answer.
  • Understanding when the right answer has been reached. In a search process, it can be unclear when sufficient information has been retrieved, and whether the existing formulated answer is 50% or 100% complete. It may be the case that the discovery of an n+4 node brings up information that contradicts the findings so far. An analogy would be how we as humans could keep performing Google searches, but inevitably stop at page 2 or 3 of the results and call it a day.
  • This can be tackled with the aforementioned prompt engineering, or by using an LLM to help evaluate the answer.
  • This can also be tackled with a bit of pre-processing work. All relevant concepts can be mapped beforehand in a contextual dictionary, which we wrote about here. As a refresher, a contextual dictionary is a map that highlights which concepts can be found in which chunks. It is essentially the RAG equivalent of an old-fashioned index. Once all the relevant chunks mapped to a concept have been queried, we can assume that the retrieval process is complete (see the sketch after this list).
(Image: an old-fashioned index)
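A minimal sketch of the completeness check that a contextual dictionary enables; the chunk IDs are made up for illustration:

```python
def retrieval_complete(
    contextual_dictionary: dict[str, set[str]],  # concept -> chunk IDs where it appears
    queried_chunks: set[str],
    concept: str,
) -> bool:
    """Retrieval for a concept is complete once every chunk that the
    contextual dictionary maps to that concept has been queried."""
    return contextual_dictionary.get(concept, set()) <= queried_chunks

# Example with made-up chunk IDs:
index = {"Client X liabilities": {"doc1#p3", "doc2#p7", "doc5#p1"}}
print(retrieval_complete(index, {"doc1#p3", "doc2#p7"}, "Client X liabilities"))             # False
print(retrieval_complete(index, {"doc1#p3", "doc2#p7", "doc5#p1"}, "Client X liabilities"))  # True
```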

WhyHow.AI is building tools to help developers bring more determinism and control to their RAG pipelines using graph structures. If you’re thinking about, in the process of, or have already incorporated knowledge graphs in RAG, we’d love to chat at team@whyhow.ai, or follow our newsletter at WhyHow.AI. Join our discussions about rules, determinism and knowledge graphs in RAG on our newly-created Discord.
