Complex Query Resolution through LlamaIndex Utilizing Recursive Retrieval, Document Agents, and Sub Question Query Decomposition

Harnessing the Power of LlamaIndex to Navigate Complex Queries through Recursive Retrieval, Specialized Document Agents, and Sub Question Query Engines for Comprehensive Answer Synthesis

13 min read · Oct 14, 2023

Imagine navigating a vast ocean of information in search of answers to complex questions that are multifaceted and demand a nuanced understanding of interconnected topics. Current Question Answering (QA) systems, while advanced, often struggle to comprehensively address queries that require synthesizing information from multiple sources or documents. QA systems built with LlamaIndex can leverage components such as the Sub Question Query Engine and Recursive Retrieval with Document Agents. However, each exhibits distinct limitations when applied on its own. While the Sub Question Query Engine adeptly handles multi-faceted queries by breaking them down, it may lack depth in exploring each sub-query, especially within interconnected or relational data. Conversely, Recursive Retrieval with Document Agents excels at deep-diving into specific documents and retrieving detailed answers, but may falter when tasked with managing and synthesizing responses from multiple, varied sub-queries. This leaves a notable gap for queries that are both multifaceted and demand detailed, interconnected insights.

Complex Query Resolution through LlamaIndex Utilizing Recursive Retrieval, Document Agents, and Sub Question Query Decomposition. Image by author.

Addressing this gap, as seen above, we introduce a system that amalgamates the prowess of LlamaIndex’s Recursive Retrieval with Document Agents, and Sub Question Query Decomposition to adeptly navigate through, and make decisions over, heterogeneous documents. This system not only retrieves pertinent information but also breaks down complex queries into manageable sub-questions, delegates them to specialized document agents equipped with query engine tools such as vector query engine, summary query engine, and knowledge graph query engine, and synthesizes the retrieved information into a coherent, comprehensive response. By doing so, it transcends the limitations of existing systems, offering a more refined and contextually rich answer to multifaceted questions, thereby elevating the user experience in interacting with AI-driven QA systems.

QA systems differences. Image by author.

Recursive Retriever + Document Agents

Instead of finding and using small, isolated text chunks to answer queries, the Recursive Retriever aims to use summaries of entire documents, providing answers that are accurate and that better represent the overall context of the topic being queried. Document Agents are designed to dynamically perform tasks beyond mere fact-based question-answering within a document. These agents have access to various query engine tools for a given document, enabling them to navigate and retrieve information in a more targeted manner. When dealing with multiple documents, Document Agents alone might not suffice to route traffic appropriately, so an additional layer is needed to route each query to the right document agent and, subsequently, the right document index. This is where index nodes come into play: they sit in front of the document agents and act as a “table of contents”, ensuring that queries are routed to the most relevant document agent and thereby the most pertinent document index. In this project, the document agents have access to the vector query engine, the summary query engine, and the knowledge graph query engine.
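The routing idea can be sketched without any library. In the toy example below, each document gets a short summary (standing in for an index node), the query is matched against those summaries, and the winning entry hands the query to that document's agent. This is only an illustration under stated assumptions: word overlap replaces the embedding similarity a real retriever would use, and `route_query`, `summaries`, and the lambda "agents" are hypothetical names, not LlamaIndex APIs.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def route_query(query, summaries, agents):
    """Pick the document whose summary shares the most words with the query,
    then delegate the query to that document's agent.

    Word overlap is a stand-in for the vector similarity a real retriever uses.
    """
    query_words = tokenize(query)
    best_doc = max(
        summaries,
        key=lambda doc: len(query_words & tokenize(summaries[doc])),
    )
    return best_doc, agents[best_doc](query)

# Hypothetical summaries, playing the role of index nodes.
summaries = {
    "Boston": "Wikipedia article about Boston",
    "Seattle": "Wikipedia article about Seattle",
}
# Hypothetical agents: plain callables standing in for document agents.
agents = {
    "Boston": lambda q: f"[Boston agent] {q}",
    "Seattle": lambda q: f"[Seattle agent] {q}",
}

doc, answer = route_query("What are the sports teams in Boston?", summaries, agents)
print(doc)
print(answer)
```

The point of the extra layer is that the query never reaches the Seattle agent at all: only the agent behind the best-matching summary is invoked.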

Sub Question Query Engine

The Sub Question Query Engine aims at deconstructing a complex query into more manageable sub-questions, each tailored to extract specific information from relevant data sources. It operates by first discerning the various elements within a complex query, subsequently generating sub-questions that are then dispatched to appropriate data sources for resolution. Once the intermediate responses are garnered, they are synthesized into a coherent, unified response that adeptly addresses the original, multifaceted query. The Sub Question Query Engine enhances our existing setup by managing complex queries that require insights from multiple documents or data sources. While the existing architecture, with Recursive Retriever and Document Agents, effectively handles the retrieval and initial understanding of information within documents, the Sub Question Query Engine steps in when the query necessitates a more nuanced analysis, especially across various documents or contexts. It ensures that the final response is not only accurate but also enriched with context, thereby amplifying the system’s ability to adeptly respond to intricate, complex, and multi-source queries.
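The decompose, dispatch, and synthesize flow described above can be illustrated with a small library-free sketch. It is deliberately naive: splitting on " and " and joining strings are placeholders for the LLM calls that generate sub-questions and write the final answer, and every name here (`decompose`, `answer_with_tools`, `synthesize`, `tools`) is hypothetical.

```python
def decompose(query):
    """Naively split a compound query on ' and '.

    A real engine asks an LLM to generate sub-questions; string splitting
    is only a placeholder for that step.
    """
    return [part.strip() for part in query.split(" and ") if part.strip()]

def answer_with_tools(sub_question, tools):
    """Send a sub-question to the first tool whose keyword appears in it."""
    for keyword, tool in tools.items():
        if keyword.lower() in sub_question.lower():
            return tool(sub_question)
    return "No tool matched."

def synthesize(sub_questions, answers):
    """Join intermediate answers into one response (an LLM does this in practice)."""
    return " ".join(f"{q}: {a}." for q, a in zip(sub_questions, answers))

# Hypothetical tools keyed by the topic they cover.
tools = {
    "Boston": lambda q: "answer from the Boston source",
    "Seattle": lambda q: "answer from the Seattle source",
}

query = "sports teams in Boston and positive aspects of Seattle"
sub_questions = decompose(query)
answers = [answer_with_tools(q, tools) for q in sub_questions]
print(sub_questions)
print(synthesize(sub_questions, answers))
```

Each sub-question is answered independently, which is what allows a single query to draw on more than one data source.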

Query Engine Tools

A query engine is a generic interface that allows you to ask questions over your data. It takes in a natural language query and returns a rich response. Three distinct query engines are constructed to manage different aspects of information retrieval. These query engines are provided to the LLM agent as tools for search and retrieval operations. The vector_query_engine is derived from a VectorStoreIndex and retrieves relevant document sections based on vector similarity. The list_query_engine, sourced from a SummaryIndex, fetches summarized information for concise and relevant data extraction. Lastly, the graph_query_engine, originating from a KnowledgeGraphIndex, extracts structured, interconnected, and relational knowledge.

Implementation

For a comprehensive guide, see these two articles, which provide detailed information: Recursive Retriever + Document Agents, and Sub Question Query Engine.

Refer to my GitHub repo for the complete Jupyter notebook.

Let’s start by installing the dependencies and importing the necessary libraries.

%pip install llama-index pinecone-client transformers neo4j python-dotenv

import os
import torch
import pinecone
from transformers import pipeline
from llama_index import (
    VectorStoreIndex,
    SummaryIndex,
    KnowledgeGraphIndex,
    SimpleKeywordTableIndex,
    SimpleDirectoryReader,
    ServiceContext,
    StorageContext
)
from dotenv import load_dotenv
from llama_index.schema import IndexNode
from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.llms import OpenAI
from llama_index.query_engine import SubQuestionQueryEngine
from llama_index.retrievers import RecursiveRetriever
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.response_synthesizers import get_response_synthesizer
from llama_index.vector_stores import PineconeVectorStore
from llama_index.graph_stores import Neo4jGraphStore

Data Preparation

We will work with 3 Wikipedia pages: we extract the data, store it on disk, and finally load it for further processing.

wiki_titles = ["Seattle", "Boston", "Chicago"]

from pathlib import Path

import requests

for title in wiki_titles:
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "format": "json",
            "titles": title,
            "prop": "extracts",
            # 'exintro': True,
            "explaintext": True,
        },
    ).json()
    page = next(iter(response["query"]["pages"].values()))
    wiki_text = page["extract"]

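    data_path = Path("data")
    # create the folder if needed; exist_ok avoids an error on later iterations
    data_path.mkdir(exist_ok=True)

    # write as UTF-8 so non-ASCII characters in the article are saved reliably
    with open(data_path / f"{title}.txt", "w", encoding="utf-8") as fp: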
        fp.write(wiki_text)
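The download loop assumes every title resolves to a page with an `extract` field. As a hedged sketch, the helper below checks a response shaped like the MediaWiki `query` output before reading it, so a misspelled title fails with a clear message. `extract_page_text` is a hypothetical name, and the sample payloads are made up for illustration.

```python
def extract_page_text(api_response):
    """Return the plain-text extract from a MediaWiki 'query' response.

    Raises ValueError when the page is missing or has no extract, instead of
    failing later with a KeyError.
    """
    pages = api_response.get("query", {}).get("pages", {})
    if not pages:
        raise ValueError("Response contains no pages")
    page = next(iter(pages.values()))
    if "missing" in page or not page.get("extract"):
        raise ValueError(f"No extract for page {page.get('title', '<unknown>')!r}")
    return page["extract"]

# Sample payloads shaped like the API's JSON output (contents are made up).
found = {"query": {"pages": {"1": {"title": "Seattle", "extract": "Seattle is a city."}}}}
missing = {"query": {"pages": {"-1": {"title": "Seatle", "missing": ""}}}}

print(extract_page_text(found))
try:
    extract_page_text(missing)
except ValueError as err:
    print(err)
```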

# Load all wiki documents
city_docs = {}
for wiki_title in wiki_titles:
    city_docs[wiki_title] = SimpleDirectoryReader(
        input_files=[f"data/{wiki_title}.txt"]
    ).load_data()

load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')

llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
service_context = ServiceContext.from_defaults(llm=llm)
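Note that assigning `os.getenv(...)` straight into `os.environ` raises a `TypeError` when the variable is unset, because `os.environ` only accepts strings. A small guard, sketched below with the hypothetical helper name `require_env` and a made-up variable, turns that into a clearer error.

```python
import os

def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; add it to your .env file"
        )
    return value

# Demonstration with a hypothetical variable name.
os.environ["DEMO_API_KEY"] = "sk-demo"
print(require_env("DEMO_API_KEY"))
```

The same helper could be used for the Pinecone and Neo4j settings read later in the walkthrough.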

Build Document Agent for each Document

Now we define document agents for each document.

Before that, we define a vector index (for semantic search), a summary index (for summarization), and a graph index (for structural semantic search) for each document. The query engines built from these 3 indexes are then converted into tools that are passed to an OpenAI function calling agent.

This document agent can dynamically choose between semantic search over the vector index or the graph index, and summarization, within a given document.

We create a separate document agent for each city.

Vector Storage Context

We create a Pinecone vector index with specified parameters such as dimension and metric. A vector storage context is then established using Pinecone’s vector store to manage the storage and retrieval of the vector index data within the LlamaIndex framework.

# init pinecone
os.environ["PINECONE_API_KEY"] = os.getenv('PINECONE_API_KEY')
os.environ["PINECONE_ENVIRONMENT"] = os.getenv('PINECONE_ENVIRONMENT')
pinecone.init(api_key=os.environ["PINECONE_API_KEY"], environment=os.environ["PINECONE_ENVIRONMENT"])
pinecone.create_index("vector-index", dimension=1536, metric="euclidean", pod_type="p1")

# construct vector store and customize storage context
vector_storage_context = StorageContext.from_defaults(
    vector_store=PineconeVectorStore(pinecone.Index("vector-index"))
)

Graph Storage Context

In this section, we build a knowledge graph from scratch using Relation Extraction By End-to-end Language generation (REBEL), LlamaIndex, and Neo4j. REBEL is a relation extraction model that uses a BART model to convert raw sentences into relation triplets. We essentially construct a knowledge graph from unstructured data for efficient, granular knowledge retrieval. Finally, we use Neo4j’s graph store to manage the storage and retrieval of the graph data within the LlamaIndex framework.

Knowledge graph of the input wikipedia data. Image by author.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

triplet_extractor = pipeline(
    'text2text-generation',
    model='Babelscape/rebel-large',
    tokenizer='Babelscape/rebel-large',
    device=device)

import re

def clean_triplets(input_text, triplets):
    """Sometimes the model hallucinates, so we filter out entities
       not present in the text"""
    text = input_text.lower()
    clean_triplets = []
    for triplet in triplets:

        if (triplet["head"] == triplet["tail"]):
            continue

        head_match = re.search(
            r'\b' + re.escape(triplet["head"].lower()) + r'\b', text)
        if head_match:
            head_index = head_match.start()
        else:
            head_index = text.find(triplet["head"].lower())

        tail_match = re.search(
            r'\b' + re.escape(triplet["tail"].lower()) + r'\b', text)
        if tail_match:
            tail_index = tail_match.start()
        else:
            tail_index = text.find(triplet["tail"].lower())

        if ((head_index == -1) or (tail_index == -1)):
            continue

        clean_triplets.append((triplet["head"], triplet["type"], triplet["tail"]))

    return clean_triplets

def extract_triplets(input_text):
    text = triplet_extractor.tokenizer.batch_decode([triplet_extractor(input_text, return_tensors=True, return_text=False)[0]["generated_token_ids"]])[0]

    triplets = []
    subject, relation, object_ = '', '', ''
    text = text.strip()
    current = 'x'
    for token in text.replace("<s>", "").replace("<pad>", "").replace("</s>", "").split():
        if token == "<triplet>":
            current = 't'
            if relation != '':
                triplets.append({'head': subject.strip(), 'type': relation.strip(),'tail': object_.strip()})
                relation = ''
            subject = ''
        elif token == "<subj>":
            current = 's'
            if relation != '':
                triplets.append({'head': subject.strip(), 'type': relation.strip(),'tail': object_.strip()})
            object_ = ''
        elif token == "<obj>":
            current = 'o'
            relation = ''
        else:
            if current == 't':
                subject += ' ' + token
            elif current == 's':
                object_ += ' ' + token
            elif current == 'o':
                relation += ' ' + token

    if subject != '' and relation != '' and object_ != '':
        triplets.append({'head': subject.strip(), 'type': relation.strip(), 'tail':object_.strip()})
    clean = clean_triplets(input_text, triplets)
    return clean
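To make the two functions above easier to follow, here is a compact, model-free sketch of the same two steps. REBEL emits a linearized string in which `<triplet>` precedes the head entity, `<subj>` precedes the tail entity, and `<obj>` precedes the relation. The sketch parses such a string and then drops triplets whose entities do not occur in the source text. The sample string is hand-written rather than real model output, and `parse_linearized` and `keep_grounded` are hypothetical names.

```python
def parse_linearized(text):
    """Parse '<triplet> head <subj> tail <obj> relation' sequences into dicts."""
    triplets = []
    head, tail, relation = [], [], []
    target = None  # the list that receives the next ordinary token

    def flush():
        # Emit a triplet only when all three parts were collected.
        if head and tail and relation:
            triplets.append({
                "head": " ".join(head),
                "type": " ".join(relation),
                "tail": " ".join(tail),
            })

    for token in text.split():
        if token == "<triplet>":
            flush()
            head, tail, relation = [], [], []
            target = head
        elif token == "<subj>":
            flush()  # a new tail for the same head closes the previous triplet
            tail, relation = [], []
            target = tail
        elif token == "<obj>":
            relation = []
            target = relation
        elif target is not None:
            target.append(token)
    flush()
    return triplets

def keep_grounded(triplets, source_text):
    """Drop self-loops and triplets whose head or tail is absent from the text."""
    lowered = source_text.lower()
    return [
        (t["head"], t["type"], t["tail"])
        for t in triplets
        if t["head"] != t["tail"]
        and t["head"].lower() in lowered
        and t["tail"].lower() in lowered
    ]

source = "Boston is the capital of Massachusetts. Fenway Park is located in Boston."
linearized = (
    "<triplet> Boston <subj> Massachusetts <obj> capital of "
    "<subj> United States <obj> country "
    "<triplet> Fenway Park <subj> Boston <obj> located in"
)
parsed = parse_linearized(linearized)
grounded = keep_grounded(parsed, source)
print(parsed)
print(grounded)
```

In this sample, the "United States" triplet is parsed but then filtered out, because that entity never appears in the source sentence.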

os.environ["NEO4J_URI"] = os.getenv('NEO4J_URI')
os.environ["NEO4J_USERNAME"] = os.getenv('NEO4J_USERNAME')
os.environ["NEO4J_PASSWORD"] = os.getenv('NEO4J_PASSWORD')
os.environ["NEO4J_DB"] = os.getenv('NEO4J_DB')

graph_store = Neo4jGraphStore(
    username=os.environ["NEO4J_USERNAME"],
    password=os.environ["NEO4J_PASSWORD"],
    url=os.environ["NEO4J_URI"],
    database=os.environ["NEO4J_DB"],
)

graph_storage_context = StorageContext.from_defaults(graph_store=graph_store)

from llama_index.agent import OpenAIAgent

# Build agents dictionary
agents = {}

for wiki_title in wiki_titles:
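    # build vector index
    # note: every city is written to the same Pinecone index, so retrieval
    # through this index is not restricted to the current city's chunks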
    vector_index = VectorStoreIndex.from_documents(
        city_docs[wiki_title], service_context=service_context, storage_context=vector_storage_context
    )
    # build summary index
    summary_index = SummaryIndex.from_documents(
        city_docs[wiki_title], service_context=service_context
    )
    # build graph index
    graph_index = KnowledgeGraphIndex.from_documents(
        city_docs[wiki_title],
        storage_context=graph_storage_context,
        kg_triplet_extract_fn=extract_triplets,
        service_context=ServiceContext.from_defaults(llm=llm, chunk_size=256)
    )
    # define query engines
    vector_query_engine = vector_index.as_query_engine()
    list_query_engine = summary_index.as_query_engine()
    graph_query_engine = graph_index.as_query_engine()

    # define tools
    query_engine_tools = [
        QueryEngineTool(
            query_engine=vector_query_engine,
            metadata=ToolMetadata(
                name="vector_tool",
                description=f"Useful for retrieving specific context from {wiki_title}",
            ),
        ),
        QueryEngineTool(
            query_engine=list_query_engine,
            metadata=ToolMetadata(
                name="summary_tool",
                description=f"Useful for summarization questions related to {wiki_title}",
            ),
        ),
        QueryEngineTool(
            query_engine=graph_query_engine,
            metadata=ToolMetadata(
                name="graph_tool",
                description=f"Useful for retrieving structural, interconnected and relational knowledge related to {wiki_title}",
            ),
        ),
    ]

    # build agent
    function_llm = OpenAI(model="gpt-3.5-turbo-0613")
    agent = OpenAIAgent.from_tools(
        query_engine_tools,
        llm=function_llm,
        verbose=True,
    )

    agents[wiki_title] = agent

Build Recursive Retriever over these Agents

A set of summary nodes is established, each corresponding to one Wikipedia city article. A RecursiveRetriever is then configured in front of these nodes to route each query to the appropriate node. This node, in turn, directs the query to the pertinent document agent, giving queries a structured path through the system.

# define top-level nodes
nodes = []
for wiki_title in wiki_titles:
    wiki_summary = (
        f"This content contains Wikipedia articles about {wiki_title}. "
        f"Use this index if you need to lookup specific facts about {wiki_title}.\n"
        "Do not use this index if you want to analyze multiple cities."
    )
    node = IndexNode(text=wiki_summary, index_id=wiki_title)
    nodes.append(node)

# define top-level retriever
top_vector_index = VectorStoreIndex(nodes)
vector_retriever = top_vector_index.as_retriever(similarity_top_k=1)

# define recursive retriever
recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": vector_retriever},
    query_engine_dict=agents,
    verbose=True,
)

response_synthesizer = get_response_synthesizer(
    response_mode="compact",
)
retriever_query_engine = RetrieverQueryEngine.from_args(
    recursive_retriever,
    response_synthesizer=response_synthesizer,
    service_context=service_context,
)

Set Up the Sub Question Query Engine

In this step, the query engine built on the recursive retriever, which wraps the document agents and their query engines, is converted into a tool. This tool is then passed to the SubQuestionQueryEngine, which decomposes a complex, multi-faceted query into sub-questions, answers each one through recursive retrieval, and synthesizes the results into a single response drawn from multiple documents.

# convert the recursive retriever into a tool
query_engine_tools = [
    QueryEngineTool(
        query_engine=retriever_query_engine,
        metadata=ToolMetadata(
            name="recursive_retriever",
            description="Recursive retriever for accessing documents"
        ),
    ),
]

# setup sub question query engine
query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    service_context=service_context,
    # async execution; inside a notebook this may require nest_asyncio.apply() first
    use_async=True,
)

response = query_engine.query(
    "Tell me about the sports teams in Boston and the positive aspects of Seattle"
)
print(response)

Generated 2 sub questions.
[recursive_retriever] Q: What are the sports teams in Boston?
Retrieving with query id None: What are the sports teams in Boston?
Retrieved node with id, entering: Boston
Retrieving with query id Boston: What are the sports teams in Boston?
=== Calling Function ===
Calling function: vector_tool with args: {
  "input": "sports teams in Boston"
}
Got output: Boston has teams in the four major North American men's professional sports leagues, which are Major League Baseball, the National Football League, the National Basketball Association, and the National Hockey League. Additionally, Boston has a team in Major League Soccer.
========================
Got response: The sports teams in Boston include:

1. Boston Red Sox: The Boston Red Sox are a professional baseball team and a member of Major League Baseball (MLB). They play their home games at Fenway Park, which is the oldest ballpark in MLB.

2. New England Patriots: The New England Patriots are a professional football team and a member of the National Football League (NFL). They play their home games at Gillette Stadium in Foxborough, Massachusetts.

3. Boston Celtics: The Boston Celtics are a professional basketball team and a member of the National Basketball Association (NBA). They play their home games at TD Garden.

4. Boston Bruins: The Boston Bruins are a professional ice hockey team and a member of the National Hockey League (NHL). They also play their home games at TD Garden.

5. New England Revolution: The New England Revolution is a professional soccer team and a member of Major League Soccer (MLS). They play their home games at Gillette Stadium.

These are the major sports teams in Boston, representing baseball, football, basketball, hockey, and soccer.
[recursive_retriever] Q: What are the positive aspects of Seattle?
Retrieving with query id None: What are the positive aspects of Seattle?
Retrieved node with id, entering: Seattle
Retrieving with query id Seattle: What are the positive aspects of Seattle?
=== Calling Function ===
Calling function: summary_tool with args: {
"input": "positive aspects of Seattle"
}
Got output: Seattle offers a vibrant urban environment with a thriving economy, especially in the technology sector, and is home to major companies like Microsoft and Amazon. The city also has a rich cultural scene, with a strong musical heritage in jazz and rock music. Additionally, Seattle is surrounded by breathtaking natural landscapes, including the stunning Puget Sound, Lake Washington, and the majestic Cascade Range.
========================
Got response: Some positive aspects of Seattle include:

1. Thriving Economy: Seattle is known for its strong economy, particularly in the technology sector. It is home to major companies like Microsoft, Amazon, and Boeing, providing numerous job opportunities and contributing to the city's prosperity.

2. Cultural Scene: Seattle has a vibrant cultural scene with a rich musical heritage. The city has been a hub for jazz and rock music, producing famous artists like Jimi Hendrix and Nirvana. It also has a thriving theater, art, and film community, offering a diverse range of cultural experiences.

3. Natural Beauty: Seattle is surrounded by stunning natural landscapes. The city is located on the shores of the Puget Sound, offering beautiful waterfront views and opportunities for water activities. It is also close to Lake Washington and the Cascade Range, providing ample opportunities for outdoor recreation, including hiking, skiing, and boating.

4. Green City: Seattle is known for its commitment to sustainability and environmental consciousness. The city has a strong focus on recycling, renewable energy, and green initiatives. It has a well-developed public transportation system and is bike-friendly, making it easy to navigate without a car.

5. Education and Research: Seattle is home to several prestigious universities and research institutions, including the University of Washington and Fred Hutchinson Cancer Research Center. These institutions contribute to the city's intellectual and scientific advancements and provide opportunities for education and research.

6. Food and Coffee Culture: Seattle is renowned for its food and coffee culture. The city is known for its diverse culinary scene, offering a wide range of international cuisines and fresh seafood. Seattle is also the birthplace of Starbucks and has a thriving coffee culture, with numerous local coffee shops and roasters.

These positive aspects make Seattle an attractive city to live in, work, and explore.
[recursive_retriever] A: The sports teams in Boston include the Boston Red Sox, New England Patriots, Boston Celtics, Boston Bruins, and New England Revolution.
[recursive_retriever] A: Seattle has a thriving economy, a vibrant cultural scene, stunning natural beauty, a commitment to sustainability, renowned educational and research institutions, and a diverse food and coffee culture. These positive aspects make Seattle an attractive city to live in, work, and explore.
The sports teams in Boston include the Boston Red Sox, New England Patriots, Boston Celtics, Boston Bruins, and New England Revolution. Seattle, on the other hand, has a thriving economy, a vibrant cultural scene, stunning natural beauty, a commitment to sustainability, renowned educational and research institutions, and a diverse food and coffee culture. These positive aspects make Seattle an attractive city to live in, work, and explore.

Key Takeaways

As we saw from the previous response, for the complex query “Tell me about the sports teams in Boston and the positive aspects of Seattle,” the system demonstrates its proficiency by dissecting it into two coherent sub-questions. Initially, it seeks to identify the sports teams in Boston and subsequently explores the positive aspects of Seattle. The Recursive Retriever selects the correct index node, corresponding to the Wikipedia page for each city (Boston and Seattle), ensuring that the responses are derived from contextually relevant sources. For the first sub-question, it navigates through the Boston node, while for the second, it explores the Seattle node. Information is then retrieved from the vector and summary indexes, providing detailed insights about the sports teams and positive aspects, respectively. The Document Agents, equipped with various query engines, ensure precise retrieval and management of information from the documents, while the Sub Question Query Engine synthesizes the responses, ensuring they are comprehensive and contextually rich.

Summary

Navigating through the intricacies of complex queries, the implemented system seamlessly integrates Recursive Retrieval, Document Agents, and Sub Question Query Decomposition to provide coherent and contextually enriched responses. The Recursive Retriever focuses on utilizing document summaries to maintain a robust understanding of the queried topic, while Document Agents ensure targeted information retrieval within documents. Furthermore, the Sub Question Query Engine skillfully manages multi-dimensional analysis across various documents, ensuring that responses are not only accurate but also offer a comprehensive view, effectively addressing complex, multi-source inquiries without compromising on the depth and quality of the information provided.