2 min read · May 19, 2024

Understanding key stages of RAG with ‘SODHPUCHH (सोधपुछ): A query engine built with LlamaIndex’ (Part III)

This article continues the previous articles (Part I and Part II), where we discussed the basics of RAG and the loading, indexing, and storing of data. Today, we will describe querying in detail with code examples.

Querying simply means asking questions and getting relevant answers. Examples of querying can be:

Question answering
Request for summarization
Reasoning loop across multiple components
Repeated prompt + LLM calls

Querying involves:

Retrieval: find and return the most relevant documents for our query from the Index.
Postprocessing: Nodes retrieved are optionally reranked, transformed, or filtered, for instance by requiring that they have specific metadata such as keywords attached.
Response synthesis: Our query, the most-relevant data and our prompt are combined and sent to the LLM to return a response.
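To make the three stages concrete, here is a minimal sketch in plain Python. It does not use LlamaIndex at all: the Node class, the word-overlap scoring, the keyword filter, the prompt layout, and the placeholder texts are all assumptions made for illustration. A real retriever would compare embeddings instead of counting shared words.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A chunk of text plus metadata (hypothetical stand-in for an index node)."""
    text: str
    metadata: dict = field(default_factory=dict)


def retrieve(query: str, nodes: list[Node], top_k: int = 2) -> list[Node]:
    """Retrieval: rank nodes by the number of words shared with the query."""
    query_words = set(query.lower().split())

    def overlap(node: Node) -> int:
        return len(query_words & set(node.text.lower().split()))

    ranked = sorted(nodes, key=overlap, reverse=True)
    # Keep the top_k nodes, dropping any that share no word with the query.
    return [n for n in ranked[:top_k] if overlap(n) > 0]


def postprocess(nodes: list[Node], required_keyword: str) -> list[Node]:
    """Postprocessing: keep only nodes whose metadata carries the keyword."""
    return [n for n in nodes if required_keyword in n.metadata.get("keywords", [])]


def synthesize_prompt(query: str, nodes: list[Node]) -> str:
    """Response synthesis: combine the query and the kept text into one prompt."""
    context = "\n".join(f"- {n.text}" for n in nodes)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Placeholder data, not taken from any real law document.
nodes = [
    Node("Placeholder text about the passport application procedure",
         {"keywords": ["passport"]}),
    Node("Placeholder text about how to apply for a driving licence",
         {"keywords": ["licence"]}),
    Node("Placeholder text about the role of the home minister",
         {"keywords": ["ministry"]}),
]

query = "how to apply for a passport"
retrieved = retrieve(query, nodes, top_k=2)                     # stage 1
filtered = postprocess(retrieved, required_keyword="passport")  # stage 2
prompt = synthesize_prompt(query, filtered)                     # stage 3
print(prompt)  # this string is what would be sent to the LLM
```

In this toy run the crude word-overlap retrieval also returns the driving licence text, and the postprocessing step is what removes it before the prompt is built. That is the kind of cleanup the optional postprocessing stage is meant for.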

Let’s see the following code for more clarity:

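from llama_index.core import Settings
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# law_docs is assumed to map each file title to the data loaded for that file.
for file_title, node in law_docs.items():
    # vector_index and summary_index are assumed to be the indexes built for
    # this file in the indexing step (Part II); that code is not repeated here.

    # Define query engines
    vector_query_engine = vector_index.as_query_engine(llm=Settings.llm)
    summary_query_engine = summary_index.as_query_engine(llm=Settings.llm)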

    # Define query engine tools
    query_engine_tools = [
        QueryEngineTool(
            query_engine=vector_query_engine,
            metadata=ToolMetadata(
                name="vector_tool",
                description=(
                    f"Useful for questions related to specific laws or rules and regulations related to {file_title}."
                ),
            ),
        ),
        QueryEngineTool(
            query_engine=summary_query_engine,
            metadata=ToolMetadata(
                name="summary_tool",
                description=(
                    f"Useful for any requests that require a holistic summary of everything about {file_title}."
                ),
            ),
        ),
    ]

In LlamaIndex, the query engine is the basis of all querying. ‘vector_index.as_query_engine()’ is the built-in method that creates a query engine from our index.

The code shown in this series of articles is incomplete. You may want to see the full code through this link. Besides the query engine, you may encounter things like query engine tools and agents. Please see the LlamaIndex docs to learn more about them.

For now, let’s see how our query engine answered our questions :)

1. I asked for the procedure to make a passport, and it answered the query correctly.

2. I also asked it about the role of the home minister.

SODHPUCHH is not perfect yet and may require further prompt engineering or other solutions. Feel free to give me recommendations. I hope you enjoyed this series.

Written by Prakash Chaulagain