[Course Notes] Building Agentic RAG with LlamaIndex — Part 1: Router Query Engine

Ting
14 min read · May 26, 2024


Course Information

Course: Building Agentic RAG with LlamaIndex

Instructor: Jerry Liu (co-founder and CEO of LlamaIndex)

Platform: DeepLearning.AI, the education site founded by Andrew Ng

Course link: https://www.deeplearning.ai/short-courses/building-agentic-rag-with-llamaindex/

Cost: free

Course overview:

Jerry Liu, co-founder and CEO of LlamaIndex, teaches how to build agentic RAG using LlamaIndex.

Part 1: Router Query Engine (these notes)

Part2: Tool Calling

Part3: Building an Agent Reasoning Loop

Part4: Building a Multi-Document Agent

Part 1: Router Query Engine Notes

Overview

We build the simplest form of agentic RAG: a router. Given a query, the router picks either the question-answering (Q&A) engine or the summarization engine and runs the query against a single document.
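Conceptually, the router's job can be sketched in a few lines of plain Python (a toy keyword heuristic stands in for the LLM's choice; none of these names are LlamaIndex APIs):

```python
# Toy sketch of a router: inspect the query, pick one of several
# engines by its description, then run the chosen engine.

def summarize(query: str) -> str:
    return "summary of the whole document"

def qa(query: str) -> str:
    return "answer grounded in specific chunks"

TOOLS = [
    ("summarization questions", summarize),
    ("specific-context questions", qa),
]

def route(query: str) -> str:
    # A real router asks an LLM to pick the best tool from the
    # descriptions; here a trivial keyword check stands in for it.
    idx = 0 if "summary" in query.lower() else 1
    description, engine = TOOLS[idx]
    print(f"Selecting query engine {idx}: {description}")
    return engine(query)

print(route("What is the summary of the document?"))
```

The real RouterQueryEngine replaces the keyword check with an LLM call that reads each tool's description, which is why the descriptions set later in these notes matter.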

DeepLearning.AI provides a ready-to-run environment and free OpenAI API access inside the course. To be able to reuse this on my own system later, the code notes below are the result of porting everything to run on my local machine.

Setup

Install llama-index with pip:

!pip install llama-index

Set the OpenAI API token (requires a paid account on the OpenAI site):

import os
os.environ['OPENAI_API_KEY'] = "sk-XXXX"  # fill in your own API key

Many modules use asyncio, so apply nest_asyncio:

import nest_asyncio

nest_asyncio.apply()

Load Data

As an example we use this quantum optics paper (arXiv:1908.03034) in PDF form. After downloading it to a local folder, paste in its path.

from llama_index.core import SimpleDirectoryReader
# read the file
documents = SimpleDirectoryReader(input_files=["pdf-quantumoptics/1908.03034v1.pdf"]).load_data()

documents is a list. When a PDF is read with the default reader, each page is generally stored as one element of the list; the length of documents is exactly the 25 pages of this PDF.

print(len(documents))
# 25

Split into Nodes and Configure the LLM and Embedding Model

Split the documents above into nodes (also called chunks) with a fixed size of 1024. Nodes are the basic unit in which LlamaIndex processes data.

from llama_index.core.node_parser import SentenceSplitter

# configure how documents are split into nodes
splitter = SentenceSplitter(chunk_size=1024)
# split the documents into nodes using the splitter settings above
nodes = splitter.get_nodes_from_documents(documents)
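As a rough illustration of what chunking does (a naive fixed-size splitter; the real SentenceSplitter also respects sentence boundaries and adds overlap, which this sketch omits):

```python
# Naive fixed-size chunking: cut the text into pieces of at most
# chunk_size characters. SentenceSplitter is smarter than this,
# but the basic idea is the same.

def naive_split(text: str, chunk_size: int) -> list[str]:
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

chunks = naive_split("a" * 2500, chunk_size=1024)
print(len(chunks))  # 3 chunks: 1024 + 1024 + 452 characters
```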

Configure the LLM and the embedding model.

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

Define the Summary Index and Vector Index

Two index types are used to organize the nodes: 1. VectorStoreIndex and 2. SummaryIndex.

  1. VectorStoreIndex: maps the document content into an embedding space. Depending on the user's query, it returns specific nodes/chunks to the LLM as reference material for the answer. Note that VectorStoreIndex uses the embed_model configured above to compute embeddings; with the default OpenAI "text-embedding-ada-002", those calls are also billed.
  2. SummaryIndex: stores the nodes in order at indexing time. At query time it iterates over all of the nodes (optionally through a filter) and extracts summary information from them, so the returned content barely changes with the query.
from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)
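The practical difference between the two indexes can be sketched with toy 2-d "embeddings" (hypothetical values; the real indexes use the configured embedding model and their own retrievers):

```python
import math

# Toy contrast between the two index types (not the real API):
# vector retrieval scores every node against the query embedding
# and keeps only the top-k, while summary-style querying visits
# every node regardless of the query.

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Pretend each node already has a 2-d embedding.
nodes = {"node0": [1.0, 0.0], "node1": [0.0, 1.0], "node2": [0.7, 0.7]}

def vector_retrieve(query_emb: list[float], top_k: int = 2) -> list[str]:
    ranked = sorted(nodes, key=lambda n: cosine(query_emb, nodes[n]), reverse=True)
    return ranked[:top_k]   # only the most similar nodes

def summary_retrieve() -> list[str]:
    return list(nodes)      # every node, regardless of the query

print(vector_retrieve([1.0, 0.1]))  # ['node0', 'node2']
print(summary_retrieve())           # ['node0', 'node1', 'node2']
```

This is why, later on, the summary engine's source_nodes count equals the total number of nodes while the vector engine only cites a few.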

Define the Query Engines and Their Metadata

Convert each index into a query engine so that questions can be answered from the document content.

summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)

vector_query_engine = vector_index.as_query_engine()

Set up the query tools. A query tool simply attaches metadata to a query engine; the description tells the router in which situations to pick the summary engine versus the vector engine.

from llama_index.core.tools import QueryEngineTool


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to Quantum optics"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the Quantum optics paper."
    ),
)

Define the Router Query Engine

There are several types of selectors.

  • LLM selector: prompts the LLM to parse the query, decide which index fits, and emit the selection as JSON
  • Pydantic selector: uses OpenAI's function-calling API to produce a Pydantic selection object (I don't fully understand this one yet)
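The LLM selector's output boils down to a chosen tool plus a reason; handling such a payload might look like this (the exact JSON schema here is illustrative, not LlamaIndex's):

```python
import json

# Hypothetical illustration of an LLM selector's JSON output:
# an object naming the chosen tool (by number) and a reason.
raw = '{"choice": 1, "reason": "The question asks for a summary."}'

selection = json.loads(raw)
tool_names = ["summary_tool", "vector_tool"]

# Treat the choice as 1-based, matching the "choice 1" wording
# seen in the router's verbose output below.
chosen = tool_names[selection["choice"] - 1]
print(chosen, "-", selection["reason"])
```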

Here we use the first kind, LLMSingleSelector.

from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector


query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),  # use LLMSingleSelector as the selector
    query_engine_tools=[  # the two tools above are the candidate query engines
        summary_tool,
        vector_tool,
    ],
    verbose=True,
)

Run a Query: Summarization

Ask the query engine to summarize the PDF.

response = query_engine.query("What is the summary of the document?")
print(str(response))

Output:

## The router first reports that, based on the query, it selected the summary tool, with a brief reason
Selecting query engine 0: The document is likely a summary of Quantum optics,
making choice 1 the most relevant..


## The generated summary
The document provides a comprehensive overview of the application of quantum
states of light in advanced imaging techniques, covering topics such as the
generation of entangled photons, quantum correlations in imaging, improved
imaging with quantum light, superresolution in quantum imaging, new imaging
techniques like ghost imaging and interaction-free measurements, and the
outlook for quantum imaging technologies. It discusses experiments and
theoretical progress in quantum optics, including quantum interference,
quantum metrology, and quantum technologies, with a focus on areas such as
interaction-free measurements, ghost imaging, quantum entanglement, and
quantum-enhanced measurements. The work also explores applications of quantum
optics in microscopy, lithography, and spectroscopy, highlighting
advancements in quantum imaging technologies and their potential
commercialization in the near future.

You can check response.source_nodes to see which source nodes were used for the answer. The summary engine returns every node (chunk), so the number of source_nodes is exactly the number of input nodes.

print(len(response.source_nodes))
## 40

Run a Query: Answering from the Text

response = query_engine.query("How does idler light take photo?")
print(str(response))

Appendix: Consolidated Code

The code above, reorganized into a function.

Define the function

from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import SummaryIndex, VectorStoreIndex
from llama_index.core.tools import QueryEngineTool
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
import os
import nest_asyncio

nest_asyncio.apply()
os.environ['OPENAI_API_KEY'] = "sk-XXXX"  # fill in your own API key



def get_router_query_engine(file_path: str, llm=None, embed_model=None):
    """Get router query engine."""
    llm = llm or OpenAI(model="gpt-3.5-turbo")
    embed_model = embed_model or OpenAIEmbedding(model="text-embedding-ada-002")

    # load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()

    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)

    summary_index = SummaryIndex(nodes)
    vector_index = VectorStoreIndex(nodes, embed_model=embed_model)

    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
        llm=llm,
    )
    vector_query_engine = vector_index.as_query_engine(llm=llm)

    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_query_engine,
        description=(
            "Useful for summarization questions related to Quantum optics"  # change to your own description
        ),
    )

    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_query_engine,
        description=(
            "Useful for retrieving specific context from the Quantum optics paper."  # change to your own description
        ),
    )

    query_engine = RouterQueryEngine(
        selector=LLMSingleSelector.from_defaults(),
        query_engine_tools=[
            summary_tool,
            vector_tool,
        ],
        verbose=True,
    )
    return query_engine

Run

Specify the input PDF path

query_engine = get_router_query_engine("pdf-quantumoptics/1908.03034v1.pdf")
# change to the path of your own PDF file

Run a query

response = query_engine.query("What is the summary of the document?")
print(str(response))


Ting

An engineer based in Japan. Specializes in optical measurement system optimization and data analysis. Sharing notes on daily life in Japan and self-studied interests.