Building an Ayurveda Healthcare Multi-PDF Agent with SingleStore and LlamaIndex

Jun 21, 2024

Introduction

With the advent of AI agents and retrieval-augmented generation (RAG), the healthcare industry is undergoing a significant transformation. It affects many areas, including patient demographics, medical history, treatment recommendations, and drug development. Imagine an AI agent capable of providing information on proper diet, yoga, herbal remedies, and medicines. The adage "prevention is better than cure" aligns well with the principles of Ayurveda, which offers comprehensive guidance on maintaining a healthy lifestyle.

Developing AI agents and RAG applications over a single PDF document is easy, but working with multiple PDFs introduces challenges. To address this, we implement a query pipeline that optimizes retrieval using HyDE. Before delving deeper into creating an Ayurveda Healthcare Multi-PDF agent, let’s understand what HyDE is.

An Overview of HyDE

HyDE (Hypothetical Document Embeddings) is a retrieval method that uses a large language model to generate hypothetical documents from the user's query. These documents aim to capture the essence and context of the query. HyDE then encodes them into dense vectors using a contrastive encoder and compares those vectors with the vectors of the real documents in the retrieval corpus. The real documents most similar to the hypothetical ones are retrieved and presented to the user.

Source: Precise Zero-Shot Dense Retrieval without Relevance Labels

HyDE has demonstrated outstanding performance in various retrieval tasks such as web search, low-resource retrieval, and multilingual retrieval.
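The retrieval flow described above can be sketched in a few lines of plain Python. This is only a toy illustration under loud assumptions: `fake_llm` is a hypothetical stand-in for the LLM call, `embed` is a bag-of-words counter rather than a contrastive dense encoder, and the three-sentence corpus is made up for the example.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a dense encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_llm(query: str) -> str:
    """Hypothetical stand-in for the LLM that drafts a hypothetical answer."""
    return "Ginger tea and warm water ease indigestion and bloating after meals."


def hyde_retrieve(query: str, corpus: List[str], top_k: int = 1) -> List[str]:
    # 1. Generate a hypothetical document for the query.
    hypothetical_doc = fake_llm(query)
    # 2. Embed the hypothetical document instead of the raw query.
    query_vec = embed(hypothetical_doc)
    # 3. Rank the real documents by similarity to the hypothetical one.
    ranked = sorted(corpus, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:top_k]


corpus = [
    "Yoga postures such as the cobra pose strengthen the spine.",
    "Warm ginger tea after meals helps with bloating and indigestion.",
    "Sesame oil massage is recommended for dry skin.",
]
best = hyde_retrieve("What helps an upset stomach?", corpus)
```

The point of the sketch is the order of operations: the search vector comes from the generated text, not from the user's question.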

Implementing Query Pipelines with HyDE on SingleStore Helios

We will use SingleStore Helios, whose free shared tier includes a complimentary starter workspace for developing apps at no cost. Sign up for the free shared tier to get the starter workspace.

The starter workspace comes with one attached database. In SingleStore Helios, we can use Notebooks and Secrets without installing the SingleStore library ourselves. We can also secure outbound access from our notebooks by adding the external endpoints to the Firewall.

Let’s code!

To get started, create a notebook in Notebooks under the Develop page in the cloud portal (SingleStore Helios). Install and import the required dependencies.

!pip install -q llama-index
!pip install -q pyvis
!pip install -q llama-index-embeddings-huggingface
!pip install -q llama-index-llms-openai
!pip install -q -U llama-index-vector-stores-singlestoredb

import logging
import os
import sys
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.node_parser.text import SentenceSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core.query_engine import TransformQueryEngine
from IPython.display import Markdown, display
from llama_index.vector_stores.singlestoredb import SingleStoreVectorStore
from llama_index.core.tools import QueryEngineTool, ToolMetadata, BaseTool
from llama_index.core.settings import Settings
from llama_index.core.callbacks import CallbackManager
from llama_index.core.agent.react.types import (
    ActionReasoningStep,
    ObservationReasoningStep,
    ResponseReasoningStep,
)
from llama_index.core.agent import Task, AgentChatResponse, ReActChatFormatter, QueryPipelineAgentWorker
from llama_index.core.agent.react.output_parser import ReActOutputParser
from llama_index.core.query_pipeline import (
    AgentInputComponent,
    AgentFnComponent,
    CustomAgentComponent,
    QueryComponent,
    ToolRunnerComponent,
    InputComponent,
    Link,
    QueryPipeline 
)
from llama_index.core.llms import MessageRole, ChatMessage, ChatResponse
from typing import Dict, Any, Optional, Tuple, List, Set, cast
from singlestoredb.management import get_secret
from pyvis.network import Network
from IPython.display import display, HTML

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

callback_manager = CallbackManager()
Settings.callback_manager = callback_manager

To create an Ayurveda Healthcare Multi-PDF agent, I gathered 40 Ayurveda ebooks and utilized my GitHub repository to import the data into the SingleStore Notebooks.

!git clone <link_to_github_repo>

To work with SingleStore, the first step is to get the connection information. The connection information consists of the username, password, database name, host, and port, which can be obtained from any of the clients while connecting to the starter workspace.

I accessed this information from the SQL IDE.

We can save that information in Secrets. Along with it, we should also save the OpenAI API key, since we are going to use an OpenAI model (GPT-3.5 Turbo, as configured later).

Let’s retrieve the secrets.

user = get_secret('user')
password = get_secret('password')
host = get_secret('host')
database = get_secret('database')
openai = get_secret('openai')

Now, we will initialize the vector store using the information saved in Secrets.

vector_store = SingleStoreVectorStore(
    table_name="embeddings",
    content_field="content",
    metadata_field="metadata",
    vector_field="vector",
    host = host,
    port = "3333",
    database = database,
    user = user,
    password = password,
)

We will wrap the vector store in a storage context, because we will use it to create the vector index.

storage_context = StorageContext.from_defaults(vector_store=vector_store)

Next, we will create embeddings, which will be required to create the vector store index.

embed_model = HuggingFaceEmbedding()

Settings.embed_model = embed_model

Since we have 40 PDFs, we will load each one and keep its index in a dictionary keyed by file name. For each PDF, we first try to load a previously persisted index from disk; if that fails, we split the documents into nodes with the sentence splitter and build a vector store index from them using the storage context.
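The chunk-to-node step can be pictured with a small stdlib-only sketch. It is an assumption-laden simplification: `split_text` here cuts on a fixed word count, whereas the real sentence splitter works on tokens and respects sentence boundaries, and the `docs` records are invented for the example.

```python
from typing import Dict, List


def split_text(text: str, chunk_size: int) -> List[str]:
    """Split text into chunks of at most chunk_size words (crude stand-in for a sentence splitter)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def build_nodes(docs: List[Dict], chunk_size: int) -> List[Dict]:
    """Turn each document into chunk-sized nodes that inherit the source document's metadata."""
    nodes = []
    for doc in docs:
        for chunk in split_text(doc["text"], chunk_size):
            # Copy the metadata so later edits to one node do not leak into others.
            nodes.append({"text": chunk, "metadata": dict(doc["metadata"])})
    return nodes


# Hypothetical two-page document.
docs = [
    {"text": "one two three four five", "metadata": {"page": 1}},
    {"text": "six seven", "metadata": {"page": 2}},
]
nodes = build_nodes(docs, chunk_size=2)
```

Each node keeps a pointer back to where it came from, which is what lets a retrieved chunk be traced to its page and file.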



pdf_directory = "./AyurvedicData"

pdf_files = [file for file in os.listdir(pdf_directory) if file.endswith('.pdf')]

storage_base_dir = "./storage"
Settings.transformations = [SentenceSplitter(chunk_size=1024)]

def generate_storage_dir(file_name):
    base_name = os.path.splitext(file_name)[0]  # Remove extension
    safe_name = base_name.replace(' ', '_').replace('&', 'and')
    return os.path.join(storage_base_dir, safe_name)

all_indices_loaded = True

indices = {}

for pdf_file in pdf_files:
    try:
        storage_dir = generate_storage_dir(pdf_file)
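        # Use a separate name so the SingleStore-backed storage_context defined earlier is not overwritten.
        persisted_context = StorageContext.from_defaults(persist_dir=storage_dir)
        index = load_index_from_storage(persisted_context)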
        indices[pdf_file] = index
    except Exception as e:
        print(f"Failed to load index for {pdf_file}: {e}")
        all_indices_loaded = False

if not all_indices_loaded:
    for pdf_file in pdf_files:
        if pdf_file not in indices:  
            try:
                input_path = os.path.join(pdf_directory, pdf_file)
                docs = SimpleDirectoryReader(input_files=[input_path]).load_data()
                
                text_parser = SentenceSplitter(chunk_size=1024)
                text_chunks = []
                doc_idxs = []

                for doc_idx, doc in enumerate(docs):
                    cur_text_chunks = text_parser.split_text(doc.text)
                    text_chunks.extend(cur_text_chunks)
                    doc_idxs.extend([doc_idx] * len(cur_text_chunks))

                from llama_index.core.schema import TextNode
                nodes = []
                for idx, text_chunk in enumerate(text_chunks):
                    node = TextNode(text=text_chunk)
                    src_doc = docs[doc_idxs[idx]]
                    node.metadata = src_doc.metadata
                    nodes.append(node)
                
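                # Embed the manually built nodes and write them to the SingleStore table.
                VectorStoreIndex(nodes=nodes, storage_context=storage_context)
                # Build the per-PDF index that is persisted below and queried by the agent.
                # from_documents is a classmethod; without a storage_context it uses the default in-memory vector store.
                index = VectorStoreIndex.from_documents(docs, transformations=Settings.transformations)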
                
                storage_dir = generate_storage_dir(pdf_file)
                index.storage_context.persist(persist_dir=storage_dir)
                
                indices[pdf_file] = index
            except Exception as e:
                print(f"Failed to process {pdf_file}: {e}")

Now, the indices variable contains all the loaded or newly created indices. After creating the indices, we will initialize the LLM. Here we use OpenAI's GPT-3.5 Turbo.

llm = OpenAI(model="gpt-3.5-turbo", api_key=openai)  # key retrieved from Secrets above
Settings.llm = llm

For each item in indices, we will create HyDE query engines.

query_engines={}

for pdf_name, index in indices.items():
    try:
        query_engine = index.as_query_engine(similarity_top_k=3)
        
        hyde = HyDEQueryTransform(include_original=True)
        
        hyde_query_engine = TransformQueryEngine(query_engine, hyde)
        
        query_engines[pdf_name] = hyde_query_engine
        
        print(f"HyDE query engine created for {pdf_name}")

    except Exception as e:
        print(f"An error occurred while creating query engine for {pdf_name}: {e}")
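A note on `include_original=True`: my understanding is that the original query is kept alongside the generated hypothetical document, and that the embeddings of both strings are combined (by averaging) into the single vector used for the similarity search. Treat that as an assumption about library internals rather than a guarantee. The toy three-dimensional vectors below are made up purely to show the averaging arithmetic.

```python
from typing import List


def mean_vector(vectors: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical embeddings of the generated document and of the original query.
hypothetical_vec = [1.0, 0.0, 1.0]
original_vec = [0.0, 1.0, 1.0]

# The combined search vector sits halfway between the two.
query_vec = mean_vector([hypothetical_vec, original_vec])
```

Keeping the original query in the mix guards against a hypothetical document that drifts away from what the user actually asked.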

For each index, we will instantiate a QueryEngineTool using HyDE query engines.


query_engine_tools = []

for pdf_name, hyde_query_engine in query_engines.items():
    try:
        description = "Provides information about improving lifestyle, diet, and medicinal benefits using Ayurveda. Use a detailed plain text question as input to the tool."
        
        metadata = ToolMetadata(
            name=pdf_name.replace(".pdf", "").replace(" ", "_").lower(),
            description=description
        )
        
        tool = QueryEngineTool(
            query_engine=hyde_query_engine,
            metadata=metadata
        )
        
        query_engine_tools.append(tool)
        
        print(f"QueryEngineTool created for {pdf_name}")
    
    except Exception as e:
        print(f"An error occurred while creating QueryEngineTool for {pdf_name}: {e}")
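Because each tool name is derived from a file name, two PDFs can end up with the same tool name, and the agent would then be unable to tell them apart. The following is a minimal sketch of a pre-flight check, assuming a slightly stricter sanitizer than the one above; `tool_name`, `find_collisions`, and the file names are all hypothetical.

```python
import re
from typing import Dict, List


def tool_name(pdf_name: str) -> str:
    """Derive a lowercase, underscore-separated tool name from a PDF file name."""
    stem = pdf_name[:-4] if pdf_name.lower().endswith(".pdf") else pdf_name
    # Collapse every run of characters outside [a-z0-9_] into one underscore.
    return re.sub(r"[^a-z0-9_]+", "_", stem.lower()).strip("_")


def find_collisions(pdf_names: List[str]) -> Dict[str, List[str]]:
    """Return the tool names that more than one PDF maps to."""
    by_name: Dict[str, List[str]] = {}
    for pdf in pdf_names:
        by_name.setdefault(tool_name(pdf), []).append(pdf)
    return {name: pdfs for name, pdfs in by_name.items() if len(pdfs) > 1}


# Hypothetical file names for illustration.
names = ["Herbs & Diet.pdf", "herbs_diet.pdf", "Yoga Basics.pdf"]
collisions = find_collisions(names)
```

An empty result means every PDF gets a distinct tool name.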

We will now define the agent input component.

def agent_input_fn(task: Task, state: Dict[str, Any]) -> Dict[str, Any]:
    """Agent input function.

    Returns:
        A Dictionary of output keys and values. If you are specifying
        src_key when defining links between this component and other
        components, make sure the src_key matches the specified output_key.

    """
    if "current_reasoning" not in state:
        state["current_reasoning"] = []
    reasoning_step = ObservationReasoningStep(observation=task.input)
    state["current_reasoning"].append(reasoning_step)
    return {"input": task.input}


agent_input_component = AgentInputComponent(fn=agent_input_fn)

Next, we define the agent component responsible for creating the ReAct prompt. When input is received, the LLM is called with this prompt, and its output is then parsed into a structured object. ReAct combines chain-of-thought reasoning with acting (tool use), and the ReAct prompt component below follows that method.

def react_prompt_fn(
    task: Task, state: Dict[str, Any], input: str, tools: List[BaseTool]
) -> List[ChatMessage]:
    chat_formatter = ReActChatFormatter()
    return chat_formatter.format(
        tools,
        chat_history=task.memory.get() + state["memory"].get_all(),
        current_reasoning=state["current_reasoning"],
    )


react_prompt_component = AgentFnComponent(
    fn=react_prompt_fn, partial_dict={"tools": query_engine_tools}
)

After the LLM produces an output, we follow a decision tree process:

If an answer is generated, we directly process the output.
However, if an action is specified, we execute the designated tool with the specified arguments and then process the resulting output.
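The two branches above can be sketched in plain Python. This is a simplified illustration, not the library's parser: it assumes the LLM output uses bare `Action:`, `Action Input:`, and `Answer:` markers, and the `diet_guide` tool is a hypothetical lambda.

```python
import re
from typing import Callable, Dict, Tuple


def parse_react_output(text: str) -> Dict[str, str]:
    """Classify LLM output as a final answer or a tool action (simplified format)."""
    answer = re.search(r"Answer:\s*(.+)", text, re.DOTALL)
    if answer:
        return {"kind": "answer", "response": answer.group(1).strip()}
    action = re.search(r"Action:\s*(\S+)\s*Action Input:\s*(.+)", text, re.DOTALL)
    if action:
        return {"kind": "action", "tool": action.group(1), "tool_input": action.group(2).strip()}
    raise ValueError("could not parse output")


def handle(text: str, tools: Dict[str, Callable[[str], str]]) -> Tuple[str, bool]:
    """Return (response_text, is_done) following the decision tree."""
    step = parse_react_output(text)
    if step["kind"] == "answer":
        # Branch 1: an answer was generated, so we are done.
        return step["response"], True
    # Branch 2: run the named tool and hand its output back as an observation.
    observation = tools[step["tool"]](step["tool_input"])
    return observation, False


tools = {"diet_guide": lambda question: f"notes on {question}"}
tool_result = handle("Thought: I need a tool.\nAction: diet_guide\nAction Input: migraine", tools)
final_result = handle("Thought: I can answer.\nAnswer: Avoid daytime sleep.", tools)
```

The boolean in the returned pair plays the same role as the `is_done` flag in the functions that follow.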

To process the agent’s response, we will define some functions which will ultimately follow the decision tree process.

def parse_react_output_fn(
    task: Task, state: Dict[str, Any], chat_response: ChatResponse
):
    """Parse ReAct output into a reasoning step."""
    output_parser = ReActOutputParser()
    reasoning_step = output_parser.parse(chat_response.message.content)
    return {"done": reasoning_step.is_done, "reasoning_step": reasoning_step}


parse_react_output = AgentFnComponent(fn=parse_react_output_fn)


def run_tool_fn(
    task: Task, state: Dict[str, Any], reasoning_step: ActionReasoningStep
):
    """Run tool and process tool output."""
    tool_runner_component = ToolRunnerComponent(
        query_engine_tools, callback_manager=task.callback_manager
    )
    tool_output = tool_runner_component.run_component(
        tool_name=reasoning_step.action,
        tool_input=reasoning_step.action_input,
    )
    observation_step = ObservationReasoningStep(observation=str(tool_output))
    state["current_reasoning"].append(observation_step)

    return {"response_str": observation_step.get_content(), "is_done": False}


run_tool = AgentFnComponent(fn=run_tool_fn)


def process_response_fn(
    task: Task, state: Dict[str, Any], response_step: ResponseReasoningStep
):
    """Process response."""
    state["current_reasoning"].append(response_step)
    response_str = response_step.response
    state["memory"].put(ChatMessage(content=task.input, role=MessageRole.USER))
    state["memory"].put(
        ChatMessage(content=response_str, role=MessageRole.ASSISTANT)
    )

    return {"response_str": response_str, "is_done": True}


process_response = AgentFnComponent(fn=process_response_fn)


def process_agent_response_fn(
    task: Task, state: Dict[str, Any], response_dict: dict
):
    """Process agent response."""
    return (
        AgentChatResponse(response_dict["response_str"]),
        response_dict["is_done"],
    )


process_agent_response = AgentFnComponent(fn=process_agent_response_fn)

Now, we will create a query pipeline, where we will add the created components as modules. We will add a chain and links to the query pipeline.

qp = QueryPipeline(verbose=True)

qp.add_modules(
    {
        "agent_input": agent_input_component,
        "react_prompt": react_prompt_component,
        "llm": llm,
        "react_output_parser": parse_react_output,
        "run_tool": run_tool,
        "process_response": process_response,
        "process_agent_response": process_agent_response,
    }
)

qp.add_chain(["agent_input", "react_prompt", "llm", "react_output_parser"])

qp.add_link(
    "react_output_parser",
    "run_tool",
    condition_fn=lambda x: not x["done"],
    input_fn=lambda x: x["reasoning_step"],
)
qp.add_link(
    "react_output_parser",
    "process_response",
    condition_fn=lambda x: x["done"],
    input_fn=lambda x: x["reasoning_step"],
)

qp.add_link("process_response", "process_agent_response")
qp.add_link("run_tool", "process_agent_response")
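To make the role of `condition_fn` and `input_fn` concrete, here is a toy router written from scratch. It is a sketch of the idea only, not how the query pipeline is implemented: `ToyLink` and `run` are hypothetical names, and the three modules are trivial lambdas.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class ToyLink:
    """A directed edge that is followed only when condition_fn(output) is true."""
    src: str
    dest: str
    condition_fn: Optional[Callable[[Any], bool]] = None
    input_fn: Optional[Callable[[Any], Any]] = None


def run(modules: Dict[str, Callable[[Any], Any]], links: List[ToyLink],
        start: str, value: Any) -> Tuple[str, Any]:
    """Run modules along the first matching link until no link applies."""
    current = start
    while True:
        value = modules[current](value)
        matching = [link for link in links if link.src == current
                    and (link.condition_fn is None or link.condition_fn(value))]
        if not matching:
            return current, value
        link = matching[0]
        # input_fn reshapes the source output before it reaches the destination.
        value = link.input_fn(value) if link.input_fn else value
        current = link.dest


modules = {
    "parser": lambda text: {"done": text.startswith("Answer:"), "reasoning_step": text},
    "run_tool": lambda step: f"observation for {step}",
    "process_response": lambda step: f"final: {step}",
}
links = [
    ToyLink("parser", "run_tool", lambda x: not x["done"], lambda x: x["reasoning_step"]),
    ToyLink("parser", "process_response", lambda x: x["done"], lambda x: x["reasoning_step"]),
]
answer_path = run(modules, links, "parser", "Answer: rest well")
action_path = run(modules, links, "parser", "Action: lookup")
```

Only one of the two outgoing links fires for any given parser output, which is what turns the pipeline into a decision tree.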

To visualize the query pipeline, we will use Network from the Pyvis library.

net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(qp.clean_dag)
print(net)

The following will be the resultant pipeline.

{
    "Nodes": [
        "agent_input",
        "react_prompt",
        "llm",
        "react_output_parser",
        "run_tool",
        "process_response",
        "process_agent_response"
    ],
    "Edges": [
        {
            "src_key": null,
            "dest_key": null,
            "condition_fn": null,
            "input_fn": null,
            "width": 1,
            "from": "agent_input",
            "to": "react_prompt",
            "arrows": "to"
        },
        {
            "src_key": null,
            "dest_key": null,
            "condition_fn": null,
            "input_fn": null,
            "width": 1,
            "from": "react_prompt",
            "to": "llm",
            "arrows": "to"
        },
        {
            "src_key": null,
            "dest_key": null,
            "condition_fn": null,
            "input_fn": null,
            "width": 1,
            "from": "llm",
            "to": "react_output_parser",
            "arrows": "to"
        },
        {
            "src_key": null,
            "dest_key": null,
            "width": 1,
            "from": "react_output_parser",
            "to": "run_tool",
            "arrows": "to"
        },
        {
            "src_key": null,
            "dest_key": null,
            "width": 1,
            "from": "react_output_parser",
            "to": "process_response",
            "arrows": "to"
        },
        {
            "src_key": null,
            "dest_key": null,
            "condition_fn": null,
            "input_fn": null,
            "width": 1,
            "from": "run_tool",
            "to": "process_agent_response",
            "arrows": "to"
        },
        {
            "src_key": null,
            "dest_key": null,
            "condition_fn": null,
            "input_fn": null,
            "width": 1,
            "from": "process_response",
            "to": "process_agent_response",
            "arrows": "to"
        }
    ],
    "Height": "600px",
    "Width": "100%",
    "Heading": ""
}

We will save the network as an HTML file and display the content.

net.write_html("agent.html")

with open("agent.html", "r") as file:
    html_content = file.read()

display(HTML(html_content))

The next step is to set up the agent worker around the query pipeline.

agent_worker = QueryPipelineAgentWorker(qp)
agent = agent_worker.as_agent(
    callback_manager=CallbackManager([]), verbose=True
)

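We will create a task with the agent and ask questions. Note that run_step executes a single step of the pipeline; if step_output.is_last is false, call run_step again until it is true, then call agent.finalize_response(task.task_id) to obtain the final response.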


task = agent.create_task(
    "What are ayurvedic treatments and prevention of migraine?"
)

step_output = agent.run_step(task.task_id)
print(step_output)

The agent will give the following answer:

In Ayurveda, the treatment and prevention of migraines focus on restoring balance among the body’s doshas, particularly Vata and Kapha. Preventative measures include avoiding daytime sleep, protecting oneself from cold, especially in the mornings, and managing allergic conditions promptly. It also emphasizes dietary restrictions, such as avoiding yogurt, greasy foods, and the combination of milk with acidic foods or fish. Treatment involves the use of herbal remedies to dilate capillaries, reduce fluid exudation, and promote fluid filtration. Key herbs and compounds, such as Guggulu for capillary dilation, Vachadi for drying exudation, and Pippalishrita for inducing cleansing sneezing, are employed. Additionally, remedies like Shirashularivajrarasa and Sadbindutaila are used to restore the coordination between the nervous, arterial, and venous systems, essential for relieving and preventing migraine attacks.

Let’s try another question.

task = agent.create_task(
    "What dosha does the person have if his skin is dry and thin, and he also has joint pains?"
)

step_output = agent.run_step(task.task_id)
print(step_output)

The agent will give the following answer:

If a person has dry and thin skin along with joint pains, it indicates an imbalance in the Vata dosha. Vata governs all movement and is associated with dryness and coldness in the body. When Vata is out of balance, it can lead to symptoms such as dry, rough skin and discomfort in the joints, manifesting as pain or stiffness. These characteristics are typical signs of elevated Vata that require balancing to restore harmony in the body and alleviate the symptoms.
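Since the agent advances one step per call, a small driver loop is a convenient way to run a task to completion. The sketch below uses a scripted `StubAgent` so that it runs without any external services; the stub is hypothetical and only mimics a `run_step` method and an `is_last` flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepOutput:
    output: str
    is_last: bool


class StubAgent:
    """Hypothetical stand-in that replays scripted steps; not the LlamaIndex class."""

    def __init__(self, scripted_steps: List[StepOutput]):
        self._steps = list(scripted_steps)

    def run_step(self, task_id: str) -> StepOutput:
        return self._steps.pop(0)


def run_until_done(agent, task_id: str, max_steps: int = 10) -> str:
    """Call run_step repeatedly until a step reports is_last, with a safety cap."""
    for _ in range(max_steps):
        step = agent.run_step(task_id)
        if step.is_last:
            return step.output
    raise RuntimeError("agent did not finish within max_steps")


agent_stub = StubAgent([
    StepOutput("tool observation", False),  # first step: a tool was called
    StepOutput("final answer", True),       # second step: the answer is ready
])
answer = run_until_done(agent_stub, "task-1")
```

The `max_steps` cap is a design choice to keep a misbehaving agent from looping indefinitely.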

Conclusion

I found it interesting to build an Ayurvedic healthcare multi-PDF agent using the SingleStore vector database from within SingleStore Notebooks. The free shared tier is easy to use, and it handled all 40 ebooks in this walkthrough. It is also easy to transition this application to a production environment with the SingleStore free shared tier.

Thanks for reading!

References

Agent around query pipeline with HyDE for PDFs

LlamaIndex with SingleStoreDB