Crafting Intelligent Agents for Dynamic Query Resolution: Multi-Approach Workflows with Web and LinkedIn Search: LangGraph, Local Llama and LangChain
With the rapid advancements in AI, the demand for robust, intelligent question-answering (QA) systems has grown. This blog will guide you through building a flexible QA system using LangChain, LangGraph, and vector databases. Our design dynamically routes questions to relevant sources, retrieves documents, and generates accurate responses. We’ll also explore three different architectural approaches to execute these workflows, which are visualized in the image below.
Each workflow approach provides unique advantages in handling various types of queries. Let's explore how this system was built, step by step. The code can be found here.
Prerequisites and Installation
To set up the environment for our QA system, install the following libraries:
# Install required libraries
!pip install langgraph
!pip install langchain-nomic
!pip install -U langchain-ollama
!pip install tiktoken
!pip install langchain-community
!pip install gpt4all
These libraries enable document retrieval, vector embedding, workflow management, and interaction with pre-trained models.
Architectural Overview and Approaches
Our QA system is designed with three distinct approaches for routing, retrieving, and generating answers. The workflow diagram above illustrates these three approaches:
- Approach 1 (Left) — Dynamic Routing with Maximum Retries: This approach handles queries by first checking LinkedIn (for profile-related queries), then accessing a vector store, followed by a fallback to web search if documents are insufficient. It dynamically retries the answer generation step a set number of times to maximize response accuracy.
- Approach 2 (Center) — Streamlined Retrieval with Conditional Generation: This streamlined workflow reduces redundancy by focusing on retrieving and grading documents from the vector store, followed by generating an answer. It only falls back to web search if no useful documents are found, minimizing the number of retries and enhancing efficiency.
- Approach 3 (Right) — Minimal Retry with LinkedIn Integration: This approach focuses on retrieving documents with minimal retries, providing a straightforward path from retrieval to answer generation. It integrates LinkedIn redirection for profile queries but emphasizes a faster conclusion if no relevant information is found.
Each approach has unique benefits depending on the complexity of the query and desired response accuracy.
Step 1: Setting Environment Variables
Our system relies on several APIs, such as Tavily for web searches. Setting these environment variables securely with getpass helps protect sensitive information:
import os
import getpass

def _set_env(var: str):
    # Prompt for the value only if it isn't already set in the environment
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("TAVILY_API_KEY")
_set_env("LANGCHAIN_API_KEY")
Step 2: Document Retrieval with Vector Store
To retrieve relevant documents, we use a vector database built with NomicEmbeddings. This allows for efficient semantic search on topics relevant to the user's question.
from langchain_community.vectorstores import SKLearnVectorStore
from langchain_nomic.embeddings import NomicEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

# Load documents and split text
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(chunk_size=1000, chunk_overlap=200)
doc_splits = text_splitter.split_documents(docs_list)

# Create vector store and retriever
vectorstore = SKLearnVectorStore.from_documents(
    documents=doc_splits,
    embedding=NomicEmbeddings(model="nomic-embed-text-v1.5", inference_mode="local"),
)
retriever = vectorstore.as_retriever(k=3)
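To confirm the retriever works, you can run a quick check (the question text below is just an illustrative example):
# Quick sanity check of the retriever
sample_docs = retriever.invoke("What are the components of an LLM-powered agent?")
for doc in sample_docs:
    print(doc.metadata.get("source"), "-", doc.page_content[:100])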
Step 3: Query Routing
Our routing system directs queries to the appropriate source (vector store, web search, or LinkedIn). We’ve designed a custom routing function that identifies profile queries and redirects them to LinkedIn.
import re

def route_question(state):
    """Route question to vectorstore, web search, or LinkedIn based on content."""
    question = state["question"]
    person_keywords = re.compile(r"(who is|tell me about|profile of|bio of|biography of) (.+)", re.IGNORECASE)
    match = person_keywords.match(question)
    if match:
        # Profile-style questions are redirected to LinkedIn
        state["person_name"] = match.group(2).strip()
        return "linkedin_redirect"
    else:
        return "vectorstore"
This ensures that biography-related queries get a direct LinkedIn profile link, while other questions are handled by the vector store or web search.
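For instance, a small illustrative check (these calls are not part of the workflow itself):
# Profile-style question -> LinkedIn; technical question -> vector store
print(route_question({"question": "Who is Lilian Weng?"}))           # linkedin_redirect
print(route_question({"question": "What is prompt engineering?"}))   # vectorstore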
Step 4: Grading Document Relevance
Each document retrieved is graded for relevance. If the retrieved documents lack relevant information, the system initiates a web search to improve the answer quality.
def grade_documents(state):
    """Check if retrieved documents are relevant."""
    question = state["question"]
    documents = state["documents"]
    filtered_docs = []
    web_search = "No"
    for d in documents:
        # Simple keyword heuristic: keep documents that mention the question text,
        # otherwise flag the state so the workflow falls back to web search
        if d.page_content and question in d.page_content:
            filtered_docs.append(d)
        else:
            web_search = "Yes"
    return {"documents": filtered_docs, "web_search": web_search}
Step 5: Answer Generation
After gathering relevant documents, we generate an answer using a pre-trained model. The documents are formatted into a prompt, and the response is generated based on this context.
def generate(state):
    """Generate answer from retrieved documents."""
    question = state["question"]
    documents = state["documents"]
    # Concatenate the retrieved documents into a single context block
    docs_txt = "\n\n".join([doc.page_content for doc in documents])
    rag_prompt_formatted = f"Answer based on context:\n{docs_txt}\n\nQuestion: {question}\nAnswer:"
    generation = llm.invoke([HumanMessage(content=rag_prompt_formatted)])
    return {"answer": generation.content}
Step 6: Handling LinkedIn Redirects
For biography-related queries, we redirect the user to the LinkedIn profile of the person in question:
def handle_linkedin_redirect(state):
    """Redirect to LinkedIn for profile-related queries."""
    person_name = state["person_name"].replace(" ", "-")
    linkedin_url = f"https://in.linkedin.com/in/{person_name}"
    print(f"LinkedIn Profile: {linkedin_url}")
    # Return the URL as the answer so LangGraph can merge it back into the state
    return {"answer": linkedin_url}
Executing the Workflow with Different Approaches
The final step is to define and execute the workflow, connecting all nodes for routing, retrieving, grading, generating, and handling redirects. Here’s the setup for each approach:
Approach 1: Dynamic Routing with Maximum Retries
This approach prioritizes dynamic retrieval and maximum retries for complex queries, optimizing for accuracy and completeness.
Approach 2: Streamlined Retrieval with Conditional Generation
In this streamlined workflow, document retrieval is prioritized with conditional generation based on document relevance, allowing for a more efficient response.
Approach 3: Minimal Retry with LinkedIn Integration
This approach minimizes retries, focusing on a straightforward retrieval-to-response pipeline with LinkedIn integration for profile queries.
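The graph code below also references a GraphState schema and two node functions, retrieve and web_search, that are not shown earlier in the post. Here is one possible minimal sketch, assuming a TypedDict state (field names inferred from how the other nodes read the state) and Tavily's search tool for the web fallback:
from typing import List
from typing_extensions import TypedDict
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.documents import Document

class GraphState(TypedDict, total=False):
    question: str      # user question
    person_name: str   # name extracted for LinkedIn redirects
    documents: List    # retrieved documents
    web_search: str    # "Yes"/"No" flag set by the grader
    answer: str        # final generated answer

def retrieve(state):
    """Pull the top documents from the vector store for the question."""
    return {"documents": retriever.invoke(state["question"])}

def web_search(state):
    """Fall back to Tavily web search and wrap the results as Documents."""
    results = TavilySearchResults(max_results=3).invoke({"query": state["question"]})
    web_docs = [Document(page_content=r["content"]) for r in results]
    return {"documents": state.get("documents", []) + web_docs}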
Here’s the code snippet to compile the workflow graph:
from langgraph.graph import StateGraph, END
from IPython.display import Image, display

workflow = StateGraph(GraphState)

# Define nodes for the workflow
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)
workflow.add_node("linkedin_redirect", handle_linkedin_redirect)
workflow.add_node("websearch", web_search)

# Build workflow with conditional routing
workflow.set_conditional_entry_point(
    route_question,
    {
        "linkedin_redirect": "linkedin_redirect",
        "vectorstore": "retrieve",
        "websearch": "websearch",
    },
)

# Add workflow edges
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    lambda state: "generate" if state["web_search"] == "No" else "websearch",
    {
        "generate": "generate",
        "websearch": "websearch",
    },
)
workflow.add_edge("websearch", "generate")
workflow.add_edge("generate", END)
workflow.add_edge("linkedin_redirect", END)

# Compile and display the workflow graph
graph = workflow.compile()
display(Image(graph.get_graph().draw_mermaid_png()))
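Once compiled, you can trace which nodes fire for a given question by streaming the graph (the question below is just an example):
# Stream node-by-node updates to see which path a question takes through the graph
for step in graph.stream({"question": "What is an adversarial attack on an LLM?"}):
    print(list(step.keys()))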
Testing the System
You can test the system with different types of questions to see how each approach handles them:
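process_question is not defined in the snippets above; a minimal version, assuming it simply runs the compiled graph and prints whatever answer ends up in the final state:
def process_question(question: str):
    """Run the compiled workflow for one question and print the result."""
    final_state = graph.invoke({"question": question})
    print(final_state.get("answer", "No answer generated."))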
question = input("Please enter your question: ")
process_question(question)
The system behavior varies based on the approach used, balancing response time and accuracy for each question type.
Conclusion
This QA system demonstrates how different architectural approaches can be employed to answer various types of queries. By leveraging LangChain, LangGraph, and other NLP tools, we’ve created a robust and flexible QA framework that provides accurate, concise responses, adapting its workflow based on the type of question.
The three approaches discussed here — dynamic routing with maximum retries, streamlined retrieval with conditional generation, and minimal retry with LinkedIn integration — offer versatility for diverse use cases, from customer support to research assistance. This modular design can be further customized with additional data sources or tailored workflows to meet the specific needs of any information retrieval task.
Reference
https://langchain-ai.github.io/langgraph/tutorials/rag/langgraph_adaptive_rag_local/