How to Build an LLM RAG Model with Custom Tools and Agents!
In this blog post, we will look at how to use LangChain, a versatile library that empowers developers and researchers to create, experiment with, and analyze language models and agents, to build a retrieval-augmented generation (RAG) model. A RAG model pairs a large language model (LLM) with a retrieval step, so that relevant information is fetched from an external knowledge source before a response is generated. This can improve the quality and accuracy of the LLM's output, and make the system more transparent and trustworthy for users.
What is RAG?
Retrieval-augmented generation (RAG) is an AI framework that optimizes the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. Large language models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. However, LLMs have some limitations, such as:
- Presenting false or outdated information when they do not have the answer or the data is stale.
- Creating a response from non-authoritative or unreliable sources.
- Creating inaccurate responses due to terminology confusion, wherein different training sources use the same terminology to talk about different things.
RAG extends the already powerful capabilities of LLMs to specific domains or an organization’s internal knowledge base, all without the need to retrain the model. It is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts.
How does RAG work?
RAG works by combining two components: a retriever and a generator. The retriever is responsible for fetching relevant documents or data from an external knowledge source, such as a database, a search engine, or a web API. The generator is the LLM that takes the input query and the retrieved documents as context and produces a natural language response. The retriever and the generator can be trained jointly or separately, depending on the task and the data availability.
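To make this split concrete, here is a minimal, framework-agnostic sketch of the retrieve-then-generate loop. The search function and llm callable are hypothetical stand-ins for a real retriever and language model, not part of any particular library:

# Minimal retrieve-then-generate sketch; `search` and `llm` are
# hypothetical stand-ins for a real retriever and language model.
def rag_answer(query: str, search, llm, k: int = 3) -> str:
    # Retriever: fetch the k most relevant documents for the query.
    documents = search(query, top_k=k)
    # Generator: answer the query with the retrieved documents as context.
    context = "\n\n".join(documents)
    prompt = (f"Use the context below to answer the question.\n\n"
              f"Context:\n{context}\n\nQuestion: {query}")
    return llm(prompt)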
How to use LangChain to build a RAG model?
LangChain is a library that simplifies the integration of powerful language models into Python and JavaScript applications. It offers a rich set of features for natural language processing (NLP), from building custom models to manipulating text data efficiently. One of its features is the ability to create agents: entities that can understand and generate text, and take actions when programmed to do so. These agents can be configured with specific behaviors and data sources and set up to perform various language-related tasks, making them versatile tools for a wide range of applications. Numerous well-known services and technologies, such as Google, AWS, Microsoft, and Hugging Face, are already integrated with LangChain. But if you want to make full use of its possibilities, you can design your own custom tools.
One approach to building a RAG model with LangChain in Python uses the following steps:
- Importing the necessary modules from LangChain and the standard library.
- Creating a chat model object called gpt4_llm using the ChatOpenAI class. This class allows you to communicate with any OpenAI chat model. The object takes several parameters, such as the temperature, the API key, the model name, and the verbosity level.
- Defining a custom tool function using the tool decorator. This function takes a query string as input and returns a string as output. The tool's name and description (taken from the function name and docstring) are used by the LLM to decide whether or not to call it.
- Creating a retriever tool object using the create_retriever_tool function. This function creates a tool that can search and retrieve information from your indexed documents using a retriever. It takes three parameters: the retriever object, the name of the tool, and the description of the tool (here I usually use the prompt template for the RAG). Note: I have used db.as_retriever(), which turns a vector database into a retriever.
- Creating a list of tools to use with the LLM. In this case, the list contains two tools, custom_tool and retriever_tool.
- Creating a prompt object. The prompt defines the format and instructions for the LLM, and its input variables are placeholders filled in at run time. In this case, the prompt tells the LLM to answer the user query; the input variable is input, and the prompt must also include an agent_scratchpad placeholder, which the OpenAI tools agent requires in order to keep track of its intermediate tool calls.
- Creating an agent object using the create_openai_tools_agent function. This function creates an agent that can use tools and prompts with an OpenAI model. It takes three parameters: the chat model object, the list of tools, and the prompt object.
- Creating an agent executor object using the AgentExecutor class. This class allows you to run and monitor the agent. The object takes three parameters: the agent object, the list of tools, and the verbosity level.
- Invoking the agent executor with a dictionary containing the user input. The dictionary has one key, input, whose value is the user query. The agent executor returns a response from the LLM based on the input, the tools, and the prompt.
Here is an example of the code that implements these steps:
from langchain.chat_models import ChatOpenAI
from langchain.tools import tool
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain.tools.retriever import create_retriever_tool
import os

# Read the API key from the environment rather than hard-coding it.
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]

gpt4_llm = ChatOpenAI(temperature=0.3, openai_api_key=OPENAI_API_KEY,
                      model='gpt-4-1106-preview',
                      verbose=False)

@tool
def custom_tool(query: str) -> str:
    """Describe here what the tool does; the LLM reads this
    description to decide whether to call the tool."""
    # Replace this placeholder with the tool's real logic.
    return f"custom_tool result for: {query}"

# db is assumed to be an existing vector store (e.g., FAISS or Chroma);
# db.as_retriever() turns it into a retriever the tool can use.
retriever_tool = create_retriever_tool(
    db.as_retriever(),
    name="name_of_the_tool",
    description="""Here I usually use the prompt template for the RAG""",
)

tools = [custom_tool, retriever_tool]

# The OpenAI tools agent requires an agent_scratchpad placeholder,
# which it uses to record its intermediate tool calls.
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the user query."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_openai_tools_agent(gpt4_llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
response = agent_executor.invoke({"input": "Your question here?"})
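The code above assumes that a vector store db has already been built and indexed. As a rough sketch of how that might look, here is one way to build it; the file path, chunk sizes, and choice of FAISS are illustrative assumptions, not part of the recipe above:

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Load and chunk the source documents ("docs.txt" is a placeholder path).
documents = TextLoader("docs.txt").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Embed the chunks and index them in a FAISS vector store.
db = FAISS.from_documents(chunks, OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY))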
What are the difficulties of using RAG with custom tools?
Using RAG with custom tools can pose some challenges, such as:
- Finding the right balance between the retriever and the generator. The retriever should provide enough information to the generator, but not so much that it overwhelms or distracts it; the generator should use the retrieved information wisely, but not blindly copy or paraphrase it. The two components should work in harmony to produce a coherent and informative response (one simple lever for this is shown in the sketch after this list).
- Ensuring the quality and reliability of the external data sources. The external data sources should be authoritative, up-to-date, and relevant to the input query and the domain of the application. The data sources should also be compatible with the LLM and the tools, and provide the data in a format that can be easily processed and consumed by the agent.
- Evaluating the performance and accuracy of the RAG model. Generative models are notoriously hard to evaluate with traditional metrics such as accuracy, precision, recall, or F1-score. A newer approach is to use language models themselves as evaluators (a sketch follows this list), alongside model-based measures such as perplexity or log-likelihood. However, these metrics are not perfect and may not capture the nuances and subtleties of natural language generation.
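Picking up the first point above: a simple, concrete way to tune the retriever side of the balance in LangChain is to cap how many documents it returns. This assumes the same db vector store as in the earlier code:

# Fetch only the 4 most similar chunks instead of everything that
# loosely matches, so the generator is not flooded with context.
retriever = db.as_retriever(search_kwargs={"k": 4})
retrieved_docs = retriever.get_relevant_documents("Your question here?")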
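And for the evaluation point, here is a rough, hedged sketch of the LLM-as-judge idea, reusing the gpt4_llm object from the earlier code; the grading rubric in the prompt is an ad-hoc assumption, not a standard metric:

def judge_answer(question: str, context: str, answer: str) -> str:
    # Ask the model to grade the answer's faithfulness on a 1-5 scale.
    judge_prompt = (
        "On a scale of 1 to 5, how faithful is this answer to the context?\n"
        f"Context: {context}\nQuestion: {question}\nAnswer: {answer}\n"
        "Reply with a single number."
    )
    return gpt4_llm.invoke(judge_prompt).content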
Conclusion
In this blog post, I have shown you how to use LangChain to build a RAG model that can retrieve relevant information from an external knowledge source before generating a response. I have also discussed some of the difficulties and challenges of using RAG with custom tools. I hope you have found this post useful and informative. If you have any questions or feedback, please feel free to leave a comment below. Thank you for reading! 😃