Automating Chatbots: LLM Agents with Custom Tools

Rossella Longo
Data Reply IT | DataTech
8 min read · May 15, 2024

Can a chatbot be developed to execute multiple tasks automatically based solely on user input? Yes, and it’s quite easy.

Recent advancements in Generative AI have led to a new approach in conceiving Large Language Model (LLM) applications, namely LLM agents. These agents involve LLM applications that perform complex tasks through an architecture combining LLMs with various key modules such as memory, planning and tools. The fundamental idea behind LLM agents is that an LLM serves as the “brain” controlling a flow of operations required to fulfill a user request.

In this article, we will learn how LLM-based agents operate and explore a brief practical implementation using GPT, LlamaIndex tools, and the ReAct planning module.

What are LLM-based Agents? First, let’s explore their structure and functioning

An “agent” functions as an automated reasoning and decision-making mechanism within a computational framework. Its primary role involves processing user input/queries and autonomously determining the optimal course of action to generate accurate outcomes.

This means that, starting from a single textual input, the agent not only interprets the text and produces a response, but also autonomously performs actions, emulating human decision-making.

The core capabilities of an agent may include:

  • Breaking down complex queries into manageable segments
  • Choosing and configuring external tools for task execution
  • Planning task sequences for efficient completion

As depicted in the figure below, an LLM-based agent primarily consists of four components:

Structure

Figure: LLM-based agent structure
  1. User Request: the input provided by the user, typically in the form of questions, commands, or requests for information; it acts as the initial stimulus that triggers the agent’s response.
  2. Agent/Brain: serves as the central component of the LLM framework. It is responsible for interpreting and analyzing user requests and coordinates the interaction between the different components of the framework. It continuously learns and improves its capabilities through interaction with users and access to external knowledge sources.
  3. Planning: aids the agent in determining the sequence of actions required to fulfill user requests. It involves analyzing the current context, understanding user preferences, and anticipating future interactions.
  4. Memory: manages the agent’s past experiences, interactions, and knowledge. It stores relevant information such as previous user queries, responses, and outcomes.

By integrating these core components, an LLM agent framework can effectively process user inputs, generate appropriate responses, and adapt its behavior based on past experiences and future goals.

Let’s delve into the less intuitive components for a clearer understanding

What is meant by “Agents as Brain?”

“Agents as Brain” refers to the concept in which a large language model (LLM) functions as the core cognitive entity within a system. This LLM, similar to a brain, acts as the primary decision-maker and coordinator of various tasks. When triggered by a prompt template, which contains essential information about the task at hand, the LLM processes this input and decides how to proceed. It considers factors such as operational parameters, available tools and specifications for utilizing these tools effectively.

What is meant by “Planning”?

The planning module assists in breaking down the required steps or subtasks that the agent needs to address individually in response to user requests. This key process enhances the agent’s ability to analyze problems effectively and find viable solutions.

Recent advancements have introduced a mechanism enabling the model to continuously refine its execution plan based on past actions and observations. The primary goal is to identify and rectify previous errors, thereby improving overall performance.

Two notable methods for implementing this reflective mechanism are ReAct and Reflexion. ReAct is inspired by the interaction between “acting” and “reasoning”: by integrating reasoning and acting, ReAct enables an LLM to handle complex tasks through iterative steps: Thought, Action and Observation.

ReAct primarily receives environmental feedback in the form of observations, although human and model feedback may also be included. A typical ReAct trace for question answering alternates these steps in sequence, as sketched below.
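To make the Thought/Action/Observation cycle more concrete, here is a minimal, purely illustrative sketch of the loop. The helper propose_next_step is a hypothetical stand-in for a call to the model; real frameworks such as LlamaIndex implement this cycle internally.

```python
# Minimal, purely illustrative sketch of the ReAct loop.
# `propose_next_step` is a hypothetical helper standing in for an LLM call.
def react_loop(propose_next_step, tools, user_request, max_steps=5):
    history = f"Question: {user_request}"
    for _ in range(max_steps):
        # Thought + Action: the LLM reasons about the next step and selects a tool
        thought, action, action_input = propose_next_step(history)
        if action == "final_answer":
            return action_input
        # Observation: execute the chosen tool and feed the result back to the LLM
        observation = tools[action](action_input)
        history += (
            f"\nThought: {thought}"
            f"\nAction: {action}[{action_input}]"
            f"\nObservation: {observation}"
        )
    return "No final answer within the step limit."
```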

What is meant by “Tools”?

Tools represent a range of resources that empower the LLM agent’s interaction with external environments, including but not limited to Wikipedia Search API, Code Interpreter, Apps, Math Engine, etc. This arsenal may also encompass databases, knowledge bases, and external models. As the agent engages with external tools, it executes tasks through workflows crafted to gather observations or essential information necessary for completing subtasks and satisfying user requests.

Let’s say we’re analyzing customer feedback data and need to notify a customer service representative about a specific complaint received on the DDth of month MM. To address this problem, we can develop tailored tools capable of handling various tasks. For instance, one tool could be designed to retrieve the customer’s complaint from the database, while another could automate the process of drafting and sending an email to the representative with details about the issue.
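As a purely hypothetical illustration of such tailored tools (the function names and the mock data source are made up for this example), the two functions might look like this before being wrapped as agent tools:

```python
# Hypothetical tool functions for the customer-feedback scenario above;
# the mock database and all names are illustrative only.
def get_complaint(customer_id: str, date: str) -> str:
    """Retrieve a customer's complaint for a given date from a mock database."""
    mock_db = {("C123", "2024-03-15"): "Delivery arrived damaged."}
    return mock_db.get((customer_id, date), "No complaint found.")

def notify_representative(representative: str, complaint: str) -> str:
    """Draft a notification message for a customer service representative."""
    return f"To {representative}: please follow up on this complaint: '{complaint}'"
```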

Hands-On Fraud Analysis: Automated Chatbot with Agent Support

In this brief practical implementation, we will create an LLM-based chatbot with customized tools. Starting from a single prompt, the LLM will comprehend which tools to employ and how. Specifically, not only will it be possible to interact with the chatbot, but upon request, the LLM will be capable of saving textual notes, compiling Excel files, and analyzing the data under examination.

The dataset used for this example is the following: https://www.kaggle.com/datasets/youssefismail20/fraudsynth-credit-fraud-detection-dataset. The dataset contains synthetically generated information about fraudulent transactions. Our objective will be to engage with the chatbot to perform preliminary exploratory data analysis on the provided data.
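Assuming the Kaggle CSV has been downloaded locally (the file name below is an assumption), the data can be loaded into a pandas DataFrame:

```python
import pandas as pd

# File name is an assumption; adjust it to match the downloaded Kaggle CSV
df = pd.read_csv("fraud_data.csv")
print(df.head())
```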

Let’s structure the work into three separate scripts:

  • Prompt.py: dedicated to the prompt template
  • Engine.py: containing all the customized tools
  • Main.py: encompassing the main chatbot structure

Prompt.py contains general, standard information that is included with each user query. Simply put, these are strict instructions we pass to the LLM to guide the expected output. Here’s an excerpt:

instruction_str = """\

1. Convert the query to executable Python code using Pandas.
2. The final line of code should be a Python expression that can be called with the eval() function.
3. The code should represent a solution to the query.
4. PRINT ONLY THE EXPRESSION.
5. Do not quote the expression.

"""

from llama_index.core import PromptTemplate  # import path may vary with the llama-index version

new_prompt = PromptTemplate(
    """\
You are working with a pandas dataframe in Python.
The name of the dataframe is 'df'.
This is the result of 'print(df.head())':
{df_str}

Follow these instructions:
{instruction_str}
Query: {query_str}
Expression: """
)

As depicted, we establish a template for the prompt to accompany each user query, containing instructions for converting the query into executable Python code using Pandas.

Moving forward, Engine.py serves as the project’s backbone, housing the tools utilized by the LLM to address queries, should the need arise. These tools, constructed using LlamaIndex’s FunctionTool, allow any function to be turned into a tool.

Next, we define our functions: save_note, which verifies the existence of a note file, creates one if absent, then opens the note file and appends the note provided as an argument. Similarly, create_excel generates an Excel file and writes the data passed in the columns and rows variables.
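The function bodies are not shown in the excerpt; a minimal sketch of how they might look (file names, signatures, and the use of openpyxl are assumptions) is the following:

```python
import os
from openpyxl import Workbook

def save_note(note: str, file_path: str = "notes.txt") -> str:
    """Append a text note to the notes file, creating the file if it does not exist."""
    if not os.path.exists(file_path):
        open(file_path, "w").close()
    with open(file_path, "a") as f:
        f.write(note + "\n")
    return "Note saved."

def create_excel(columns: list, rows: list, file_path: str = "analysis.xlsx") -> str:
    """Write column headers and data rows to a new Excel file."""
    wb = Workbook()
    ws = wb.active
    ws.append(columns)  # header row
    for row in rows:
        ws.append(row)  # data rows
    wb.save(file_path)
    return "Excel file created."
```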

At this point, we employ FunctionTool from the LlamaIndex library to convert our functions into tools, as follows:

from llama_index.core.tools import FunctionTool  # import path may vary with the llama-index version

note_engine = FunctionTool.from_defaults(
    fn=save_note,
    name="note_saver",
    description="This tool can save a text based note to a file for the user to reference later.",
)

excel_engine = FunctionTool.from_defaults(
    fn=create_excel,
    name="data_compiler",
    description="This tool can generate an Excel file for users to store conducted analyses.",
)

Then we proceed to define the agent using ReAct, which allows us to track the reasoning and the actions taken by the LLM.

We instantiate the Query Engine, which is the interface enabling interaction with the data, and then wrap it within a QueryEngineTool. Next, we instantiate the model, using GPT-3.5 Turbo.
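The query engine itself is not shown in the excerpt; a minimal sketch is below (the import path and exact parameters may differ across llama-index versions):

```python
# Import path varies across llama-index versions
from llama_index.experimental.query_engine import PandasQueryEngine

# df, instruction_str and new_prompt come from the previous steps
fraud_query_engine = PandasQueryEngine(df=df, verbose=True, instruction_str=instruction_str)
fraud_query_engine.update_prompts({"pandas_prompt": new_prompt})
```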

# Import paths may vary with the llama-index version
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

tools = [
    note_engine,
    excel_engine,
    QueryEngineTool(
        query_engine=fraud_query_engine,
        metadata=ToolMetadata(
            name="fraud_data",
            description="This tool gives information about fraud transactions and details about specified Locations.",
        ),
    ),
]

llm = OpenAI(model="gpt-3.5-turbo")
agent = ReActAgent.from_tools(tools, llm=llm, verbose=True, context="")

We are ready to pass our prompt to the LLM:

"Calculate the number of fraudulent users. Then, calculate the maximum amount, the minimum amount, and the average amount of fraudulent transactions grouped by Location and save the result in an excel file. Finally, leave me a note to remind me to check the percentage of non-fraudulent users in Chicago"
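In Main.py, passing this prompt to the agent can be as simple as the following (a minimal sketch; an interactive input loop or agent.chat() would work equally well):

```python
prompt = (
    "Calculate the number of fraudulent users. Then, calculate the maximum amount, "
    "the minimum amount, and the average amount of fraudulent transactions grouped by "
    "Location and save the result in an excel file. Finally, leave me a note to remind "
    "me to check the percentage of non-fraudulent users in Chicago"
)

# With verbose=True, each Thought/Action/Observation step is printed as the agent runs
response = agent.query(prompt)
print(response)
```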

The initial request involves calculating the number of fraudulent users. Thanks to ReAct, we can follow the thought process undertaken by the LLM and the action it autonomously decides to pursue: in this case, extracting all records where is_fraud equals one, indicating fraud.

Subsequently, we request to compute the maximum amount, minimum amount, and average amount of fraudulent transactions grouped by location, and save the result in an Excel file.

In this case as well, we clearly see the thought process and the action the LLM decides to undertake. First, it performs the requested calculation, and then it uses the Excel saver tool to save the result in a new Excel file. Below is a portion of the generated Excel file.

Lastly, we requested to save a text note to remind us to check the percentage of non-fraudulent users in Chicago.

Just as in the other cases, we observe the thought process and action taken by the LLM, as well as its awareness of having fulfilled all user requests and thus not needing to proceed further. Below, we can see the textual note automatically generated by the LLM.

CONCLUSIONS

We’ve seen a simple and brief implementation of agents, which can interact with the surrounding environment, make autonomous decisions, and even learn from past experiences. The possibilities offered by agents in the landscape of artificial intelligence are vast and promising. They can be employed across a wide range of sectors: from robotics to data analytics, from industrial automation to autonomous driving. Their adaptability and learning capabilities make them powerful tools for tackling complex and dynamic challenges. Ultimately, agents represent a fundamental element in the advancement of AI and promise to revolutionize multiple industries with their innovative capabilities.
