How langchain agent works internally (Trace by using LangSmith)

Terry Cho
7 min readFeb 5, 2024


This time, using create_react_agent as an example, we will take a detailed look at how an agent operates internally, and also learn how to monitor and trace the agent’s internal operations using the LangSmith monitoring tool.

Registering for LangSmith

LangSmith is an online tool created by LangChain for monitoring, testing, and deploying LLM applications. It will be covered in detail in a separate chapter; in this chapter we briefly look at the monitoring trace feature. With it, you can monitor in detail how branching occurs inside the agent and how the agent makes its decisions.

To use LangSmith, you must first sign up at https://www.langchain.com/langsmith

After signing up and entering the dashboard, open the Projects menu on the left, then create a new project with the “New Project” button on the right.

<Figure. Projects screen>

When creating a project, enter the project name and description.

<Figure. Project creation screen>

Once the project is created, create an API key from the menu at the bottom left, as shown below. This API key is used by the LLM application to send metric information to LangSmith for monitoring.

<Figure. API key creation screen >

Example code

Let’s look at the example code. The first part is almost identical to the previous example: set the LANGCHAIN-related environment variables using os.environ.

LANGCHAIN_API_KEY is the API key created earlier in the LangSmith console, and LANGCHAIN_PROJECT is the name of the LangSmith project created earlier.

from langchain.llms.openai import OpenAI
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain_core.prompts import PromptTemplate
from langchain.agents import AgentExecutor, create_react_agent
import os
os.environ["LANGCHAIN_TRACING_V2"]="true"
os.environ["LANGCHAIN_ENDPOINT"]="https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"]="{YOUR_LANGSMITH_APIKEY}"
os.environ["LANGCHAIN_PROJECT"]="{YOUR_LANGSMITH_PROJECTNAME}"
os.environ["OPENAI_API_KEY"] = "{YOUR_OPENAI_KEY}"
os.environ["SERPER_API_KEY"] = "{YOUR_SERPER_APIKEY}"
model = OpenAI()
google_search = GoogleSerperAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=google_search.run,
        description="useful for when you need to ask with search",
        verbose=True
    )
]
template = '''Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}'''
prompt = PromptTemplate.from_template(template)
search_agent = create_react_agent(model, tools, prompt)
agent_executor = AgentExecutor(
    agent=search_agent,
    tools=tools,
    verbose=True,
    return_intermediate_steps=True,
)
response = agent_executor.invoke({"input": "Where is the hometown of the 2007 US PGA championship winner and his score?"})
print(response)

Prompt

Unlike the initialize_agent method in previous versions, create_react_agent requires you to directly define the prompt to be used by the agent. A sample prompt for agents can be obtained from the LangChain documentation page below.

https://api.python.langchain.com/en/latest/agents/langchain.agents.react.agent.create_react_agent.html

template = '''Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}'''

The variables that must be passed as arguments to this prompt are as follows.

  • tools: the list of available tools, with each tool’s name and a description of what it does and what input it takes.
  • input: the question entered by the user.
  • tool_names: the names of the tools that the agent can use.
  • agent_scratchpad: the agent does not obtain the desired answer with a single tool call; it calls tools repeatedly until it has enough information. Each time a tool is called, the previous Thought/Action/Action Input and the resulting Observation are appended to this field.

The variable values in the prompt are filled in automatically, so there is nothing to modify there. For the prompt itself, you can simply use the example that has already been written, as shown below; if you want to improve the agent’s performance, you can modify this prompt.
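For example, instead of writing the template by hand, the published ReAct prompt can be pulled from the LangChain Hub. This is a minimal sketch, assuming the langchainhub package is installed; the "hwchase17/react" prompt is the hub version of the template shown above.

from langchain import hub

# Pull the published ReAct prompt instead of writing the template by hand.
# It declares the same variables: {tools}, {tool_names}, {input}, {agent_scratchpad}.
prompt = hub.pull("hwchase17/react")
print(prompt.template)  # essentially the same template text as above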

The prompt guides the agent’s operation mechanism.

Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)

At the Thought stage, the agent uses the LLM to reason about what to do next to answer the question. Action decides which tool to use, Action Input is the query that will be sent to that tool, and finally Observation holds the result returned by the tool called in Action.

Through this sequence, the Agent executes [Thought → Action → Action Input → Observation], saves the result in agent_scratchpad, and repeats this process until the desired answer is obtained.
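To make this concrete, below is a hypothetical illustration (not actual model output) of what the text substituted into {agent_scratchpad} might look like after one iteration; the exact wording depends on the LLM.

# Hypothetical agent_scratchpad contents after one Thought/Action/Action Input/
# Observation cycle; the next LLM call sees this text appended to the prompt.
agent_scratchpad = """I need to find out who won the 2007 US PGA championship.
Action: Intermediate Answer
Action Input: 2007 US PGA championship winner
Observation: Tiger Woods won the 2007 PGA Championship.
Thought:"""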

Once you have created a prompt, create an agent using this prompt, and create and call an agent executor using this agent.

The agent decides what to do with an incoming question and executes a tool, but something has to run the agent through this process repeatedly until an answer is obtained. The agent executor plays this role, as sketched below.
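The following is a minimal conceptual sketch of the loop the AgentExecutor runs, not the actual library implementation. AgentAction and AgentFinish are real LangChain types; run_agent and tools_by_name are illustrative names introduced here.

from langchain_core.agents import AgentAction, AgentFinish

# Conceptual sketch of the AgentExecutor loop (simplified, not the real implementation).
def run_agent(agent, tools, question):
    tools_by_name = {tool.name: tool for tool in tools}
    intermediate_steps = []  # list of (AgentAction, observation) pairs
    while True:
        # Ask the agent (LLM + prompt) what to do next, given everything so far.
        next_step = agent.plan(intermediate_steps=intermediate_steps, input=question)
        if isinstance(next_step, AgentFinish):
            # The LLM emitted "Final Answer: ..." -> we are done.
            return next_step.return_values["output"]
        # Otherwise it is an AgentAction: run the chosen tool with the Action Input.
        observation = tools_by_name[next_step.tool].run(next_step.tool_input)
        # This (action, observation) pair becomes part of agent_scratchpad next round.
        intermediate_steps.append((next_step, observation))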

Agent execution detailed structure

Let’s diagram this execution structure graphically.

<Figure. Agent detailed operation principle>

When a question comes in, it is inserted into the agent prompt created earlier and delivered to the agent.

To answer the question, the agent follows the guide in the agent prompt and, through the Thought process, creates the question it needs to ask (Question 1, the Action Input). It calls the selected Tool (the Action), receives the answer (Answer 1, the Observation), and appends that answer to the agent_scratchpad of the existing agent prompt.

The agent executor passes this updated agent prompt back to the agent, which creates Question 2 in the same way, receives Answer 2, and likewise appends it to the existing agent prompt.

When the agent executor passes this prompt to the agent again and enough information for the answer has been collected, the agent generates the final answer based on the information in the prompt and returns it.
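Because return_intermediate_steps=True was set on the AgentExecutor above, the dictionary returned by invoke() contains not only the final answer but also every (action, observation) pair collected along the way, so you can follow the same loop in code:

# response comes from agent_executor.invoke(...) above.
print(response["output"])  # the Final Answer text
for action, observation in response["intermediate_steps"]:
    print("Tool:", action.tool)                # e.g. "Intermediate Answer"
    print("Action Input:", action.tool_input)  # the query sent to the tool
    print("Observation:", observation)         # what the tool returned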

Understanding the Call Process with LangSmith

Now that you understand the operating principle, let’s trace in LangSmith the process by which the agent arrived at the answer in this example.

After accessing the LangSmith console, go to the Projects menu on the left.

<Figure. LangSmith’s Projects menu>

Select the Project created earlier from the menu.

After entering the project, select the Traces menu as shown below to view the records of agent calls. If you click the call made during the test, you can see the full contents of the AgentExecutor call, as shown on the right in the picture below.

<Figure. Traces in the Project menu>

Let’s look at the first OpenAI call from Trace.

<Figure. Details of the first OpenAI LLM call in Traces>

If you look at the right side, the {tools} section defines the Google search tool we use under the name “Intermediate Answer”, along with the purpose of this tool.

<Figure. Some details of the prompt>

Looking at the bottom of the prompt, to find the answer to this question the agent thinks “Find the 2007 US PGA winner” through the Thought process. It uses the search tool “Intermediate Answer” as the Action, and you can see that the search keyword “2007 US PGA championship winner” is used as the Action Input.

To check what information was returned for this query, open the Intermediate Answer entry in the left menu, as shown below.

<Figure. Intermediate Answer>

We can confirm that the winner in 2007 was Tiger Woods.

Part of the prompt from the second OpenAI call is shown below.

<Figure. Second OpenAI call prompt>

If you look at the upper box, you can see the contents of the previous OpenAI call: the Thought shows what reasoning took place, the Action shows which tool was used, the Action Input shows the search keywords sent to that tool, and the Observation contains the search results.

The lower box shows that, based on the previous calls, the Thought is that it now needs to find “Tiger Woods’ hometown”, and that it will use the Intermediate Answer tool to do so.

Let’s look at the final OpenAI call.

<Figure. Final OpenAI call>

As shown in the picture above, a total of three actions are called, and the [Thought, Action, Action Input, Observation] for each call is included. In the final call, you can confirm that the agent has found the desired answer and produced the Final Answer.
