Tracing How LangChain Works Behind the Scenes

Kelvin Lu
7 min read · Jun 26, 2023


Have you ever wondered how LangChain works its magic? When you ask LangChain a question, it seems to just know the answer. But how does it do that? Is the answer trustworthy, or is it just a hallucination? How can you tell whether a result comes from prudent reasoning or a devilish joke?

René Descartes

Four hundred years ago, French mathematician and philosopher René Descartes worried that if the foundations of knowledge were not completely solid, anything built upon them would inevitably collapse. He thus decided that if there was reason to doubt the truth of something, no matter how slim the doubt, then it should be discarded as false. Descartes’s method of doubt was radical, but it led him to a famous conclusion: “I think, therefore I am.” The only thing that Descartes could be certain of was his own existence, because even if he doubted everything else, he still had to exist in order to do the doubting.

The old wisdom of Descartes’s method of doubt is still relevant in the era of generative AI (GAI). We know that GAI can hallucinate, and we don’t want to fully trust the machine. Therefore, we should be skeptical of GAI’s thoughts and outputs, and we should verify its information. By doing so, we can ensure that we are using GAI responsibly and safely.

Tracing is a way to see the inner workings of a GAI model. By tracing LangChain, you can see how it arrives at its answers. This will help you understand how the model works and trust the answers it gives you. In this post, I’ll show you how to trace LangChain in two different ways.

The Baseline Test Case

We have designed a simple function that can solve a complex problem: how long does it take for the fastest golf ball to travel around the Earth?

The function is composed of two parts: the SERP API and LLM_Math. LLM_Math is a wrapper of ChatGPT that can do simple math calculations, while the SERP API is an interface that allows Google search.

When the function is asked a question, it first tries to find the answer in its knowledge base. If it can’t find the answer, it then uses the SERP API to search Google for supporting information. If it finds the information on Google, it then uses LLM_Math to calculate the final answer.

Here is a code snippet that shows how the function works:

from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI


def search_agent_demo():
    llm = OpenAI(temperature=0)

    tools = load_tools(["serpapi", "llm-math"], llm=llm)
    agent = initialize_agent(
        tools, llm, agent="zero-shot-react-description", verbose=True
    )
    return agent.run(
        "How long does the fastest golf ball travel around the Earth?\n"
        " - Do you know the speed of the fastest golf ball? Find it out.\n"
        " - You also need to figure out the circumference of the Earth as well.\n"
        " - Final answer should be in hours."
    )


search_agent_demo()

The code calls the OpenAI and SERP APIs. To run it, you will need to set up two environment variables: OPENAI_API_KEY and SERPAPI_API_KEY. You can generate your own API keys on the OpenAI (https://beta.openai.com/account/api-keys) and SerpAPI (https://serpapi.com/) websites.
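One way to provide the keys is to set them in Python before creating the agent. The values below are placeholders, not real keys:

```python
import os

# Placeholder values -- replace with your own keys before running the agent.
os.environ["OPENAI_API_KEY"] = "sk-your-openai-key"
os.environ["SERPAPI_API_KEY"] = "your-serpapi-key"
```

Exporting the variables in your shell profile works just as well and keeps keys out of your source code.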

The following is an example of the output of the code:

Running result

We turned on the verbose agent run mode to see the steps of the agent’s thoughts. The output showed that the agent first checked its knowledge base to see if it knew the speed of the fastest golf ball and the circumference of the Earth. When it didn’t find the answer, it used the SERP API to search Google for the information. It then calculated the answer and outputted it.

The entire process looked solid, and the results looked good. The agent could find the information it needed and calculate the answer correctly. However, not all experiments will run as smoothly as this one.

In some cases, the agent may stray from the correct path. In these cases, the verbose mode may not produce enough information to help us understand why the agent went astray. That’s why we need tracing to zoom in and see what the agent is doing at each step.

Tracing with LangChain Built-in Function

One way is to use LangChain's native tracing support. LangChain, a very popular and rapidly developing framework, does support tracing. However, the documentation for tracing is incomplete. I hope the following guidelines can save you some time.

Step 0: prepare dependencies

LangChain tracing uses a web server to collect agent-run information. The server listens on port 8000 for trace information and serves the user interface on port 4173. It runs as a Docker container. Before we go any further, please make sure that langchain and Docker are installed and that the docker-compose command is executable.
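A quick way to verify these prerequisites without starting anything is a small helper like the one below (the function name check_tracing_prereqs is just for illustration):

```python
import shutil
from importlib.util import find_spec


def check_tracing_prereqs():
    """Return a dict mapping each prerequisite to whether it is available.

    find_spec checks for the Python package without importing it;
    shutil.which checks that the CLI tools are on the PATH.
    """
    return {
        "langchain": find_spec("langchain") is not None,
        "docker": shutil.which("docker") is not None,
        "docker-compose": shutil.which("docker-compose") is not None,
    }


print(check_tracing_prereqs())
```

If any entry is False, install the missing piece before starting the tracing server.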

Step 1: start the tracing server

Once the environment is ready, just run the following command to start the tracing server:

python -m langchain.server

This command will build a langchain server container and spin it up.

Running LangChain Server

Step 2: enable tracing and run the payload

There are two ways to enable tracing. The first option is to use the environment variable:

os.environ["LANGCHAIN_TRACING"] = "true"

The second way is to skip the environment variable and wrap your code snippet in a with statement instead. In the tracing_enabled() call, you can optionally set the session name:

from langchain.callbacks import tracing_enabled

with tracing_enabled("sess_test_tracing") as session:
    assert session
    search_agent_demo()

Step 3: inspect the trace information

If we apply either option and re-run the code, we can open the web UI at the following address:

http://localhost:4173/sessions
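Before opening the browser, you can sanity-check that the UI port is reachable. The helper below is illustrative, not part of LangChain; it simply attempts an HTTP request against the sessions page:

```python
import urllib.request


def tracing_ui_up(url="http://localhost:4173/sessions", timeout=2):
    """Return True if the tracing UI responds with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, DNS failure, and timeouts.
        return False
```

If this returns False, the Docker container from Step 1 is probably not running.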

The web UI looks like the following:

When we invoked tracing_enabled(), we optionally provided a session name, sess_test_tracing; otherwise, all tracing information would go into the default session. The tracing server doesn't automatically detect session names, so we have to manually create a new session named 'sess_test_tracing' to match the name in the code.

When we open the session, we can find the tracing details:

If we click one of the 'Explore' buttons, we can see the full prompt for each interaction. For example:

Answer the following questions as best you can. You have access to the following tools:

Search: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.
Calculator: Useful for when you need to answer questions about math.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: How long does the fastest golf ball travel around the Earth?
- Do you know the speed of the fastest golf ball? Find it out.
- You also need to find out the circumference of the Earth as well.
- Final answer should be in hours.
Thought:

Without the help of tracing, we would have to look into the source code to know these details.

Tracing with LangChain Visualizer

Besides LangChain tracing, I would also recommend an alternative: LangChain Visualizer. The reasons are that LangChain tracing requires running in Docker, and that its UI functions are not well refined.

Step 0: installation

pip install langchain-visualizer

Step 1: change the code

LangChain visualizer requires the payload to be wrapped in an async function.

# from langchain_visualizer import visualize  # use this when not run in Jupyter
from langchain_visualizer.jupyter import visualize
from langchain.agents import initialize_agent, load_tools
from langchain.llms import OpenAI


async def search_agent_demo():
    llm = OpenAI(temperature=0)
    tools = load_tools(["serpapi", "llm-math"], llm=llm)
    agent = initialize_agent(
        tools, llm, agent="zero-shot-react-description", verbose=True
    )
    return agent.run(
        "How long does a plane travel around the Earth at sonic speed?"
        " - Remember the unit of circumference of the Earth is in kilometer while the unit of sound speed is meter per second."
        " - You also need to convert the kilometers into meters in your calculation and the final answer should be in hours."
        " - Do your reasoning independently."
    )


# you don't have to specify width and height,
# but if you do, you can change the size of the rendered window
visualize(search_agent_demo, width=1000, height=500)

The above code snippet runs in a Jupyter Notebook. The result looks like the following; you can also open the link in a web browser to view the same outcome.

I found LangChain Visualizer to be more elegant, and I liked the cost estimation it produced for each call.

Conclusion

Generative AI (GAI) is still in its early stages of development. However, I believe that we will see more and more sophisticated applications of GAI in the near future. As these applications become more complex, it will become increasingly important for us to understand how they work.

In addition to debugging, tracing can be used to improve the performance of GAI systems. By identifying areas where the system can be improved, we can make it more efficient and effective.

More importantly, tracing can also be used to build trust with users. By showing users how the system works, we can help them understand that the system is not making decisions in a black box.

In conclusion, tracing is a valuable tool for engineers who are working with GAI systems. I hope you find spending time on it worthwhile.
