LLM Study Diary: Comprehensive Review of LangChain — Part 4

Today, we’ll review the remaining chapters of Greg Kamradt’s “The LangChain Cookbook — 7 Core Concepts” to the very end!

Indexes — Structuring documents for LLMs to work with them

Document Loaders

This part of the original code worked without any changes. However, the reference links to various document loaders were broken, so I searched for them. Below is a list of currently supported document loaders.

URLs and webpages

Installation of the unstructured module was necessary.

!pip install unstructured

If you have this module installed, the original code will work.

After this, when I ran the code below, text from Paul Graham’s site was output correctly.

from langchain.document_loaders import UnstructuredURLLoader

urls = [
"http://www.paulgraham.com/",
]

loader = UnstructuredURLLoader(urls=urls)

data = loader.load()

data[0].page_content
'New: How to Start Google | Best Essay | Superlinear Want to start a startup? Get funded by Y Combinator . © mmxxiv pg'
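
By the way, each Document returned by the loader also carries metadata, so you can check which URL the content came from (a small check I added; it isn’t in the Cookbook):

data[0].metadata
# expected to look something like {'source': 'http://www.paulgraham.com/'}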

Retrievers

Installation of faiss is necessary. If you’re only using CPU, you can install it with:

!pip install faiss-cpu

(If you have a machine with an Nvidia GPU, you can use: !pip install faiss-gpu)

Next, I changed the import statement for OpenAIEmbeddings, just as I did in Part 2 of this series.

#from langchain.embeddings import OpenAIEmbeddings
from langchain_openai import OpenAIEmbeddings

I also replaced the deprecated get_relevant_documents with invoke.

# docs = retriever.get_relevant_documents("what types of things did the author want to build?")
docs = retriever.invoke("what types of things did the author want to build?")

With these two changes, the original code now works, and I was able to get responses from the LLM using the vector store.
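
For context, here is roughly what the resulting retriever flow looks like with both changes applied. This is only a minimal sketch, and the local file name is my own placeholder rather than the Cookbook’s.

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
# from langchain.embeddings import OpenAIEmbeddings
from langchain_openai import OpenAIEmbeddings

# Load a local copy of the essay and split it into chunks
loader = TextLoader("paul_graham_essay.txt")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Build a FAISS index over the chunks and expose it as a retriever
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
db = FAISS.from_documents(docs, embeddings)
retriever = db.as_retriever()

# invoke() replaces the deprecated get_relevant_documents()
docs = retriever.invoke("what types of things did the author want to build?")
print(docs[0].page_content[:200])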

VectorStores

This one also works just by changing the import source for the OpenAIEmbeddings class, as shown below.

#from langchain.embeddings import OpenAIEmbeddings
from langchain_openai import OpenAIEmbeddings
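
To confirm the new import works, you can also call the embedding methods directly. The example texts below are my own, not the Cookbook’s.

# from langchain.embeddings import OpenAIEmbeddings
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# Embed a few documents and a query; each result is a list of floats
doc_vectors = embeddings.embed_documents(["I like dogs", "I like cats"])
query_vector = embeddings.embed_query("Which pets do I like?")
print(len(doc_vectors), len(query_vector))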

Memory

The original code works as is. The link to the list of the various Memory types in LangChain was broken, so I looked it up; it’s available below.
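
For reference, the kind of chat-message memory covered there looks roughly like this. This is just a minimal sketch, and the messages are my own examples.

from langchain.memory import ChatMessageHistory
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)

# Build up a message history by hand
history = ChatMessageHistory()
history.add_ai_message("hi!")
history.add_user_message("what is the capital of france?")

# Feed the accumulated messages back to the model and store its reply
ai_response = chat.invoke(history.messages)
history.add_ai_message(ai_response.content)
print(history.messages)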

Chains

Because SimpleSequentialChain and LLMChain have been deprecated, I had to rewrite this part extensively. The deprecation notice below says to use RunnableSequence instead of LLMChain, so I decided to follow that guidance.

At first I was at a loss because I couldn’t find a way to combine two or more RunnableSequences, but when I looked through the LangChain documentation, I found a page called “Multiple chains”, shown below.

Using the sample written here as a reference, I tried combining two ChatPromptTemplates through StrOutputParser, and I was able to create a chain that works the same way as in the Cookbook.

from operator import itemgetter

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt1 = ChatPromptTemplate.from_template(
"""Your job is to come up with a classic dish from the area that the users suggests.
% USER LOCATION
{user_location}

YOUR RESPONSE:
"""
)

prompt2 = ChatPromptTemplate.from_template(
"""Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL
{user_meal}

YOUR RESPONSE:
"""
)

model = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)

# chain1: user_location -> a dish name, returned as a plain string
chain1 = prompt1 | model | StrOutputParser()
# chain2: wrap chain1's output as {"user_meal": ...} and feed it into prompt2
chain2 = {"user_meal": chain1} | prompt2 | model | StrOutputParser()

chain2.invoke({"user_location": "Paris"})

I created chain1 by piping prompt1 into the model and then into StrOutputParser: the prompt is filled in and sent to the model, and the resulting AIMessage is converted to a string by StrOutputParser. For chain2, the output of chain1 is wrapped into an input of the form {“user_meal”: “chain1 output”}, which is then piped through prompt2 → model → StrOutputParser. When I invoked the completed chain with “Paris” as the user_location, the result was as follows:

‘To make Coq au Vin at home, start by browning bacon in a large pot. Remove the bacon and brown chicken pieces in the bacon fat. Add chopped onions and mushrooms, then pour in red wine and chicken broth. Simmer for about an hour until the chicken is cooked through and the sauce is thickened. Serve hot with crusty bread or over mashed potatoes. Enjoy your homemade Coq au Vin!’

The output of chain1 is “Coq au Vin”, and chain2 takes that as input and explains how to make it. Coq au Vin was a dish I knew nothing about; it seems to be rooster stewed in red wine. Since the dish name meant nothing to me, I next tried inputting Osaka as the user_location:

chain2.invoke({"user_location": "Osaka"})

As shown below, it gave me instructions on how to make okonomiyaki. With this, I finally felt that it was working properly.

‘To make Okonomiyaki at home, start by mixing together 1 cup of flour, 1 cup of grated yam, 2 eggs, and 1 cup of shredded cabbage in a bowl. Add in your choice of toppings such as pork, shrimp, or vegetables. Heat a griddle or non-stick pan and pour the batter onto the hot surface, shaping it into a pancake. Cook for a few minutes on each side until golden brown. Serve the Okonomiyaki with a drizzle of sweet and savory sauce, mayonnaise, and a sprinkle of bonito flakes on top. Enjoy your homemade Okonomiyaki!’
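
If you want to see the intermediate dish name that gets passed into prompt2, you can also invoke chain1 on its own (a small check I added, not part of the Cookbook):

# Run only the first half of the pipeline to inspect the intermediate output
meal = chain1.invoke({"user_location": "Osaka"})
print(meal)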

Summarization Chain

This part of the Cookbook’s code worked without any changes, so I’ll skip it.
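
Just for reference, the general pattern used in that part is a map_reduce-style summarization along the following lines. This is only a sketch, and the file name is my own placeholder.

from langchain.llms import OpenAI
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

# Load and split a long text, then summarize the chunks with map_reduce
loader = TextLoader("paul_graham_essay.txt")
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)
docs = text_splitter.split_documents(documents)

chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
print(chain.run(docs))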

Agents

First, I needed to install the google-search-results module.

!pip install google-search-results

I obtained the SERP_API_KEY from the link below.

I stored the key in the SERP_API_KEY environment variable.

from dotenv import load_dotenv
import os

# Read the .env file and pull the SerpAPI key from the environment
load_dotenv()

serp_api_key = os.getenv('SERP_API_KEY', 'your_api_key')

The biggest change was that the initialize_agent function had become deprecated, which meant I had to fundamentally change how the Agent was called.

from langchain.agents import load_tools
# from langchain.agents import initialize_agent
from langchain.llms import OpenAI
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
toolkit = load_tools(["serpapi"], llm=llm, serpapi_api_key=serp_api_key)
# agent = initialize_agent(toolkit, llm, agent="zero-shot-react-description", verbose=True, return_intermediate_steps=True)

# Pull the public ReAct prompt from the LangChain hub
prompt = hub.pull("hwchase17/react")

# "model" is the ChatOpenAI instance created in the Chains section above
agent = create_react_agent(model, toolkit, prompt)
agent_executor = AgentExecutor(agent=agent, tools=toolkit, verbose=True)
agent_executor.invoke({"input": "what was the first album of the"
                       " band that Natalie Bergman is a part of?"})

Since the original code was creating a “zero-shot-react-description” agent, a memory-less agent based on the ReAct framework, I rewrote the code referring to the information below.

I wasn’t familiar with the term ReAct, but as explained below, it seems to be an approach to building agents that interleaves reasoning and action, using the feedback from each action to drive the next reasoning step.

The prompt used in this sample is publicly available on LangSmith, and I was able to view its contents at the link below.
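
If you prefer to inspect it locally, you can also just print the pulled prompt object (using the hub.pull call from the code above):

# Show the ReAct prompt text and the input variables it expects
print(prompt.template)
print(prompt.input_variables)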

When I ran this agent in verbose mode, I could see a reasoning process like the one below unfolding.

The final answer turned out like this:

{'input': 'what was the first album of the band that Natalie Bergman is a part of?', 
'output': 'Wild Belle'}

The original question was “what was the first album of the band that Natalie Bergman is a part of?”, but the final answer ended up being the name of the band she belongs to (Wild Belle), not the album name. Looking at the logs, I can see that the agent misunderstood the question at some point during its reasoning.

Mr. Kamradt also mentioned in his YouTube video that agent technology is still evolving. Even so, reading the logs to see how the reasoning unfolds seems genuinely useful when working with LangChain, so I’d like to find another opportunity to look at various agents in the future.

With this, I’ve managed to get all the recipes from Mr. Kamradt’s “The LangChain Cookbook — 7 Core Concepts” working on my end, including the last recipe.

There were times when I thought about giving up because too many things had become deprecated, but I successfully got all the samples to work.

While working on this, there were various related topics I couldn’t fully investigate, so I’d like to make time to research them, explore other samples, and introduce them in another post if possible.

Thank you for reading!

Thank you for reading all the way to the end of this four-part blog series. I hope you found these posts informative and insightful. Your continued interest has been a great motivation for me to share my knowledge in AI and language technologies.

This concludes our current series, but the journey into AI and language technologies is far from over. If you have any questions about these blogs or are interested in discussing OpenAI API, LLM, or LangChain-related development projects, I’d be delighted to hear from you. Please feel free to contact me directly at:

mizutori@goldrushcomputing.com

At Goldrush Computing, we pride ourselves on our expertise as a Japanese company with native speakers. We specialize in developing prompts and RAG systems tailored for Japanese language and culture. If you’re seeking to optimize AI solutions for the Japanese market or create Japanese-language applications, we’re uniquely positioned to assist you. Don’t hesitate to reach out for collaborations or projects that require Japan-specific AI optimization.


Taka Mizutori
LLM Study Diary: A Beginner’s Path Through AI

Founder and CEO of Goldrush Computing Inc (https://goldrushcomputing.com). Keep making with Swift, Kotlin, Java, C, Obj-C, C#, Python, JS, and Assembly.