LangChain🦜🔗 … Unleash true power of LLMs!

Build chained and Agent based LLM apps, with few lines of code 🚀

Momal Ijaz
AIGuys
7 min read · Jun 21, 2023


Generated by Imagine.AI stable diffusion. Prompt: “A cute little green parrot sitting on a big metallic chain”

LangChain 🦜🔗 is an open-source framework used to build LLM-based applications effortlessly. The idea of building an LLM application seems pretty easy at first sight… all you need is a paid API subscription to a big tech giant with a good LLM, like OpenAI’s GPT series, Anthropic’s Claude series, or Google’s Bard API (still in beta, and you gotta wait on a waitlist though!), and a really good prompt… but building an LLM application can go way beyond that!

I don’t need a “tool” for making my LLM app 😒

📚LLMs are usually thought of as knowledge houses that have seen tons and tons of data from the internet, have gone from student to domain specialist (with humans in the training loop), and can answer almost any question you have with a certain level of correctness… but most of the time, the application you want to build requires more than just a generic overview of all domains. Say you want to build an LLM that can answer people’s questions about immigration and visas for the US. You certainly:

a. Don’t want your app to give incorrect information to users. (aka avoid hallucinations) 🤷‍♀️

b. You want to ensure your LLM knows and understands all the laws perfectly. 🤔

c. You want it to handle follow-up questions correctly for each user (i.e. build and maintain a separate context for each user). 🤖

d. You might need access to external data sources like user profiles to allow the LLM to make a completely informed decision (connecting the LLM to external data sources). 🏭

e. You also might want your LLM to decide which rule book it needs to look into based on the user’s query (decision-making about the usage of external tools… Agents!). 👨🏻‍🏫

These concerns give us a nice high-level view of how many aspects we need to look into when building an LLM application, and it definitely goes beyond a good prompt and an API key.

That’s exactly where LangChain🦜🔗 comes into the picture!

LangChain 🦜🔗 allows you to build your LLM applications with chained outputs, smart prompts, elegant context building, and seamless integrations with external data sources and tools. If you feel like you can tackle all these scenarios on your own, with a lot more control over your app, you can do so, but for beginners, LangChain is definitely worth a try to get an insight into just how powerful LLMs can get!

What can LangChain🦜🔗 do for me? 👀

Let’s dive into each of the core features, one by one.

1. Models, Prompts, and Parsers 🦾

LangChain allows you to choose a specific type of model, create prompt templates, and parse your model’s response into the desired format. This could be done manually with OpenAI’s API, but with LangChain we can do all of it in just a few lines of code.

LangChain🦜🔗 supports three different types of Models:

a. Language Model: Takes a text prompt as input and returns a text string in response.

b. Chat Model: These models take a list of chat messages as input and return the next response in the chat flow.

c. Text Embedding Model: These models are used to embed a piece of text. They take a text string as input and return a list of floats (embeddings).

LangChain🦜🔗 makes prompt creation easier by providing the PromptTemplate class. This is a pretty useful feature because, in sophisticated applications, prompts can get long, and prompt templates make prompt reusability much easier. In addition, there are also off-the-shelf common prompts available in LangChain🦜🔗, e.g. for connecting with SQL/remote instances, for QnA or summarisation, etc.

Finally, output parsers are used to extract a structured response from the textual output of an LLM. Output parsers tell the LLM what format to respond in, parse the response into the desired structure, and handle errors and retries if the format is not as expected.
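Here is a minimal sketch that ties the three pieces together, assuming an OpenAI API key is configured; the review text and the sentiment schema are made up for illustration:

from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

# 1. Model: a chat model wrapper around OpenAI
llm = ChatOpenAI(temperature=0.0)

# 2. Prompt: a reusable template with input variables
template = "Extract the sentiment (positive/negative) from this review:\n{review}\n{format_instructions}"
prompt = ChatPromptTemplate.from_template(template)

# 3. Parser: asks the model for JSON and parses it back into a Python dict
schemas = [ResponseSchema(name="sentiment", description="positive or negative")]
parser = StructuredOutputParser.from_response_schemas(schemas)

messages = prompt.format_messages(
    review="The visa guide was clear and saved me hours.",
    format_instructions=parser.get_format_instructions(),
)
response = llm(messages)
result = parser.parse(response.content)  # e.g. {"sentiment": "positive"}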

2. Memory 🧠

Memory is the data that you want your LLM to use to answer your query. It can be:

a. The context of earlier conversations you have had with the LLM.

b. Any data that you want it to retrieve from the external data source.

LLMs are stateless: every transaction or API call made is independent of the previous one. Efficient context memorization, or memory management, is therefore really important for LLM-based applications.

We will talk about the first scenario here. LangChain provides many options for memory buffers to remember the context of the transactions done with the LLM so far. A few memory types are listed below:

1. ConversationBufferMemory: This type of memory can be passed into a ConversationChain object along with an LLM; then, when you call conversation.predict, LangChain appends all the previous transactions (prompt/response pairs) as context to the follow-up questions you ask the LLM.
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(llm=llm, memory=memory)
conversation.predict(input="Hi, I am Momal")
#Output: "Hey Momal! I am AI, How may I help you?"
conversation.predict(input="What is 3+4?")
#Output: "It's 7"
conversation.predict(input="What's my name?")
#Output: "Your name is Momal, As you said earlier"

2. ConversationBufferWindowMemory: This type of memory allows you to specify a window size, that is, the number of last k transactions that you want to pass as context to your LLM. This optimizes the number of tokens passed to the model, which is directly related to the API cost.

from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1) #Remember just last transaction
conversation = ConversationChain(llm=llm, memory=memory)
conversation.predict("Hi I am Momal")
#Output: "Hey Momal! I am AI, How may I help you?"
conversation.predict("What is 3+4?")
#Output: "It's 7"
conversation.predict("What's my name?")
#Output: "I am an AI, I don't have access to that info"

3. ConversationTokenBufferMemory: This type of memory limits the stored context to the latest n tokens, directly capping the API cost per transaction. Here we chop off the LLM’s context at the most recent max_token_limit tokens.

from langchain.memory import ConversationTokenBufferMemory

# Keep only the most recent 30 tokens of the conversation as context
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)

4. ConversationSummaryBufferMemory: This type of memory stores a summary of the conversation instead of the raw transcript. We also specify a token budget: the most recent transactions are kept verbatim within the max_token_limit, and anything older is folded into an LLM-generated summary that sits at the start of the context.
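A minimal sketch, reusing the llm and ConversationChain from the earlier examples; the example input is made up:

from langchain.memory import ConversationSummaryBufferMemory

# Recent turns are kept verbatim up to ~100 tokens; older turns get summarized by the LLM
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
conversation = ConversationChain(llm=llm, memory=memory)
conversation.predict(input="I am preparing an H-1B petition for a software engineer role.")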

5. VectorDataMemory: The context is stored in the form of numerical vector representations/embeddings of the transactions, and the most relevant chunk of transactions is retrieved to answer a given query.

6. Entity Memories: If there are a lot of entities with varying attributes in your conversation that you want your LLM to remember, this is the ideal type of memory buffer to use (a short sketch follows below).
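For entity memory specifically, LangChain provides a ConversationEntityMemory class. A minimal sketch, where the people and facts in the example inputs are made up for illustration:

from langchain.memory import ConversationEntityMemory
from langchain.memory.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE

# The entity memory uses the LLM itself to spot entities and keep a small summary per entity
memory = ConversationEntityMemory(llm=llm)
conversation = ConversationChain(
    llm=llm,
    memory=memory,
    prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,  # prompt that knows how to use the entity store
)
conversation.predict(input="Momal is building a visa-advice bot with her colleague Ali.")
conversation.predict(input="What is Ali working on?")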

3. Chains 🔗

Chains allow us to group an LLM with prompts and easily carry out the required sequence of operations on our data sources. We have the following types of chains:

a. LLMChain✨: This is the most basic type of chain, comprising a PromptTemplate, an LLM, and an optional output parser. This chain can take multiple input variables, format them into a prompt using the passed template, and pass the prompt to the LLM; finally, it optionally parses the model’s output using the defined output parser.
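A minimal sketch, reusing the llm from the memory examples; the product prompt is just an illustration:

from langchain.chains import LLMChain
from langchain.prompts import ChatPromptTemplate

# One prompt + one LLM call, wrapped as a reusable chain
name_prompt = ChatPromptTemplate.from_template(
    "Suggest one short company name for a maker of {product}."
)
name_chain = LLMChain(llm=llm, prompt=name_prompt)
name_chain.run(product="ergonomic keyboards")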

b. SimpleSequentialChain⛓: This chain allows us to group LLMChains sequentially by passing the output of one LLMChain into the next. It is a good fit when our application, and each sub-LLMChain, has exactly one input and one output.
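A sketch that builds on the LLMChain above; the slogan prompt is made up for illustration:

from langchain.chains import SimpleSequentialChain

slogan_prompt = ChatPromptTemplate.from_template(
    "Write a one-line slogan for this company: {company_name}"
)
slogan_chain = LLMChain(llm=llm, prompt=slogan_prompt)

# The single output of name_chain becomes the single input of slogan_chain
overall_chain = SimpleSequentialChain(chains=[name_chain, slogan_chain], verbose=True)
overall_chain.run("ergonomic keyboards")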

c. SequentialChain 🖇: This chain is used when we have sub-LLMChains with multiple inputs and outputs. Each sub-chain declares an output_key, so every intermediate output is named and can be passed into later chains or returned at the end.
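A minimal sketch with named intermediate outputs; the prompts and the German review are made up for illustration:

from langchain.chains import SequentialChain

translate_chain = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template("Translate this review to English:\n{review}"),
    output_key="english_review",
)
summary_chain = LLMChain(
    llm=llm,
    prompt=ChatPromptTemplate.from_template("Summarize this review in one sentence:\n{english_review}"),
    output_key="summary",
)

# The named outputs flow between sub-chains via their output_key values
overall_chain = SequentialChain(
    chains=[translate_chain, summary_chain],
    input_variables=["review"],
    output_variables=["english_review", "summary"],
)
overall_chain({"review": "Der Visum-Leitfaden war sehr hilfreich und hat mir Stunden gespart."})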

In addition to the above, LangChain offers many other types of chains that can be used for many other scenarios and use cases.

In the next part of the article, we will cover “Question and Answer over documents using LangChain”, “Agents”, and “Evaluating LLM applications using LangChain”. All these concepts are really cool, nascent, and still evolving. Exciting times we are living in!
Happy Learning! ❤️


