Using LangChain for Large Language Model Application Development

A step-by-step guide to creating LLM applications with LangChain

Jul 24, 2023

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) and transformed the way we interact with and process text-based data. Powerful models such as OpenAI’s GPT-4 can comprehend and generate human-like text, which has led to a multitude of new applications across various industries.

LangChain is an open-source framework for building applications powered by Large Language Models such as GPT. It enables applications to connect a language model to other sources of data and allows a language model to interact with its environment.

In this blog, we discuss how to use LangChain for LLM-based application development. By prompting an LLM, it is now possible to develop AI applications much faster than before. However, an LLM-based application usually has to prompt the model many times and parse its output each time, which means writing a lot of glue code. LangChain makes this process easier by providing abstractions for the steps that recur in such applications. The content of this blog is mainly based on the short course LangChain for LLM Application Development.

Langchain for large language model development

Overview of the LangChain Framework

LangChain is an open-source framework for developing applications that combine Large Language Models (LLMs) such as GPT-4 with external data. It is available as Python and JavaScript (TypeScript) packages. LangChain focuses on composition and modularity: its components can be used on their own or in conjunction with each other, and they can be combined into end-to-end applications across many use cases.

Key Components of LangChain

LangChain emphasizes flexibility and modularity. It divides the natural language processing pipeline into separate modular components, enabling developers to tailor workflows according to their needs. The LangChain framework can be divided into six modules, each covering a different aspect of the interaction with the LLM. The list below shows five of them; the sixth, Memory, is covered later in this post.

Models:
– LLMs — 20+ integrations
– Chat Models
– Text Embedding Models — 10+ integrations
Prompts:
– Prompt Templates
– Output Parsers — 5+ integrations
– Example Selectors — 10+ integrations
Indexes:
– Document Loaders: 50+ integrations
– Text Splitters: 10+ integrations
– Vector Stores: 10+ integrations
– Retrievers: 5+ integrations/implementations
Chains:
– Prompt + LLM + Output parsing
– Can be used as a building block for longer chains
– More application-specific chains: 20+ types
Agents:
– Agents are a type of end-to-end use case which uses the model as a reasoning engine
– Agent Types: 5+ types
– Agent Toolkits: 10+ implementations
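
To make the "Prompt + LLM + Output parsing" building block concrete, here is a minimal plain-Python sketch of a chain. It does not use LangChain: `build_prompt`, `fake_llm`, `parse_output`, and `run_chain` are hypothetical names, and the model is a canned stand-in rather than a real LLM call.

```python
import json

# Hypothetical stand-ins for the three stages of a simple chain.
# None of these names come from LangChain.

def build_prompt(review: str) -> str:
    """Prompt stage: fill a fixed template with the user's text."""
    return f"Extract the delivery time as JSON from this review: {review}"

def fake_llm(prompt: str) -> str:
    """Model stage: a canned reply standing in for a real LLM call."""
    return '{"delivery_days": 2}'

def parse_output(raw: str) -> dict:
    """Parsing stage: turn the model's text into a Python dictionary."""
    return json.loads(raw)

def run_chain(review: str) -> dict:
    """Chain: prompt -> model -> parser, each stage feeding the next."""
    return parse_output(fake_llm(build_prompt(review)))

result = run_chain("It arrived in two days.")
print(result["delivery_days"])
```

Because each stage is a separate function, any one of them can be swapped out (a different prompt, a real model, a stricter parser) without touching the others, which is the kind of modularity the list above describes.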

Models

Models are the core element of any LLM application. Here, a model is the language model that underpins the application. LangChain provides the building blocks to interface with any language model, with interfaces and integrations for two types of models:

LLMs— Models that take a text string as input and return a text string
Chat Models — Models that are backed by a language model but take a list of Chat Messages as input and return a Chat Message

# ChatOpenAI is LangChain's abstraction for the ChatGPT API endpoint.
# It expects the OPENAI_API_KEY environment variable to be set.
from langchain.chat_models import ChatOpenAI

# temperature controls the randomness and creativity of the generated text;
# 0.0 makes the output as deterministic as possible
chat = ChatOpenAI(temperature=0.0)
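
The difference between the two interfaces can be sketched in plain Python. This is only an illustration of the input and output shapes, not LangChain code: `Message`, `fake_llm`, and `fake_chat_model` are hypothetical names, and both "models" return canned text.

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # e.g. "system", "human", or "ai"
    content: str

def fake_llm(prompt: str) -> str:
    """LLM-style interface: text string in, text string out."""
    return f"(completion for: {prompt})"

def fake_chat_model(messages: list) -> Message:
    """Chat-style interface: list of messages in, one message out."""
    last_human = [m for m in messages if m.role == "human"][-1]
    return Message(role="ai", content=f"(reply to: {last_human.content})")

text_out = fake_llm("Say hello")
chat_out = fake_chat_model([
    Message("system", "You are a helpful assistant."),
    Message("human", "Say hello"),
])
print(text_out)
print(chat_out.role, chat_out.content)
```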

Prompts

Prompts are the new way to program models. A prompt is the input passed to the model, and it is often constructed from multiple components. Prompt templates and example selectors provide the main classes and functions to construct and work with prompts easily.

We will define a template string and use it to create a prompt template with ChatPromptTemplate from LangChain.

Prompt Template

# Define a template string
template_string = """Translate the text that is delimited by triple backticks \
into a style that is {style}. text: ```{text}```
"""

# Create a prompt template using above template string
from langchain.prompts import ChatPromptTemplate
prompt_template = ChatPromptTemplate.from_template(template_string)

The above prompt_template has 2 fields, namely style and text. We can also extract the original template string from this prompt template. Now, if we want to translate a text into some other style, we need to define our style and text for translation.
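
As a rough illustration of how a template can know which fields it has, the sketch below finds the placeholders in a format-style string using only the Python standard library. It is not LangChain's implementation; `input_variables` is a hypothetical helper, and the template is a simplified version of the one above without the backtick delimiters.

```python
from string import Formatter

# Simplified version of the template above (backtick delimiters omitted).
template_string = (
    "Translate the following text into a style that is {style}. "
    "text: {text}"
)

def input_variables(template: str) -> list:
    """Return the placeholder names found in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]

fields = input_variables(template_string)
print(fields)
```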

customer_style = """American English in a calm and respectful tone
"""

customer_email = """
Arrr, I be fuming that me blender lid flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help right now, matey!
"""

Here, we set the style to American English in a calm and respectful tone. The template instructs the model to translate the text delimited by triple backticks into the given style. We then fill in the style (customer_style) and the text (customer_email) and pass the resulting prompt to the LLM.

# format_messages fills the template with the style and text and returns
# the list of messages that will be passed to the LLM.
customer_messages = prompt_template.format_messages(
                    style=customer_style,
                    text=customer_email)

# Call the LLM to translate the customer message into the requested style.
customer_response = chat(customer_messages)
print(customer_response.content)

As we build sophisticated applications, prompts can become quite long and detailed. We use prompt templates instead of f-strings because they are useful abstractions that help us reuse good prompts: we create a template once and then specify a different output style and text each time we call the model.
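
The reuse argument can be shown with plain string formatting. An f-string is filled in once, at the point where it is written, whereas a template keeps its placeholders and can be filled in repeatedly. The sketch below is plain Python rather than LangChain, uses a simplified template without the backtick delimiters, and the two requests are hypothetical inputs invented for this illustration.

```python
# Simplified version of the translation template (backtick delimiters omitted).
template_string = (
    "Translate the following text into a style that is {style}. "
    "text: {text}"
)

# Hypothetical inputs, invented for this illustration.
requests = [
    {"style": "American English in a calm and respectful tone",
     "text": "Arrr, me blender lid flew off!"},
    {"style": "a formal business tone",
     "text": "hey, the warranty doesn't cover cleaning, sorry"},
]

# The same template is reused for every request.
prompts = [template_string.format(**request) for request in requests]
for prompt in prompts:
    print(prompt)
```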

LangChain also provides prompts for some common operations, such as summarization, question answering, connecting to SQL databases, or connecting to different APIs. By using some of LangChain’s built-in prompts, we can quickly get an application working without needing to engineer our own prompts.

Output Parser

Another aspect of LangChain’s prompt library is that it supports output parsing. An output parser takes the output of a model and parses it into a more structured format so that we can perform downstream tasks with it.

When we build complex applications using LLMs, we often instruct the LLM to generate its output in a certain format, such as using specific keywords. LangChain’s library functions then parse the LLM’s output on the assumption that it uses those keywords.

For example, we can have an LLM output JSON and use LangChain to parse that output, as in the following example.

We first need to define how we would like the LLM output to be formatted. In this case, we define a Python dictionary with fields stating whether or not a product is a gift, the number of days it took to deliver, and what the review says about the price or value.

# Following is one example of the desired output.
{
  "gift": False,
  "delivery_days": 5,
  "price_value": "pretty affordable!"
}

We store the customer review in a triple-quoted string and define the following review template.

# An example customer review and a template that tries to extract the desired output
customer_review = """\
Need to be actual review
"""

review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,\
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

# Create a prompt template from the review template so that the customer
# review can be filled in and the output requested in the desired format.

from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(review_template)
print(prompt_template)

# Build the messages from the prompt template and the customer review,
# then pass the messages to the OpenAI endpoint to get a response.

messages = prompt_template.format_messages(text=customer_review)
chat = ChatOpenAI(temperature=0.0)
response = chat(messages)
print(response.content)

The above response is still a string, not a dictionary. We need to parse the LLM output string into a Python dictionary. To do this with LangChain, we define a ResponseSchema for each field of the dictionary. For the sake of brevity, I am not including the code snippet for this; it can be found in my GitHub notebook. This is a very good way to take LLM output and parse it into a Python dictionary, making it easier to use in downstream processing.
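
To show the general idea of turning the response string into a dictionary, here is a minimal sketch that uses only the standard `json` module. It is a stand-in for illustration, not the ResponseSchema-based approach from the notebook: `parse_json_reply` is a hypothetical helper, the reply text is invented, and the sketch assumes the model may wrap its JSON in a Markdown code fence.

```python
import json

FENCE = "`" * 3  # three backticks, the Markdown code-fence marker

def parse_json_reply(reply: str) -> dict:
    """Parse a model reply that should contain a single JSON object.

    Any Markdown fence lines around the JSON are dropped before decoding.
    """
    lines = [line for line in reply.strip().splitlines()
             if not line.strip().startswith(FENCE)]
    return json.loads("\n".join(lines))

# Hypothetical reply text, invented for this illustration.
reply = "\n".join([
    FENCE + "json",
    '{"gift": false, "delivery_days": 5, "price_value": "pretty affordable!"}',
    FENCE,
])

parsed = parse_json_reply(reply)
print(type(parsed).__name__, parsed["delivery_days"])
```

Once the reply is a dictionary, fields such as `parsed["gift"]` can be used directly in downstream code instead of searching through a string.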

ReAct Framework

LLM output can also be structured with keywords. In a framework called ReAct, the LLM uses keywords such as Thought, Action, and Observation to carry out chain-of-thought reasoning. Thought is what the LLM thinks; giving an LLM space to think helps it reach more accurate conclusions. Action is the keyword for carrying out a specific action, and Observation is the keyword that shows what the LLM learnt from that action. If we have a prompt that instructs the LLM to use these keywords, the prompt can be coupled with a parser that extracts the text tagged with each of them.
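
A keyword-based parser of this kind can be sketched with a regular expression. The code below is plain Python, not LangChain's own parser: `parse_tagged_lines` is a hypothetical helper, the model output is invented for illustration, and the sketch assumes each keyword starts its own line.

```python
import re

# Hypothetical model output, invented for this illustration.
llm_output = """Thought: I need to multiply 12 by 7.
Action: calculator[12 * 7]
Observation: 84
Thought: I now know the answer."""

# One tagged line: keyword, colon, optional spaces, then the content.
TAGGED_LINE = re.compile(
    r"^(Thought|Action|Observation):[ \t]*(.*)$", re.MULTILINE
)

def parse_tagged_lines(text: str) -> list:
    """Return (keyword, content) pairs for lines tagged with a ReAct keyword."""
    return TAGGED_LINE.findall(text)

steps = parse_tagged_lines(llm_output)
for keyword, content in steps:
    print(f"{keyword} -> {content}")
```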

Memory

Large Language Models do not, by themselves, remember previous conversations.

When you interact with these models, they do not remember what you said earlier or any previous conversation. This is an issue when you are building applications such as a chatbot, where you want to hold a conversation with the model.
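
A common way around this is to store the conversation and send it back to the model with every new message. The sketch below shows that idea in plain Python. It is not LangChain's memory module: `ConversationBuffer` and `fake_llm` are hypothetical names, and the canned model can only "remember" a name if the name appears in the prompt it receives.

```python
class ConversationBuffer:
    """Keeps every turn so the full history can be replayed to the model."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) tuples

    def add(self, speaker: str, text: str) -> None:
        self.turns.append((speaker, text))

    def as_prompt(self, new_input: str) -> str:
        """Build a prompt containing the stored history plus the new input."""
        history = "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)
        return f"{history}\nHuman: {new_input}\nAI:".lstrip()

def fake_llm(prompt: str) -> str:
    """Canned model: answers correctly only if the prompt contains the name."""
    if "my name is sam" in prompt.lower():
        return "Your name is Sam."
    return "I don't know your name."

# Without memory, the model only sees the latest question.
without_memory = fake_llm("Human: What is my name?\nAI:")

# With memory, earlier turns are included in the prompt.
memory = ConversationBuffer()
memory.add("Human", "Hi, my name is Sam.")
memory.add("AI", "Hello Sam!")
with_memory = fake_llm(memory.as_prompt("What is my name?"))

print(without_memory)
print(with_memory)
```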

With models, prompts, and parsers, we can reuse our own prompt templates, share them with others, or use LangChain’s built-in prompt templates. These can be coupled with an output parser so that the model produces output in a specific format, which the parser then stores in a dictionary or some other data structure that makes downstream processing easier.

I will discuss chains and agents in my next blog, and question answering over our own data in a separate blog. To conclude, by prompting an LLM it is now possible to develop AI applications much faster than ever before. But an application can require prompting an LLM multiple times and parsing its output, so there is a lot of glue code that needs to be written. LangChain helps to make this process easier.