LangChain Basics — Part 1

Sze Zhong LIM
Published in Data And Beyond · 5 min read · Jan 1, 2024

I recently took the LangChain for LLM Application Development course, which is free to enroll in on DeepLearning.ai. This article is a mix of summary, self-reflection, and further exploration of the code itself.

Photo by Joshua Hoehne on Unsplash

Models, Prompts and Output Parsers

The original code can be found below.

Summary: The lesson focused on showing how to make direct API calls to OpenAI and how to use LangChain for structured prompting. It also showed how LangChain can turn the plain string output from OpenAI into a parsable structure. After the lesson, I had a few thoughts on how, simply by combining ChatPromptTemplate with for loops and conditionals, we could help users who write simple prompts get better responses. It would still require some prompt engineering on the backend, but it would make the overall user experience much better, especially since the business knows its own products, services, and internal systems best.

openai.ChatCompletion.create() opens a connection to OpenAI and returns a response based on your prompt.
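A minimal sketch of such a call, assuming the pre-1.0 openai Python SDK used in the course (the get_completion helper name is just illustrative):

```python
import openai

def get_completion(prompt, model="gpt-3.5-turbo"):
    # Wrap the prompt in the chat-style message format the API expects.
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # keep the output as deterministic as possible
    )
    return response.choices[0].message["content"]

print(get_completion("What is 1+1?"))
```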

Now, instead of using the openai module directly, we use langchain.chat_models.ChatOpenAI to connect to OpenAI.
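A minimal sketch, assuming an older (pre-0.1) LangChain release where ChatOpenAI still lives in langchain.chat_models (newer releases moved it to the langchain-openai package):

```python
from langchain.chat_models import ChatOpenAI

# temperature=0.0 keeps the answers less random, which is handy for testing.
chat = ChatOpenAI(temperature=0.0)
print(chat)  # shows the default model name and settings
```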

ChatPromptTemplate.from_template() creates a reusable template. It takes a template string and infers the input variables that will be needed in the next step.

ChatPromptTemplate Part 1
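A minimal sketch of this step; the template string and its placeholders are my own illustrative ones, not the exact ones from the course notebook:

```python
from langchain.prompts import ChatPromptTemplate

# Every {placeholder} in the template becomes an input variable.
template_string = (
    "Translate the text that is delimited by angle brackets "
    "into a style that is {style}. text: <{text}>"
)
prompt_template = ChatPromptTemplate.from_template(template_string)

# LangChain infers the input variables from the braces in the template.
print(prompt_template.messages[0].prompt.input_variables)  # ['style', 'text']
```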

We can then call .format_messages() on the prompt template, which returns a list of messages.

ChatPromptTemplate Part 2
ChatPromptTemplate Part 3. Just another example.
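A minimal, self-contained sketch of the formatting-and-calling step (the template, style, and text values are illustrative placeholders):

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

chat = ChatOpenAI(temperature=0.0)

template_string = (
    "Translate the text that is delimited by angle brackets "
    "into a style that is {style}. text: <{text}>"
)
prompt_template = ChatPromptTemplate.from_template(template_string)

# format_messages() fills in the placeholders and returns a list of messages.
customer_messages = prompt_template.format_messages(
    style="polite American English",
    text="Arrr, me blender lid flew off and splattered me walls with smoothie!",
)
print(type(customer_messages))   # <class 'list'>
print(customer_messages[0])      # a HumanMessage holding the rendered prompt

# The list can be passed straight to the chat model.
customer_response = chat(customer_messages)
print(customer_response.content)
```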

The StructuredOutputParser takes in a list of ResponseSchema objects via from_response_schemas(). You can then call get_format_instructions() to output a Markdown code snippet describing the expected JSON format.

Output Parsers (Part 1)

It is important to note that StructuredOutputParser takes a list, so even if there is only one ResponseSchema, we still need to wrap it in a list.

Output Parsers (Part 2)
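A minimal sketch of the parsing flow; the gift schema is illustrative, and the model_reply string stands in for what the model would return when it follows the format instructions:

```python
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

# Even a single ResponseSchema must be wrapped in a list.
gift_schema = ResponseSchema(
    name="gift",
    description="Was the item purchased as a gift? Answer True or False.",
)
output_parser = StructuredOutputParser.from_response_schemas([gift_schema])

# Produces a Markdown json code snippet describing the expected keys,
# which you append to your prompt.
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

# Stand-in for the model's reply once it follows the format instructions.
model_reply = '{"gift": "True"}'
output_dict = output_parser.parse(model_reply)
print(output_dict["gift"])
```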

My own record can be found below.

Memory

The original code can be found below.

Summary: The lesson focused on showing how to provide the model with the whole context of the conversation instead of only the last few inputs. It also showed how we can use the model to generate a summary as a workaround for the token limit. My thought on this topic is that the generated summary is only as good as the model's interpretation of the conversation. It is possible for the model to interpret the conversation wrongly and produce a misleading summary. It might be even better to have a human countercheck the content of the summary to ensure quality results.

Using ConversationChain, we can plug in a memory buffer so the chain remembers the previous conversation. The memory buffer is a ConversationBufferMemory.

Putting the ConversationBufferMemory into the ConversationChain
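A minimal sketch of wiring the memory into the chain (the example inputs are illustrative):

```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()

# verbose=True prints the full prompt on every call, so you can see the
# buffered history being fed back to the model.
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

conversation.predict(input="Hi, my name is Andrew")
conversation.predict(input="What is 1+1?")
conversation.predict(input="What is my name?")  # answered from the remembered history
```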

We can see what a ConversationBufferMemory does below. It stores the saved context as part of the memory so that the AI can respond to the whole conversation and understand the full context.
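A minimal sketch of filling and inspecting the buffer manually:

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})

print(memory.buffer)                     # the raw Human/AI transcript
print(memory.load_memory_variables({}))  # {'history': "Human: Hi\nAI: What's up\n..."}
```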

We can use ConversationBufferWindowMemory to limit the number of exchanges saved. We can see below that when we set k=1, it only remembers the last context saved.
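A minimal sketch with k=1 (the exchanges are illustrative):

```python
from langchain.memory import ConversationBufferWindowMemory

# k=1 keeps only the single most recent exchange.
memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})

# Only the last exchange survives in the history.
print(memory.load_memory_variables({}))
```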

We can change the Human / AI labels to other names by adjusting the ai_prefix and human_prefix arguments. Note that the input key is not really important in this particular context.
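A minimal sketch of renaming the prefixes; "Customer" and "Agent" are arbitrary labels chosen for this illustration:

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(human_prefix="Customer", ai_prefix="Agent")
memory.save_context({"input": "Hi"}, {"output": "What's up"})

# The stored history now reads "Customer: Hi\nAgent: What's up".
print(memory.load_memory_variables({}))
```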

ConversationTokenBufferMemory is similar to ConversationBufferWindowMemory, except that the buffer is limited by the number of tokens rather than the number of exchanges. This example uses tiktoken, which can be found here. In short, tiktoken is a fast Byte Pair Encoding (BPE) tokenizer for use with OpenAI models. Below are some examples of how different token limits affect the memory.
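A minimal sketch with a small max_token_limit (the exchanges are illustrative):

```python
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationTokenBufferMemory

# The llm is passed in so the memory can count tokens (via tiktoken for OpenAI models).
llm = ChatOpenAI(temperature=0.0)
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
memory.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})

# Older exchanges are dropped once the 30-token budget is exceeded.
print(memory.load_memory_variables({}))
```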

ConversationSummaryBufferMemory is similar to ConversationTokenBufferMemory, with the difference that if the conversation exceeds the token limit, it will get the model to summarize the overflowing part so that the whole memory fits within the token limit.

We can do a comparison using the example from ConversationTokenBufferMemory earlier. We can see that when the token limit is higher than the actual number of tokens, the conversation history remains exactly the same. However, when the actual tokens exceed the token limit, the memory gains an additional "System" summary. In this particular case, the summary simply reflected what the model generated, which was that there was not enough information for a summary.
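A minimal sketch of that comparison, using an illustrative schedule message and two different token limits:

```python
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationSummaryBufferMemory

llm = ChatOpenAI(temperature=0.0)
schedule = (
    "There is a meeting at 8am with your product team. "
    "You will need your PowerPoint presentation prepared."
)

def build_memory(limit):
    memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=limit)
    memory.save_context({"input": "Hello"}, {"output": "What's up"})
    memory.save_context({"input": "What is on the schedule today?"}, {"output": schedule})
    return memory

# Generous limit: the raw history is kept verbatim.
print(build_memory(400).load_memory_variables({}))

# Tight limit: the overflow is replaced by a model-written "System:" summary.
print(build_memory(50).load_memory_variables({}))
```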

My own record can be found below.
