How to develop a chatbot using the open-source LLM Mistral-7B, LangChain Memory, ConversationChain, and Flask.

Devvrat Rana
21 min read · May 27, 2024


image credit: https://www.freepik.com/ai/image-generator

Introduction:

Welcome to our blog, where we dive headfirst into the fascinating realm of memory within Large Language Models (LLMs). In LLM applications, particularly those with conversational interfaces, the ability to recall information from past interactions is paramount. Imagine chatting with a virtual assistant or a chatbot and wishing it could remember details from your previous conversations. Enter LangChain’s Memory module — the superhero that rescues our chat models from short-term memory limitations.

In the world of conversational AI, memory is the key to creating more natural and contextually relevant interactions. At its core, memory enables LLMs to store and retrieve information about past interactions with users, transforming them from stateless agents to intelligent conversational partners.

But what exactly is the Memory module in LangChain, and why is it so crucial?
In this blog, we’ll embark on a journey to unravel the mysteries of LangChain’s Memory module. We’ll explore its significance in enhancing conversational AI, how it operates, scenarios where it proves indispensable, and most importantly, how to seamlessly integrate it into your LLM applications.

So buckle up as we venture into the heart of LangChain’s Memory module.

In this blog, we’ll explore examples to understand the various types of memory components in LangChain, including:

  1. ConversationBufferMemory
  2. ConversationBufferWindowMemory
  3. ConversationSummaryMemory
  4. ConversationSummaryBufferMemory

By the end of this blog, you’ll have a comprehensive understanding of its role in revolutionizing conversational AI and unlocking new possibilities for creating more engaging and intuitive user experiences. Let’s dive in! 🚀✨

Memory System Fundamental Operations:

A memory system must facilitate two fundamental operations: reading and writing.

As we recall, each chain defines a core execution logic that anticipates specific inputs. While some of these inputs originate directly from the user, others may be retrieved from memory. In the course of a single execution, a chain interacts with its memory system twice.

  1. Upon receiving the initial user inputs, but prior to executing the core logic, a chain will read from its memory system and enrich the user inputs.
  2. After executing the core logic, but before returning the answer, a chain will write the inputs and outputs of the current run to memory. This ensures that they can be referenced in future runs.
Image Credit to https://python.langchain.com/docs/modules/memory/
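To make the two-step read/write contract concrete, here is a minimal, illustrative sketch. This is not LangChain's implementation; the class name is invented, and the method names merely mirror LangChain's memory interface (`load_memory_variables` / `save_context`) so the flow is easy to follow.

```python
class SimpleBufferMemory:
    """Minimal sketch of the read/write contract a memory system follows."""

    def __init__(self, memory_key="history"):
        self.memory_key = memory_key
        self.messages = []

    def load_memory_variables(self, inputs):
        # Step 1 (READ): called before the chain's core logic runs,
        # so the prompt can be enriched with past interactions.
        return {self.memory_key: "\n".join(self.messages)}

    def save_context(self, inputs, outputs):
        # Step 2 (WRITE): called after the core logic, before the answer
        # is returned, so future runs can reference this turn.
        self.messages.append(f"Human: {inputs['input']}")
        self.messages.append(f"AI: {outputs['output']}")
```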

Implementing memory in a system involves making two fundamental design decisions:

1. Storing: The storage of state typically involves maintaining a record of all chat interactions. While not all interactions may be directly utilized, they still need to be stored in some capacity. LangChain’s memory module offers a range of integrations for storing chat messages, spanning from in-memory lists to persistent databases.

2. Querying: Querying involves leveraging data structures and algorithms atop chat messages to retrieve relevant information. This encompasses various approaches, from simply returning recent messages to summarizing past interactions or extracting entities referenced in the current conversation.

Different applications may have distinct requirements for memory querying. Therefore, the memory module should facilitate easy implementation of both basic and custom memory systems to accommodate diverse needs.

LLM:

from langchain.llms import GPT4All

# Replace this path with your local model path
llm = GPT4All(model=r'C:\Users\91941\.cache\gpt4all\mistral-7b-openorca.gguf2.Q4_0.gguf')

Token_count:

from langchain.callbacks import get_openai_callback

def count_tokens(chain, query):
    with get_openai_callback() as cb:
        result = chain.run(query)
        print(f'Spent a total of {cb.total_tokens} tokens')
    return result

Memory Types: LangChain supports a variety of memory types, encompassing different data structures and algorithms to cater to varied querying needs.

Let’s take a look at what Memory actually looks like in LangChain. Here we’ll cover the basics of interacting with an arbitrary memory class.

  1. ConversationBufferMemory:
    ConversationBufferMemory is an extremely simple form of memory that just keeps a list of chat messages in a buffer and passes those into the prompt template.
    When using memory in a chain, there are a few key concepts to understand. Note that here we cover general concepts that are useful for most types of memory. Each individual memory type may very well have its own parameters and concepts that are necessary to understand.
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("hi! good morning")
memory.chat_memory.add_ai_message("very good morning, what's up?")

Prior to entering the chain, several variables are retrieved from memory. These variables must be named consistently with the variables expected by the chain. You can identify these variables by invoking memory.load_memory_variables({}). It’s important to note that the empty dictionary passed in serves as a placeholder for actual variables. If the memory type you’re utilizing relies on input variables, you may need to supply some.

memory.load_memory_variables({})
Output:

{'history': "Human: hi! good morning\nAI: very good morning, what's up?"}

ConversationBufferMemory Parameters:

a. memory_key:
In this scenario, the load_memory_variables function yields a solitary key, “history.” Consequently, your chain (and potentially your prompt) should anticipate an input labeled “history.” Typically, you can manage this variable through parameters on the memory class. For instance, if you prefer the memory variables to be returned under the key “chat_history,” you can specify:

memory = ConversationBufferMemory(memory_key="chat_history")
memory.chat_memory.add_user_message("hi! good morning")
memory.chat_memory.add_ai_message("very good morning, what's up?")
memory.load_memory_variables({})
Output:

{'chat_history': "Human: hi! good morning\nAI: very good morning, what's up?"}

b. return_messages:
One prevalent form of memory is the retrieval of a list of chat messages. These messages can be presented in two formats: either as a single concatenated string, which is beneficial when they will be inputted into Large Language Models (LLMs), or as a list of ChatMessages, which is advantageous when used with ChatModels.

By default, the messages are returned as a single string. However, if you prefer to receive them as a list of messages, you can enable this option by setting return_messages=True.

memory = ConversationBufferMemory(return_messages=True)
memory.chat_memory.add_user_message("hi! good morning")
memory.chat_memory.add_ai_message("very good morning, what's up?")
memory.load_memory_variables({})
{'history': [HumanMessage(content='hi! good morning'),
AIMessage(content="very good morning, what's up?")]}


Create a ConversationChain with ConversationBufferMemory and the OpenAI “gpt-3.5-turbo” LLM:

from langchain.chat_models import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.1, api_key=userdata.get('openaikey'))

conversation_buf_mem = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)
conversation_buf_mem("Good morning AI!")
{'input': 'Good morning AI!',
'history': '',
'response': 'Good morning! How are you today?'}
count_tokens(
    conversation_buf_mem,
    "What is Machine Learning"
)
Spent a total of 167 tokens
Machine learning is a subset of artificial intelligence that focuses on the development of algorithms and statistical models that enable computers to learn and make predictions or decisions without being explicitly programmed. It involves the use of data to train algorithms and improve their performance over time. There are different types of machine learning, including supervised learning, unsupervised learning, and reinforcement learning. Would you like more information on a specific aspect of machine learning?
count_tokens(
    conversation_buf_mem,
    "Please let me know how can I learn AI"
)
Spent a total of 288 tokens
There are many ways to learn AI! You can start by taking online courses or tutorials on platforms like Coursera, Udemy, or Khan Academy. There are also many books and resources available on the subject. Additionally, you can participate in AI workshops, attend conferences, or join AI communities to network with professionals in the field. It's important to practice coding and work on AI projects to gain hands-on experience. Remember, learning AI is a continuous process as the field is constantly evolving. Good luck on your AI learning journey!
count_tokens(
    conversation_buf_mem,
    "which is cloud is best for AI"
)
Spent a total of 429 tokens
There are several cloud platforms that are commonly used for AI, including Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. Each of these platforms offers a variety of AI services and tools, such as machine learning models, data analytics, and natural language processing. The best cloud platform for AI depends on your specific needs and preferences. AWS is known for its wide range of AI services and strong machine learning capabilities. Azure is popular for its integration with Microsoft products and services. Google Cloud Platform is known for its advanced AI and machine learning tools. I recommend exploring each platform to see which one aligns best with your AI projects and goals.

Now, observe how quickly we exhaust a considerable number of tokens, often surpassing the context window limit of even the most sophisticated LLMs available today.

Now, view the history to see all past conversations:

print(conversation_buf_mem.memory.buffer)
Human: Good morning AI!
AI: Good morning! How are you today?
Human: What is Machine Learning
AI: Machine learning is a subset of artificial intelligence that focuses on the development of algorithms and statistical models that enable computers to learn and make predictions or decisions without being explicitly programmed. It involves the use of data to train algorithms and improve their performance over time. There are different types of machine learning, including supervised learning, unsupervised learning, and reinforcement learning. Would you like more information on a specific aspect of machine learning?
Human: Please let me know how can I learn AI
AI: There are many ways to learn AI! You can start by taking online courses or tutorials on platforms like Coursera, Udemy, or Khan Academy. There are also many books and resources available on the subject. Additionally, you can participate in AI workshops, attend conferences, or join AI communities to network with professionals in the field. It's important to practice coding and work on AI projects to gain hands-on experience. Remember, learning AI is a continuous process as the field is constantly evolving. Good luck on your AI learning journey!
Human: which is cloud is best for AI
AI: There are several cloud platforms that are commonly used for AI, including Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. Each of these platforms offers a variety of AI services and tools, such as machine learning models, data analytics, and natural language processing. The best cloud platform for AI depends on your specific needs and preferences. AWS is known for its wide range of AI services and strong machine learning capabilities. Azure is popular for its integration with Microsoft products and services. Google Cloud Platform is known for its advanced AI and machine learning tools. I recommend exploring each platform to see which one aligns best with your AI projects and goals.

2. ConversationBufferWindowMemory:

The ConversationBufferWindowMemory feature introduces a window to the buffer memory, retaining only the most recent K interactions. While this approach reduces the number of tokens utilized, it also results in the loss of context for any inputs preceding the previous K interactions.
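The windowing behaviour can be sketched in a few lines of plain Python. This is not LangChain's code; the class name is invented, and a bounded `deque` stands in for the real implementation, but the dropping of the oldest turn once `k` is exceeded is the same idea.

```python
from collections import deque

class WindowBufferMemory:
    """Sketch: keep only the most recent k interactions."""

    def __init__(self, k=2):
        self.turns = deque(maxlen=k)  # each entry is one (human, ai) turn

    def save_turn(self, human, ai):
        # appending beyond maxlen silently drops the oldest turn
        self.turns.append((human, ai))

    def history(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)
```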

In the following example of ConversationBufferWindowMemory, we will use k=2, which means it will retain only the latest 2 interactions. Here are the details:

from langchain.chains.conversation.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=2)

conversation_buf_win_mem = ConversationChain(
    llm=llm,
    memory=memory
)
conversation_buf_win_mem("Good morning AI!")
{'input': 'Good morning AI!',
'history': '',
'response': 'Good morning! How are you today?'}
count_tokens(
    conversation_buf_win_mem,
    "What is Machine Learning"
)
Spent a total of 169 tokens
Machine learning is a subset of artificial intelligence that involves the development of algorithms and statistical models that enable computers to learn from and make predictions or decisions based on data without being explicitly programmed. It uses techniques such as neural networks, decision trees, and support vector machines to analyze and interpret patterns in data. Machine learning is used in a variety of applications, including image and speech recognition, medical diagnosis, financial forecasting, and autonomous vehicles.
count_tokens(
    conversation_buf_win_mem,
    "Please let me know how can I learn AI"
)
Spent a total of 298 tokens
There are many ways to learn AI! You can start by taking online courses or tutorials on platforms like Coursera, Udemy, or Khan Academy. You can also enroll in a formal degree program in computer science or artificial intelligence. Additionally, there are many books and resources available on AI that you can use to self-study. It's important to practice coding and work on projects to apply what you've learned. Joining AI communities and attending workshops or conferences can also help you stay updated on the latest developments in the field. Good luck on your AI learning journey!

Now, add a third conversation turn to the ConversationBufferWindowMemory. Observe that it retains only the latest 2 interactions since we set k=2, removing the oldest conversation from memory:

count_tokens(
    conversation_buf_win_mem,
    "which is cloud is best for AI"
)
Spent a total of 429 tokens
There are several cloud platforms that are commonly used for AI, each with its own strengths and features. Some popular options include Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), and IBM Watson. AWS offers a wide range of AI services such as Amazon SageMaker, which provides tools for building, training, and deploying machine learning models. Azure has services like Azure Machine Learning and Cognitive Services for AI applications. GCP offers tools like TensorFlow and Cloud Machine Learning Engine for machine learning projects. IBM Watson provides AI services for natural language processing, image recognition, and more. Ultimately, the best cloud platform for AI will depend on your specific needs and preferences.
print(conversation_buf_win_mem.memory.buffer)
Human: Please let me know how can I learn AI
AI: There are many ways to learn AI! You can start by taking online courses or tutorials on platforms like Coursera, Udemy, or Khan Academy. You can also enroll in a formal degree program in computer science or artificial intelligence. Additionally, there are many books and resources available on AI that you can use to self-study. It's important to practice coding and work on projects to apply what you've learned. Joining AI communities and attending workshops or conferences can also help you stay updated on the latest developments in the field. Good luck on your AI learning journey!
Human: which is cloud is best for AI
AI: There are several cloud platforms that are commonly used for AI, each with its own strengths and features. Some popular options include Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), and IBM Watson. AWS offers a wide range of AI services such as Amazon SageMaker, which provides tools for building, training, and deploying machine learning models. Azure has services like Azure Machine Learning and Cognitive Services for AI applications. GCP offers tools like TensorFlow and Cloud Machine Learning Engine for machine learning projects. IBM Watson provides AI services for natural language processing, image recognition, and more. Ultimately, the best cloud platform for AI will depend on your specific needs and preferences.

3. ConversationSummaryMemory:
When utilizing ConversationBufferMemory, we quickly exhaust a considerable number of tokens, often surpassing the context window limit of even the most sophisticated LLMs available today.

To mitigate excessive token usage, we can leverage ConversationSummaryMemory. As its name implies, this memory type summarizes the conversation history before it’s passed to the {history} parameter.

Internally, ConversationSummaryMemory employs a two-step process. First, it makes an additional call to the LLM with a prompt asking it to summarize the conversation history. The resulting summary is then passed to the history parameter along with the new input.

However, the performance of this approach hinges entirely on the summarization capabilities of the model.
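The two-step process can be sketched as follows. This is not LangChain's code: the class name is invented, and `summarize_fn` stands in for the extra LLM summarization call (which in the real ConversationSummaryMemory uses the prompt template printed in the next code block).

```python
class SketchSummaryMemory:
    """Sketch: keep a running summary instead of raw messages."""

    def __init__(self, summarize_fn):
        self.summarize = summarize_fn  # stand-in for the LLM summary call
        self.summary = ""

    def load_memory_variables(self, inputs):
        # the running summary, not the raw messages, fills {history}
        return {"history": self.summary}

    def save_context(self, inputs, outputs):
        new_lines = f"Human: {inputs['input']}\nAI: {outputs['output']}"
        # extra LLM call folds the new lines into the running summary
        self.summary = self.summarize(self.summary, new_lines)
```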

from langchain.chains.conversation.memory import ConversationSummaryMemory
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.1, api_key='openaikey')  # replace 'openaikey' with your API key

memory = ConversationSummaryMemory(llm=llm)

conversation_sum = ConversationChain(llm=llm, memory=memory, verbose=True)
print(conversation_sum.memory.prompt.template)
Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.

EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.

New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.

New summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.
END OF EXAMPLE

Current summary:
{summary}

New lines of conversation:
{new_lines}

New summary:
count_tokens(
    conversation_sum,
    "good morning"
)
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: good morning
AI:

> Finished chain.
Spent a total of 249 tokens
Good morning! How are you today?
count_tokens(
    conversation_sum,
    "What is Machine Learning"
)
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
The human greets the AI with "good morning." The AI responds with a friendly "Good morning! How are you today?"
Human: What is Machine Learning
AI:

> Finished chain.
Spent a total of 524 tokens
Machine learning is a type of artificial intelligence that allows computers to learn and improve from experience without being explicitly programmed. It involves algorithms that can analyze data, identify patterns, and make decisions or predictions based on that data. It is used in a wide range of applications, such as image and speech recognition, medical diagnosis, financial forecasting, and more. Would you like more information on a specific aspect of machine learning?
count_tokens(
    conversation_sum,
    "Please let me know how can I learn AI"
)
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
The human greets the AI with "good morning." The AI responds with a friendly "Good morning! How are you today?" The human asks about machine learning, and the AI explains that it is a type of artificial intelligence that allows computers to learn and improve from experience without being explicitly programmed. It involves algorithms that can analyze data, identify patterns, and make decisions or predictions based on that data, used in various applications like image and speech recognition, medical diagnosis, and financial forecasting.
Human: Please let me know how can I learn AI
AI:

> Finished chain.
Spent a total of 737 tokens
There are many ways you can learn AI! You can start by taking online courses or tutorials on platforms like Coursera, Udemy, or Khan Academy. You can also read books on the subject or attend workshops and seminars. Additionally, you can practice by working on AI projects or participating in AI competitions. It's important to stay updated on the latest developments in the field and continuously improve your skills. Good luck on your AI learning journey!
count_tokens(
    conversation_sum,
    "which cloud is best for AI"
)
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
The human greets the AI with "good morning." The AI responds with a friendly "Good morning! How are you today?" The human asks about machine learning, and the AI explains that it is a type of artificial intelligence that allows computers to learn and improve from experience without being explicitly programmed. It involves algorithms that can analyze data, identify patterns, and make decisions or predictions based on that data, used in various applications like image and speech recognition, medical diagnosis, and financial forecasting. The human then asks how they can learn AI, and the AI suggests taking online courses, reading books, attending workshops, practicing on projects, and staying updated on the latest developments in the field. Good luck on your AI learning journey!
Human: which cloud is best for AI
AI:

> Finished chain.
Spent a total of 870 tokens
There are several popular cloud platforms that are commonly used for AI, such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. Each of these platforms offers a variety of AI services and tools, including machine learning models, data processing capabilities, and infrastructure for training and deploying AI models. It ultimately depends on your specific needs and preferences, so I recommend exploring each platform to see which one aligns best with your goals and requirements.
conversation_sum.memory.buffer
The human greets the AI with "good morning." The AI responds with a friendly "Good morning! How are you today?" The human asks about machine learning, and the AI explains that it is a type of artificial intelligence that allows computers to learn and improve from experience without being explicitly programmed. It involves algorithms that can analyze data, identify patterns, and make decisions or predictions based on that data, used in various applications like image and speech recognition, medical diagnosis, and financial forecasting. The human then asks how they can learn AI, and the AI suggests taking online courses, reading books, attending workshops, practicing on projects, and staying updated on the latest developments in the field. The AI also mentions popular cloud platforms like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform for AI, recommending exploring each to see which aligns best with specific needs and preferences. Good luck on your AI learning journey and cloud platform exploration!

In this instance, we observe that the model summarized the ongoing conversation before it was provided to the history parameter. However, it’s important to note that there’s a possibility of losing context if the summary fails to capture all relevant information.

4. ConversationSummaryBufferMemory:

The ConversationSummaryBufferMemory combines features from both the ConversationSummaryMemory and the ConversationBufferWindowMemory. It condenses the earliest interactions in a conversation while retaining the most recent tokens, up to the max_token_limit, in their original format.

True to its name, this memory type maintains a summary of past interactions while simultaneously preserving a window of recent interactions in their raw form. This ensures that recent conversations are accessible with full context, while also safeguarding important information from the past in a concise summary. Ultimately, this approach reduces overall token usage.
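The hybrid behaviour can be sketched as follows. This is not LangChain's code: the class name is invented, `summarize_fn` stands in for the LLM summarization call, and the whitespace-split "tokenizer" is a deliberate simplification of real token counting.

```python
class SketchSummaryBufferMemory:
    """Sketch: summary of old turns plus a raw buffer of recent ones."""

    def __init__(self, summarize_fn, max_token_limit=100):
        self.summarize = summarize_fn
        self.max_tokens = max_token_limit
        self.summary = ""
        self.buffer = []  # raw recent turns

    def _tokens(self, text):
        return len(text.split())  # crude stand-in for a real tokenizer

    def save_context(self, inputs, outputs):
        self.buffer.append(f"Human: {inputs['input']}\nAI: {outputs['output']}")
        # spill the oldest raw turns into the summary once over budget
        while len(self.buffer) > 1 and sum(self._tokens(t) for t in self.buffer) > self.max_tokens:
            self.summary = self.summarize(self.summary, self.buffer.pop(0))

    def load_memory_variables(self, inputs):
        return {"history": (self.summary + "\n" + "\n".join(self.buffer)).strip()}
```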

from langchain.chains.conversation.memory import ConversationSummaryBufferMemory

conversation_sumbuf = ConversationChain(
    llm=llm,
    memory=ConversationSummaryBufferMemory(llm=llm, max_token_limit=100),
    verbose=True
)
print(conversation_sumbuf.memory.prompt.template)
Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.

EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.

New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.

New summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.
END OF EXAMPLE

Current summary:
{summary}

New lines of conversation:
{new_lines}

New summary:
count_tokens(
    conversation_sumbuf,
    "hi good morning"
)
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: hi good morning
AI:

> Finished chain.
Spent a total of 75 tokens
Good morning! How are you today
count_tokens(
    conversation_sumbuf,
    "Please let me know how can I learn AI"
)
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: hi good morning
AI: Good morning! How are you today?
Human: Please let me know how can I learn AI
AI:

> Finished chain.
Spent a total of 527 tokens
Learning AI can be a fascinating journey! There are many resources available online such as online courses, tutorials, and books that can help you get started. Some popular platforms for learning AI include Coursera, Udemy, and edX. Additionally, you can also join AI communities and forums to connect with other learners and experts in the field. It's important to have a strong foundation in mathematics, particularly linear algebra and calculus, as well as programming languages like Python. Are there any specific areas of AI you are interested in learning more about?
count_tokens(
    conversation_sumbuf,
    "What is Machine Learning"
)
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human greets the AI and asks how to learn AI. The AI suggests various online resources and platforms for learning AI, emphasizing the importance of a strong foundation in mathematics and programming languages. The AI also encourages the human to connect with AI communities and forums to further their learning.
Human: What is Machine Learning
AI:

> Finished chain.
Spent a total of 477 tokens
Machine learning is a subset of artificial intelligence that focuses on developing algorithms and models that allow computers to learn from and make predictions or decisions based on data. It involves training a machine learning model on a dataset to recognize patterns and make predictions without being explicitly programmed to do so. There are different types of machine learning algorithms, such as supervised learning, unsupervised learning, and reinforcement learning, each with its own unique approach to learning from data.
count_tokens(
    conversation_sumbuf,
    "which cloud is best for AI"
)
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human greets the AI and asks how to learn AI. The AI suggests various online resources and platforms for learning AI, emphasizing the importance of a strong foundation in mathematics and programming languages. The AI also encourages the human to connect with AI communities and forums to further their learning. The human then asks the AI about Machine Learning.
AI: Machine learning is a subset of artificial intelligence that focuses on developing algorithms and models that allow computers to learn from and make predictions or decisions based on data. It involves training a machine learning model on a dataset to recognize patterns and make predictions without being explicitly programmed to do so. There are different types of machine learning algorithms, such as supervised learning, unsupervised learning, and reinforcement learning, each with its own unique approach to learning from data.
Human: which cloud is best for AI
AI:

> Finished chain.
Spent a total of 889 tokens
There are several popular cloud platforms that are commonly used for AI and machine learning projects. Some of the best cloud platforms for AI include Amazon Web Services (AWS) with services like Amazon SageMaker, Microsoft Azure with Azure Machine Learning, Google Cloud Platform with Google Cloud AI Platform, and IBM Watson. Each of these platforms offers a range of tools and services specifically designed for AI and machine learning tasks, so it ultimately depends on your specific needs and preferences. It's a good idea to explore each platform and see which one aligns best with your project requirements.
conversation_sumbuf.memory.buffer
System: The human greets the AI and asks how to learn AI. The AI suggests various online resources and platforms for learning AI, emphasizing the importance of a strong foundation in mathematics and programming languages. The AI also encourages the human to connect with AI communities and forums to further their learning. The human then asks the AI about Machine Learning, to which the AI explains that it is a subset of artificial intelligence focusing on developing algorithms and models for computers to learn from data. The human inquires about the best cloud platform for AI, and the AI lists popular options like Amazon Web Services, Microsoft Azure, Google Cloud Platform, and IBM Watson, highlighting the need to choose based on project requirements.

When we apply this to our earlier conversation, we can assign a small value to max_token_limit and the LLM can still recall our earlier “aim.”

This is because that information is preserved by the “summarization” component of the memory, even though it was dropped by the “buffer window” component.

Naturally, the advantages and disadvantages of this memory type are a blend of those of the components it is built on.

While it demands extra tuning of what to summarize and what to retain in the buffer window, ConversationSummaryBufferMemory offers ample flexibility. It is the only memory type in our arsenal (so far) capable of recalling distant interactions while preserving the most recent interactions in their raw, and therefore most information-rich, format.
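The hybrid behavior described above can be illustrated with a small, self-contained sketch in plain Python. This is not LangChain’s actual implementation, just the idea: recent turns are kept verbatim in a buffer, and once a token budget is exceeded, older turns are evicted into a summarized history. Here `summarize` is a hypothetical stand-in for the LLM call LangChain would make, and the “token” count is crudely approximated by whitespace-split words.

```python
def summarize(turns):
    # Hypothetical stand-in: in LangChain, the LLM produces this summary
    return "System: Summary of %d earlier turns." % len(turns)

class SummaryBufferMemory:
    """Toy sketch of the summary-buffer idea: recent turns verbatim,
    older turns folded into a running summary once the budget is hit."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.buffer = []    # recent turns, kept raw
        self.evicted = []   # older turns, represented only by their summary

    def add(self, turn):
        self.buffer.append(turn)
        # Crude token count: whitespace-split words; evict oldest turns
        # until the raw buffer fits the budget (always keep at least one)
        while (sum(len(t.split()) for t in self.buffer) > self.max_tokens
               and len(self.buffer) > 1):
            self.evicted.append(self.buffer.pop(0))

    def context(self):
        # What would be injected into the prompt: summary first, then raw turns
        parts = []
        if self.evicted:
            parts.append(summarize(self.evicted))
        parts.extend(self.buffer)
        return "\n".join(parts)

memory = SummaryBufferMemory(max_tokens=6)
memory.add("Human: my aim is to learn AI")
memory.add("AI: That is a great goal")
print(memory.context())
```

With a budget of 6 “tokens,” the first turn is evicted into the summary while the latest turn survives verbatim, mirroring how the summarization component remembers what the buffer window forgets.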

Below is a comparison of the number of tokens consumed by each memory type under different parameters (credit for the comparison goes to Pinecone):

image credit: https://www.pinecone.io/learn/series/langchain/langchain-conversational-memory/

Putting it all together: building a chatbot with ConversationBufferMemory and ConversationChain:

from flask import Flask, request, render_template, jsonify
import re

from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.llms import GPT4All
#from langchain_community.llms import HuggingFaceHub
#from data.dataprovider import key, hg_key


app = Flask(__name__)

# Conversation memory shared across requests: stores the full chat history verbatim
memory = ConversationBufferMemory()

# Load a local model through GPT4All
#llm = GPT4All(model=r'C:\Users\91941\.cache\gpt4all\orca-mini-3b-gguf2-q4_0.gguf')
llm = GPT4All(model=r'C:\Users\91941\.cache\gpt4all\mistral-7b-openorca.gguf2.Q4_0.gguf')

# Alternative: a hosted model via the Hugging Face Hub
"""llm = HuggingFaceHub(
    repo_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    model_kwargs={
        "max_new_tokens": 512,
        "top_k": 30,
        "temperature": 0.1,
        "repetition_penalty": 1.03,
    },
    huggingfacehub_api_token=hg_key,  # Replace with your actual Hugging Face token
)"""


# Define the chatbot function
def chat_with_rag(message):
    conversation = ConversationChain(llm=llm, memory=memory, verbose=True)
    result = conversation(message)
    return result


# Define the Flask routes
@app.route('/')
def home():
    return render_template('bot_1.html')


@app.route('/chat', methods=['POST'])
def chat():
    user_message = request.form['user_input']
    bot_message = chat_with_rag(user_message)
    print(bot_message)

    # The chain returns the full transcript; extract only the latest AI reply
    pattern = r"AI: ([\s\S]+?)(?=Human:|$)"
    matches = re.findall(pattern, bot_message['response'])

    if matches:
        last_ai_message = matches[-1].strip()
        return jsonify({'response': last_ai_message})
    else:
        return jsonify({'response': bot_message['response']})


if __name__ == '__main__':
    app.run(debug=True)
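The regex in `chat()` pulls only the most recent AI reply out of the chain’s raw transcript. A quick standalone sanity check of that pattern (the sample transcript below is made up for illustration, not actual model output):

```python
import re

# A made-up transcript interleaving Human and AI turns
raw = (
    "Human: how to learn AI\n"
    "AI: Start with math and Python basics.\n"
    "Human: which cloud is best for AI\n"
    "AI: AWS, Azure, and GCP all offer managed ML services.\n"
)

# Same pattern as in chat(): lazily capture everything after "AI: "
# up to the next "Human:" turn or the end of the string
pattern = r"AI: ([\s\S]+?)(?=Human:|$)"
matches = re.findall(pattern, raw)

last_ai_message = matches[-1].strip()
print(last_ai_message)
# -> AWS, Azure, and GCP all offer managed ML services.
```

Taking `matches[-1]` ensures the endpoint returns only the newest answer, rather than the whole accumulated conversation that ConversationBufferMemory feeds back into the prompt.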

GitHub:

Conclusion:

In this blog, we’ve provided an in-depth exploration of the LangChain Memory module for developing chatbot applications with conversation history. We discussed four types of LangChain memory: ConversationBufferMemory, ConversationBufferWindowMemory, ConversationTokenBufferMemory, and ConversationSummaryMemory. Additionally, we developed a chatbot using the open-source LLM Mistral-7B, ConversationBufferMemory, ConversationChain, and Flask.

However, our journey doesn’t end here. Stay tuned for our next blog, where we’ll dive into the innovative RAG Graph and RAG Agents, offering further insights into refining and optimizing RAG models for superior performance.

Join us as we continue to uncover the potential of these cutting-edge methodologies in reshaping the landscape of Generative AI.

My other blogs:

1. Creating a Chatbot Using Open-Source LLM and RAG Technology with Lang Chain and Flask

2. Build a Chatbot with Advance RAG System: with LlamaIndex, OpenSource LLM, Flask and LangChain

3. How to build Chatbot with advance RAG system with by using LlamaIndex and OpenSource LLM with Flask…

Reference:

  1. https://www.pinecone.io/learn/series/langchain/langchain-conversational-memory/
  2. https://python.langchain.com/v0.1/docs/modules/memory/
  3. https://v01.api.js.langchain.com/classes/langchain_memory.ConversationSummaryMemory.html
