Document-Based LLM-Powered Chatbot
Use an LLM with LangChain and Chroma for vector search to build a context/document-based chatbot
Document-based LLM-powered chatbots are the new trend in the world of conversational interfaces. With advances in natural language processing (NLP) and tooling such as LangChain, Chroma, vector search, and OpenAI GPT-3, chatbots can now interpret human language more accurately and effectively than ever before.
Chatbots are automated systems that interact with users through a chat interface. They can be used for various purposes, such as customer service, information retrieval, and even personal assistants. However, their effectiveness depends on the accuracy of their language understanding and response generation capabilities.
Project link: https://github.com/Abonia1/Context-Based-LLMChatbot
Technical Architecture
Document-based LLM-powered chatbots solve this problem by using machine learning algorithms to analyze and understand the context of a conversation. They can also retrieve relevant information from large databases or documents, making them more effective at answering complex questions.
LangChain, Chroma and GPT-3
Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app — the real power comes when you can combine them with other sources of computation or knowledge. The LangChain library is aimed at assisting in the development of those types of applications. With LangChain, chatbots can analyze the meaning of a user’s question and provide a relevant and accurate response.
pip install langchain
Official documentation: https://langchain.readthedocs.io/en/latest/
Chroma is a vector store and embeddings database designed from the ground up to make it easy to build AI applications with embeddings. It comes with everything you need to get started built in and runs on your machine — just
pip install chromadb
Official documentation: https://github.com/chroma-core/chroma
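To get a quick feel for the library, here is a minimal sketch using Chroma's native client API with its default embedding function (this is separate from the LangChain wrapper used later in this post; the documents and ids are just illustrative):
import chromadb

# Create an in-memory Chroma client and a collection (uses Chroma's default embedding function)
client = chromadb.Client()
collection = client.create_collection(name="docs")

# Add a couple of example documents with unique ids
collection.add(
    documents=["LangChain helps compose LLM applications.",
               "Chroma stores embeddings for similarity search."],
    ids=["doc1", "doc2"],
)

# Query with natural-language text; the most similar document is returned
results = collection.query(query_texts=["Which tool stores embeddings?"], n_results=1)
print(results["documents"])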
Vector search is a retrieval technique that finds relevant information in a large collection by comparing embedding vectors. With vector search, chatbots can quickly retrieve relevant passages from large documents, making them more effective at answering complex questions.
OpenAI GPT-3 is a natural language processing model that can generate human-like responses to user queries. It is trained on a vast corpus of text data and can understand the nuances of language, making it ideal for chatbot applications.
Official documentation: https://platform.openai.com/docs/models/gpt-3
Why incorporate vector search
- To avoid hallucination: Hallucination is an issue that large language models (LLMs) such as GPT-3 face. It occurs when the model generates text that is not grounded in reality or is inconsistent with the input — in other words, text that is not supported by the context or the data it was trained on.
- Contextualization: Embeddings help to identify and contextualize the intent behind a user’s query, providing more relevant and personalized responses. Vector search captures the semantic meaning of words and phrases, allowing for more accurate matches between user queries and relevant responses.
Embeddings
Using the Embeddings API and fine-tuning are both techniques to train GPT-3 on separate data, but they serve different purposes and involve different types of training methods.
In the field of natural language processing, embeddings are a way to represent words, phrases, or documents as numerical vectors that capture their meaning and context.
With these learned embeddings from an additional body of knowledge, we can then construct prompts that provide additional context and respond based on that input.
In short, if you have a large body of text, for example, you want to train GPT-3 on a textbook, legal documents, or any other additional body of knowledge, the Embeddings API is the way to go.
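As a minimal sketch of what an embedding looks like in practice (assuming an OpenAI API key is available in the environment; the query string is just an example), LangChain's OpenAIEmbeddings turns a piece of text into a dense vector of floats:
from langchain.embeddings.openai import OpenAIEmbeddings

# Assumes the OPENAI_API_KEY environment variable is set
embeddings = OpenAIEmbeddings()

# Embed a single query string; the result is a list of floats capturing its meaning
vector = embeddings.embed_query("What is the candidate's work experience?")
print(len(vector))  # dimensionality of the embedding vector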
Similarity Search
When a user submits a query, Chroma creates an embedding vector for the query text and compares it to the vectors for the documents in the corpus using cosine similarity. The most similar documents are returned as the search results, ordered by their similarity scores.
Cosine similarity — This method measures the cosine of the angle between two vectors, which indicates how similar they are in direction. Cosine similarity ranges from -1 to 1, with 1 indicating perfect similarity.
The formula for cosine similarity is:
similarity(A, B) = (A . B) / (||A|| ||B||)
Where A and B are the two vectors being compared, . is the dot product of the vectors, and || || represents the Euclidean norm (magnitude) of the vectors.
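For intuition, here is a small sketch that computes this formula for two toy vectors (it uses NumPy purely for illustration; the chatbot itself relies on Chroma to do this comparison):
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # (A . B) / (||A|| * ||B||)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
print(cosine_similarity(a, b))  # 1.0 — the vectors point in the same direction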
Implementation
Let's dive into the implementation. Import the necessary libraries:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain import OpenAI, VectorDBQA
from langchain.document_loaders import DirectoryLoader
from langchain.prompts import PromptTemplate
from langchain.chains.question_answering import load_qa_chain
import config
import logging

# Module-level logger used by the answering code below
LOGGER = logging.getLogger(__name__)
The code loads a directory of PDF documents and splits them into chunks of text. It then uses the OpenAI API to embed the text chunks into vectors, and creates a Chroma vector store from the embedded text.
# Load documents from the specified directory using a DirectoryLoader object
loader = DirectoryLoader(config.FILE_DIR, glob='*.pdf')
documents = loader.load()
# Split the documents into chunks of size 1000 using a CharacterTextSplitter object
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
# Create a vector store from the chunks using an OpenAIEmbeddings object and a Chroma object
embeddings = OpenAIEmbeddings(openai_api_key=config.OPENAI_API_KEY)
docsearch = Chroma.from_documents(texts, embeddings)
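The answer function below also accepts a persist_directory argument. As an optional variant (a sketch whose exact calls depend on the Chroma wrapper in the LangChain version used here), the vector store can be written to disk so the documents do not have to be re-embedded on every run:
# Optional: persist the vector store to disk so embeddings can be reused across runs
docsearch = Chroma.from_documents(texts, embeddings, persist_directory=config.PERSIST_DIR)
docsearch.persist()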
The answer function takes the prompt as input and uses it to build a prompt template, which is then used to load a question-answering (QA) chain. The code selects the top k text chunks from the Chroma vector store to answer the user's query and uses the QA chain to find the answer, which is returned as the output of the answer function.
Once the VectorDBQA object is created, it can be called with a query to generate an answer. This is done with result = qa({"query": prompt}), where prompt is the user's query. The resulting result object is a dictionary containing information about the generated answer, including the answer itself, which is extracted with answer = result["result"].
# Define a function named 'answer' that takes a string prompt and an optional directory path
# for persisting data. The function returns a string that represents the answer to the prompt.
def answer(prompt: str, persist_directory: str = config.PERSIST_DIR) -> str:
    # Log a message indicating that the function has started
    LOGGER.info(f"Start answering based on prompt: {prompt}.")
    # Create a prompt template using a template from the config module and input variables
    # representing the context and question.
    prompt_template = PromptTemplate(
        template=config.prompt_template, input_variables=["context", "question"]
    )
    # Load a QA chain using an OpenAI object, a chain type, and a prompt template.
    doc_chain = load_qa_chain(
        llm=OpenAI(
            openai_api_key=config.OPENAI_API_KEY,
            model_name="text-davinci-003",
            temperature=0,
            max_tokens=300,
        ),
        chain_type="stuff",
        prompt=prompt_template,
    )
    # Log the number of chunks to be considered when answering the user's query.
    LOGGER.info(f"The top {config.k} chunks are considered to answer the user's query.")
    # Create a VectorDBQA object using a vector store, a QA chain, and a number of chunks to consider.
    qa = VectorDBQA(vectorstore=docsearch, combine_documents_chain=doc_chain, k=config.k)
    # Call the VectorDBQA object to generate an answer to the prompt.
    result = qa({"query": prompt})
    answer = result["result"]
    # Log the generated answer and return it.
    LOGGER.info(f"The returned answer is: {answer}")
    LOGGER.info("Answering module over.")
    return answer
The code logs several messages using the Python logging module, and uses the config
module to specify configuration parameters such as file paths, API keys, and the number of text chunks to consider for answering the user's query.
OPENAI_API_KEY = "YOUR-OPENAI-API-KEY" # replace with your actual OpenAI API key
PERSIST_DIR = "vectorstore" # replace with the directory where you want to store the vectorstore
LOGS_FILE = "logs/log.log" # replace with the path where you want to store the log file
FILE = "doc/CV.pdf" # replace with the path where you have your documents
FILE_DIR = "doc/"
prompt_template = """You are a personal Bot assistant for answering any questions about documents of Abonia Sojasingarayar.
You are given a question and a set of documents.
If the user's question requires you to provide specific information from the documents, give your answer based only on the examples provided below. DON'T generate an answer that is NOT written in the provided examples.
If you don't find the answer to the user's question in the examples provided below, answer that you didn't find the answer in the documentation and propose that the user rephrase the query with more details.
Use bullet points if you have to make a list, only if necessary.
QUESTION: {question}
DOCUMENTS:
=========
{context}
=========
Finish by proposing your help for anything else.
"""
k = 4 # number of chunks to consider when generating answer
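To tie things together, here is a minimal sketch (not part of the original code) of how one might configure logging with LOGS_FILE and call the answer function from the chat module; the question string is just a hypothetical example:
import logging

import chat
import config

# Write log messages to the file defined in config
logging.basicConfig(filename=config.LOGS_FILE, level=logging.INFO)

# Hypothetical example query against the indexed documents
print(chat.answer("What are Abonia's main skills?"))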
Combining all of these technologies, document-based LLM-powered chatbots can analyze the meaning of a user’s question, retrieve relevant information from large databases or documents, and generate human-like responses. This makes them a powerful tool for customer service, information retrieval, and even personal assistants.
To build the demo app, use the code below:
import chat
import streamlit as st
from streamlit_chat import message
#Creating the chatbot interface
st.title("LLM-Powered Chatbot for Intelligent Conversations")
st.subheader("AVA-Abonia Virtual Assistant")
# Storing the chat
if 'generated' not in st.session_state:
    st.session_state['generated'] = []
if 'past' not in st.session_state:
    st.session_state['past'] = []

# Define a function to clear the input text
def clear_input_text():
    global input_text
    input_text = ""

# We will get the user's input by calling the get_text function
def get_text():
    global input_text
    input_text = st.text_input("Ask your Question", key="input", on_change=clear_input_text)
    return input_text

def main():
    user_input = get_text()
    if user_input:
        output = chat.answer(user_input)
        # Store the question and the generated answer in the conversation history
        st.session_state.past.append(user_input)
        st.session_state.generated.append(output)
    if st.session_state['generated']:
        # Display the conversation, most recent exchange first
        for i in range(len(st.session_state['generated']) - 1, -1, -1):
            message(st.session_state["generated"][i], key=str(i))
            message(st.session_state['past'][i], is_user=True, key=str(i) + '_user')

# Run the app
if __name__ == "__main__":
    main()
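Assuming the snippet above is saved as, say, app.py (the filename here is just an example), the demo can be launched locally with Streamlit:
streamlit run app.py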
Find complete code in Github : https://github.com/Abonia1/Context-Based-LLMChatbot
Real-Time Use Cases:
- Customer support: It can be used to provide customer support and assistance 24/7. Document-based search can be used to provide relevant answers to customer queries by searching through a database of product manuals, FAQs, and other relevant documents.
- E-commerce: It can be used to assist customers in finding products and making purchases. Document-based search can be used to search through product descriptions and reviews to provide relevant product recommendations.
- Healthcare: It can be used to provide patients with medical advice and help them schedule appointments. Document-based search can be used to search through medical records and research papers to provide relevant information to patients and healthcare professionals.
- Education: It can be used to assist students in finding information and answering their queries. Document-based search can be used to search through textbooks and other educational materials to provide relevant information to students.
- Human resources: It can be used to assist employees in finding information and answering their queries related to HR policies, benefits, and other HR-related matters. Document-based search can be used to search through employee manuals, company policies, and other relevant documents.
- Legal: It can be used to assist clients in finding legal information and answering their queries. Document-based search can be used to search through legal documents and case law to provide relevant information to clients and legal professionals.
Conclusion:
Document-based LLM-powered chatbots are the future of conversational interfaces. With advancements in NLP technologies such as LLMs, vector search, and serverless GPUs, chatbots can now interpret human language more accurately and effectively than ever before. As these technologies continue to evolve, we can expect chatbots to become even more intelligent and useful in our daily lives.
Connect with me on Linkedin
Find me on Github
Visit my technical channel on Youtube
Support: Buy me a Coffee/Chai