HR Chatbot with LangChain and Chainlit

Reflections on AI · Jul 18, 2023

In this story we will see how you can create a human resources chatbot using LangChain and Chainlit. This chatbot answers questions about employee-related policies on topics such as maternity leave, hazard reporting, training, code of conduct and more.

Since this project was created as a POC, we decided to give it a spin by hijacking the prompts used in this tool so that the chatbot tells the occasional joke while answering questions. The chatbot therefore has a humorous spin and some sort of “personality”.

About Chainlit

Chainlit Logo

Chainlit is an open-source Python/TypeScript library that allows developers to create ChatGPT-like user interfaces quickly. It allows you to create a chain of thought and then add a pre-built, configurable chat user interface to it. It is excellent for web-based chatbots.

Chainlit is much better suited for this task than Streamlit, which requires much more work to configure the UI components.

The source code of the Chainlit library is here:

Chainlit has two main components:

  • back-end: Python based; it lets you interact with libraries like LangChain, LlamaIndex and LangFlow.
  • front-end: a TypeScript-based React application using Material UI components.
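
To give a feel for how little code the front end needs, here is a minimal sketch of a Chainlit echo bot. It assumes the Chainlit version available at the time of writing (mid 2023), where the message handler receives the raw user input as a string:

import chainlit as cl

@cl.on_message
async def main(message: str):
    # Echo the user input back; a real application would call a LangChain chain here.
    await cl.Message(content=f"You said: {message}").send()

You would save this as app.py and start it with chainlit run app.py.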

A Quick Tour of Our HR Chatbot

Our Chatbot has a UI which initially looks like this in light mode:

Initial Chatbot UI

You can then type your question and the result is shown below (using dark mode):

Question on normal working hours

The UI shows you not only the question and the answer, but also the source files. If the source text was found, the PDF files are also clickable and you can view their content.

If you expand the steps of the LangChain chain this is what you see:

Thought chain representation

The user interface (UI) also has a searchable history:

Searchable history

You can easily switch between light and dark mode too:

Open settings
Dark mode switch

An HR Chatbot that Can Tell Jokes

We have manipulated the chatbot to tell jokes, especially when it cannot answer a question based on the available knowledge base. So you might see responses like this one:

Joke when no source found

Not a great joke, but nevertheless …

Chain Workflow

There are two workflows in this application:

  • setup workflow — used to set up the vector database (in our case FAISS) representing a collection of text documents
  • user interface workflow — the thought chain interactions

Setup Workflow

You can visualize the setup using the following BPMN diagram:

Setup workflow

These are the setup workflow steps:

  • The code starts by listing all PDF documents in a folder.
  • The text of each page of the documents is extracted.
  • The text is sent to the OpenAI embeddings API.
  • A collection of embeddings is retrieved.
  • The result is accumulated in memory.
  • The collection of accumulated embeddings is persisted to disk.
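
As a rough sketch (the actual project functions are shown in the walkthrough below), the whole setup boils down to a handful of LangChain calls. The folder names here are placeholders:

from pathlib import Path
from langchain.document_loaders import PyPDFium2Loader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

pdf_folder = Path("./hr_policies")   # placeholder: folder containing the PDF files
index_dir = "./faiss_index"          # placeholder: FAISS persistence directory

# Extract one document per PDF page.
pages = []
for pdf in pdf_folder.glob("*.pdf"):
    pages.extend(PyPDFium2Loader(str(pdf)).load_and_split())

# Send the text to the OpenAI embeddings API and accumulate the vectors in FAISS.
vector_store = FAISS.from_documents(pages, OpenAIEmbeddings(chunk_size=100))

# Persist the accumulated embeddings to disk.
vector_store.save_local(index_dir)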

User interface workflow

This is the user interface workflow:

User interface workflow

Here are the user workflow steps:

  • The user asks a question
  • A similarity search query is executed against the vector database (in our case FAISS — Facebook AI Similarity Search)
  • The vector database typically returns up to 4 documents (this application requests 5 via its search_results setting).
  • The returned documents are sent as context to ChatGPT (model: gpt-3.5-turbo-16k) together with the question.
  • ChatGPT returns the answer
  • The answer gets displayed on the UI.
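
Stripped of memory and the humorous prompt (both covered below), that retrieval loop can be sketched roughly like this, assuming the FAISS index persisted by the setup workflow:

from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chat_models import ChatOpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Load the persisted FAISS index (the path is a placeholder).
vector_store = FAISS.load_local("./faiss_index", OpenAIEmbeddings(chunk_size=100))

# The retriever runs the similarity search; k is the number of documents used as context.
retriever = vector_store.as_retriever(search_kwargs={"k": 4})

chain = RetrievalQAWithSourcesChain.from_chain_type(
    ChatOpenAI(model="gpt-3.5-turbo-16k", temperature=0),
    chain_type="stuff",
    retriever=retriever,
)

result = chain({"question": "What are the normal working hours?"})
print(result["answer"], result["sources"])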

Chatbot Code Installation and Walkthrough

The whole chatbot code can be found in this repository:

Installation

After cloning the repository you will need to have Conda installed and then run the following commands to install the necessary libraries:

conda create -n langchain_chainlit python=3.11
conda activate langchain_chainlit
pip install langchain
pip install python-dotenv
pip install openai
pip install faiss-cpu
pip install tiktoken
pip install chainlit
pip install pdfminer
pip install pypdfium2
pip install prompt_toolkit

Then you also need to create a .env file with the following key-value pairs:

OPENAI_API_KEY=<open ai key>
DOC_LOCATION=<absolute path of the pdf files>
FAISS_STORE=<Location of the FAISS internal files>
HUMOUR=<true|false>

Make sure that you have some PDF files in the DOC_LOCATION folder.

Running the application

The command to run the application is this one:

chainlit run hr_chatbot_chainlit.py --port 8081

Code Walkthrough

Most of the application's configuration parameters are set in this file:

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI
from pathlib import Path
import os

from dotenv import load_dotenv
load_dotenv()

class Config():
    faiss_persist_directory = Path(os.environ['FAISS_STORE'])
    if not faiss_persist_directory.exists():
        faiss_persist_directory.mkdir()
    embeddings = OpenAIEmbeddings(chunk_size=100)
    model = 'gpt-3.5-turbo-16k'
    # model = 'gpt-4'
    llm = ChatOpenAI(model=model, temperature=0)
    search_results = 5

In this file we set the FAISS persistence directory, the type of embeddings (“text-embedding-ada-002”, the default option) and the model (“gpt-3.5-turbo-16k”).
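
The rest of the walkthrough refers to this configuration through a module-level cfg object. The article does not show that line, but presumably it is instantiated once along these lines:

# Assumed instantiation of the configuration object referenced as cfg below.
cfg = Config()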

The text is extracted and the embeddings are processed in this file:

This is the function which extracts the PDF text per page with the source file and page metadata:

def load_pdfs(path: Path) -> List[Document]:
    """
    Loads the PDFs and extracts a document per page.
    The page details are added to the extracted metadata

    Parameters:
    path (Path): The path where the PDFs are saved.

    Returns:
    List[Document]: Returns a list of values
    """
    assert path.exists()
    all_pages = []
    for pdf in path.glob("*.pdf"):
        loader = PyPDFium2Loader(str(pdf.absolute()))
        pages: List[Document] = loader.load_and_split()
        for i, p in enumerate(pages):
            file_name = re.sub(r".+[\\/]", '', p.metadata['source'])
            p.metadata['source'] = f"{file_name} page {i + 1}"
        all_pages.extend(pages)
        logger.info(f"Processed {pdf}, all_pages size: {len(all_pages)}")
    log_stats(all_pages)
    return all_pages

And this is the function which generates the embeddings:

def generate_embeddings(documents: List[Document], path: Path) -> VST:
    """
    Receives a list of documents and generates the embeddings via OpenAI API.

    Parameters:
    documents (List[Document]): The document list with one page per document.
    path (Path): The path where the documents are found.

    Returns:
    VST: Returns a reference to the vector store.
    """
    try:
        docsearch = FAISS.from_documents(documents, cfg.embeddings)
        docsearch.save_local(cfg.faiss_persist_directory)
        logger.info("Vector database persisted")
    except Exception as e:
        logger.error(f"Failed to process {path}: {str(e)}")
        if 'docsearch' in vars() or 'docsearch' in globals():
            docsearch.persist()
            return docsearch
    return docsearch

The file which then initializes the vector store and creates a LangChain question answer chain is this one:

The most important functions in this file are:

def load_embeddinges() -> Tuple[VST, List[Document]]:
    """
    Loads the PDF documents to support text extraction in the Chainlit UI.
    In case there are no persisted embeddings, the embeddings are generated.
    In case the embeddings are persisted, then they are loaded from the file system.

    Returns:
    Tuple[VST, List[Document]]: Returns a reference to the vector store and the list of all pdf files.
    """
    embedding_dir = cfg.faiss_persist_directory
    logger.info(f"Checking: {embedding_dir}")
    doc_location: str = os.environ["DOC_LOCATION"]
    documents = load_pdfs(Path(doc_location))
    assert len(documents) > 0
    if embedding_dir.exists() and len(list(embedding_dir.glob("*"))) > 0:
        logger.info(f"reading from existing directory: {embedding_dir}")
        docsearch = FAISS.load_local(embedding_dir, cfg.embeddings)
        return docsearch, documents
    return generate_embeddings(documents, doc_location), documents

This function loads the PDF documents to support text extraction in the Chainlit UI. If there are no persisted embeddings, they are generated; if they are already persisted, they are loaded from the file system.

This strategy avoids calling the embedding API too often, thus saving money.

And this is the function which loads the QA chain:

def create_retrieval_chain(docsearch: VST, verbose: bool = False, humour: bool = True) -> RetrievalQAWithSourcesChain:
    """
    Creates the QA chain with memory and if humour is true with a manipulated prompt that tends to create jokes on certain occasions.

    Parameters:
    docsearch (VST): A reference to the vector store.
    verbose (bool): Determines whether LangChain's internal logging is printed to the console or not.
    humour (bool): Determines whether the prompt for answers with jokes is used or not.

    Returns:
    RetrievalQAWithSourcesChain: The QA chain
    """
    memory = KeySourceMemory(llm=cfg.llm, input_key='question', output_key='answer')
    chain_type_kwargs = {}
    if verbose:
        chain_type_kwargs['verbose'] = True
    if humour:
        chain_type_kwargs['prompt'] = HUMOUR_PROMPT
    search_retriever: VectorStoreRetriever = docsearch.as_retriever()
    search_retriever.search_kwargs = {'k': cfg.search_results}
    qa_chain = RetrievalQAWithSourcesChain.from_chain_type(
        cfg.llm,
        retriever=search_retriever,
        chain_type="stuff",
        memory=memory,
        chain_type_kwargs=chain_type_kwargs
    )

    return qa_chain

This function creates the QA chain with memory. If the humour parameter is true, a manipulated prompt that tends to produce jokes on certain occasions is used.

We had to subclass the ConversationSummaryBufferMemory class so that the memory does not throw an error about a missing key.
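
The resulting KeySourceMemory class is not listed in the article; a hedged sketch of such a subclass, assuming the error is caused by the chain returning both an "answer" and a "sources" key, could look like this:

from langchain.memory import ConversationSummaryBufferMemory

class KeySourceMemory(ConversationSummaryBufferMemory):
    """Memory subclass that stores only the configured output key and ignores the extra 'sources' key."""

    def save_context(self, inputs: dict, outputs: dict) -> None:
        # The QA chain returns both 'answer' and 'sources'; keep only the key this memory
        # was configured with so the base class does not raise a missing/ambiguous key error.
        if self.output_key is not None:
            outputs = {self.output_key: outputs.get(self.output_key, "")}
        super().save_context(inputs, outputs)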

Here is an extract of the prompt text:

Given the following extracted parts of a long document and a question, create a final answer with references (“SOURCES”). If you know a joke about the subject, make sure that you include it in the response.

If you don’t know the answer, say that you don’t know and make up some joke about the subject. Don’t try to make up an answer.

ALWAYS return a “SOURCES” part in your answer.
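
The article does not show how this text is wrapped into a prompt object. A hedged reconstruction using LangChain's PromptTemplate, with the placeholders expected by the "stuff" chain with sources, might look like this (the exact template text and variable layout are assumptions):

from langchain.prompts import PromptTemplate

humour_template = """Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). If you know a joke about the subject, make sure that you include it in the response.
If you don't know the answer, say that you don't know and make up some joke about the subject. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.

QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER:"""

# 'summaries' receives the retrieved document chunks, 'question' the user question.
HUMOUR_PROMPT = PromptTemplate(template=humour_template, input_variables=["summaries", "question"])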

The actual part of the code related to ChainLit is in this file:

The Chainlit library works with Python decorators, and the initialization of the LangChain QA chain is done inside a function decorated with:

@cl.langchain_factory(use_async=True)

Here is the function:

@cl.langchain_factory(use_async=True)
async def init():
    """
    Loads the vector data store object and the PDF documents. Creates the QA chain.
    Sets up some session variables and removes the Chainlit footer.

    Returns:
    RetrievalQAWithSourcesChain: The QA chain
    """
    msg = cl.Message(content=f"Processing files. Please wait.")
    await msg.send()
    docsearch, documents = load_embeddinges()

    humour = os.getenv("HUMOUR") == "true"

    chain: RetrievalQAWithSourcesChain = create_retrieval_chain(docsearch, humour=humour)
    metadatas = [d.metadata for d in documents]
    texts = [d.page_content for d in documents]
    cl.user_session.set(KEY_META_DATAS, metadatas)
    cl.user_session.set(KEY_TEXTS, texts)
    remove_footer()
    await msg.update(content=f"You can now ask questions about Onepoint HR!")
    return chain

This function loads the vector store using load_embeddinges(). The QA chain is then initialized by the function create_retrieval_chain and returned.

The last function converts the LangChain result dictionary to a Chainlit Message object. Most of the code in this function tries to extract the sources, which sometimes come in unexpected formats in the text:

@cl.langchain_postprocess
async def process_response(res) -> cl.Message:
    """
    Tries to extract the sources and corresponding texts from the sources.

    Parameters:
    res (dict): A dictionary with the answer and sources provided by the LLM via LangChain.

    Returns:
    cl.Message: The message containing the answer and the list of sources with corresponding texts.
    """
    answer = res["answer"]
    sources = res["sources"].strip()
    source_elements = []

    # Get the metadata and texts from the user session
    metadatas = cl.user_session.get(KEY_META_DATAS)
    all_sources = [m["source"] for m in metadatas]
    texts = cl.user_session.get(KEY_TEXTS)

    found_sources = []
    if sources:
        logger.info(f"sources: {sources}")
        raw_sources, file_sources = source_splitter(sources)
        for i, source in enumerate(raw_sources):
            try:
                index = all_sources.index(source)
                text = texts[index]
                source_name = file_sources[i]
                found_sources.append(source_name)
                # Create the text element referenced in the message
                logger.info(f"Found text in {source_name}")
                source_elements.append(cl.Text(content=text, name=source_name))
            except ValueError:
                continue
        if found_sources:
            answer += f"\nSources: {', '.join(found_sources)}"
        else:
            answer += f"\n{sources}"

    await cl.Message(content=answer, elements=source_elements).send()

Key Takeaways

Chainlit is part of the growing LangChain ecosystem and allows you to build nice-looking web-based chat applications really quickly. It has some customization options, such as quick integration with authentication platforms or data persistence.

However, we found it a bit difficult to remove the “Built with Chainlit” footer note and ended up doing so in a rather “hacky” way, which is probably not very clean. At this point in time it is not really clear how deep UI customization can be done without creating a fork or using dirty hacks.

Another problem we faced was how to reliably interpret the LLM output, especially how to extract the sources from the reply. In spite of the prompt telling the LLM to:

create a final answer with references (“SOURCES”)

the LLM does not always do that, which makes the source extraction somewhat unreliable.

On the plus side, LLMs allow you to create chatbots with a flavour if you are willing to change the prompt in creative ways. The jokes delivered by the HR assistant are not that great, but they prove the point that you can create “flavoured” AI assistants that will eventually be more engaging and more fun for end users.
