Building a Conversational Chat Interface with Streamlit and LangChain for CSVs

Anoop Johny
13 min read · Aug 18, 2023

Introduction

Are you intrigued by the world of conversational AI and natural language processing? Look no further! This Streamlit app is designed to showcase the capabilities of a conversational chat interface driven by a sophisticated language model and a powerful retrieval-based system. With this app, you can seamlessly engage in interactive conversations with the model, all centered around a given CSV dataset.

Language models have fundamentally changed the way we consume structured information.

User Note: Before diving in, make sure you have downloaded the Llama 2 7B chat model and placed it in the project directory; the code below loads it by the relative file name llama-2-7b-chat.ggmlv3.q8_0.bin. You can access the model here. (Please note that the model file is around 6 GB, which makes it impractical to include in the repository.)

In this tutorial, we will walk through the process of creating a conversational chat interface using the Streamlit library and LangChain, a Python library for working with language models and embeddings. We will use a pre-trained language model, embeddings, and a retrieval chain to enable a dynamic and context-preserving chat experience. By the end of this tutorial, you’ll have a functional conversational chat interface that can be integrated into various applications.

Features at a Glance

  • Engage in dynamic and interactive conversations with the language model.
  • Experience retrieval-based responses using powerful embeddings and a streamlined FAISS index.
  • Seamlessly integrate various language models like Llama, Vicuna, Alpaca, and more.
  • Preserve the context of ongoing conversations in the history.

Setting Up the Environment

Before we dive into the code, let’s set up our development environment by installing the required libraries. Make sure you have Python and pip installed, and then run the following commands:

pip install streamlit
pip install streamlit-chat # For chat interface components
pip install langchain

Required Library files

Your requirements.txt should list the following packages for this program (add streamlit and streamlit-chat as well if you want a single file that installs everything):

pypdf
langchain
torch
accelerate
bitsandbytes
transformers
sentence_transformers
faiss-cpu
ctransformers
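Before launching the app, it can help to confirm that the packages actually import. The snippet below is a minimal sketch using only the standard library; the helper name missing_packages and the mapping are illustrative assumptions, and the mapping exists because some pip names differ from their import names (faiss-cpu is imported as faiss, streamlit-chat as streamlit_chat).

```python
from importlib.util import find_spec

# Map pip package names to the module names you actually import.
# The selection here is an assumption based on the imports used in this article.
REQUIRED_MODULES = {
    "streamlit": "streamlit",
    "streamlit-chat": "streamlit_chat",
    "langchain": "langchain",
    "sentence_transformers": "sentence_transformers",
    "faiss-cpu": "faiss",
    "ctransformers": "ctransformers",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


missing = missing_packages()
print("Missing packages:", ", ".join(missing) if missing else "none")
```

Anything reported as missing can then be installed with pip before running the app.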

An overview of each of the required libraries is as follows:

Streamlit

Streamlit is an open-source Python library that allows you to create interactive and data-driven web applications with minimal effort. It is particularly popular among data scientists and developers for its simplicity and ability to turn data scripts into shareable web apps.

Transforming Data Scripts into Interactive Web Apps with Simplicity

With Streamlit, you can easily create user interfaces for data visualization, exploration, and interaction. Its reactive nature automatically updates the interface based on changes in the underlying data or user interactions, making it a powerful tool for rapid application development.

PyPDF

PyPDF is a Python library for working with PDF files. It provides functionality for reading, writing, and manipulating PDF documents.

PyPDF2 is a free and open-source pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.

With PyPDF, you can extract text and images from PDFs, merge or split PDF files, and perform various other operations on PDF documents.

LangChain

LangChain is a Python library designed for natural language processing (NLP) tasks. It offers a set of tools and components for working with language models, embeddings, document loading, vector stores, and conversational chains.

LangChain is a framework for developing applications powered by language models

LangChain simplifies many aspects of NLP pipeline development, making it easier to create applications that process and generate text.

Torch

Torch (PyTorch, installed with pip as torch) is a popular open-source machine learning library primarily used for deep learning tasks.

Illuminating the Landscape of Deep Learning with Versatile Tensor Operations

It provides tensors, mathematical operations, and neural network modules that enable the construction and training of complex machine learning models.

Accelerate

Accelerate is a PyTorch library that provides a high-level API and a set of utilities for easy GPU acceleration.

Enhancing Deep Learning Efficiency through GPU-Powered PyTorch Utilities

It helps streamline the process of using GPUs to train and optimize deep learning models, making computations faster and more efficient.

BitsAndBytes

bitsandbytes is a lightweight wrapper around custom CUDA functions for PyTorch. It provides 8-bit optimizers and 8-bit (LLM.int8()) matrix multiplication, so large models can be loaded in quantized form with a smaller memory footprint.

In bitsandbytes, setting a Linear8bitLt module's device is a crucial step (if you are curious, you can check the code snippet here)

The app in this tutorial does not call it directly: the GGML model file is already quantized and is loaded through CTransformers.

Transformers

Transformers is a powerful library developed by Hugging Face that provides easy access to pre-trained language models for tasks like text generation, translation, sentiment analysis, and more.

Bridging the Gap to State-of-the-Art Natural Language Understanding

It offers a wide range of models, such as BERT, GPT, and T5, which can be fine-tuned for specific NLP tasks.

Sentence Transformers

Sentence Transformers is a library built on top of the Transformers library. It focuses on creating high-quality embeddings for sentences, paragraphs, or longer texts.

SentenceTransformers is a Python framework for state-of-the-art sentence, text and image embeddings.

Sentence embeddings are valuable for various NLP tasks, including information retrieval, clustering, and semantic similarity calculations.

FAISS (FAISS-CPU)

FAISS (Facebook AI Similarity Search) is a library for efficient similarity search and clustering of dense vectors. It’s particularly useful for large-scale search and retrieval tasks.

Navigating the High-Dimensional Space for Efficient Vector Similarity Search

FAISS-CPU refers to the CPU version of the library, which allows you to perform vector similarity searches using the CPU rather than a GPU.

CTransformers

CTransformers (the ctransformers package) is a standalone library, not part of LangChain. LangChain ships a wrapper class of the same name in langchain.llms, which this app uses to load a local GGML model file and plug it into a chain for text generation.

Python bindings for the Transformer models implemented in C/C++ using GGML library.

These libraries play crucial roles in various stages of your project, from creating user interfaces, processing PDF files, and managing language models to optimizing computations and performing efficient vector searches. Combining these libraries allows you to build powerful and feature-rich applications that leverage the capabilities of each component.

Theoretical Knowledge

1. Language Model and Retrieval-Based System:

The core of the conversational chat app relies on a sophisticated language model, which is responsible for understanding and generating human-like text. Additionally, a retrieval-based system enhances the model’s responses by leveraging pre-existing data, such as a CSV dataset, to provide contextually relevant replies.

2. Language Model Loading and Configuration:

The app begins by loading the chosen language model. In this case, the Llama 2 7B chat model is used, as a quantized GGML file (llama-2-7b-chat.ggmlv3.q8_0.bin).

The fine-tuned model, Llama-2-chat, leverages publicly available instruction datasets and over 1 million human annotations.

The model is loaded with specific configurations, such as model type (“llama”), maximum new tokens, and creativity (temperature). These configurations influence the quality and creativity of the model’s responses.
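To make the temperature setting less abstract, here is a minimal sketch of how temperature reshapes a probability distribution over candidate tokens. The logits are made-up numbers, and the real sampling pipeline inside ctransformers may apply further steps (such as top-k or top-p filtering), so treat this as an illustration of the idea only.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token (more predictable answers), while higher temperatures spread it out (more varied answers).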

3. Embeddings and Vectorization:

To process and understand text, the app uses embeddings — numerical representations of words and sentences.

One of the simplest vectorization methods for text is a bag-of-words (BoW) representation.

These embeddings are created using Sentence Transformers, a library built on top of Transformers. Embeddings let the retrieval system compare pieces of text and find the CSV rows most similar to a query.
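As a rough illustration of comparing texts as vectors, the sketch below uses the bag-of-words representation mentioned above together with cosine similarity. It is not what Sentence Transformers does internally (that model produces dense learned vectors); the sample strings are invented for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word occurrences; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = bag_of_words("price of the laptop")
row_1 = bag_of_words("product: laptop price: 900")
row_2 = bag_of_words("product: chair color: blue")
print(cosine_similarity(query, row_1), cosine_similarity(query, row_2))
```

The row that shares words with the query scores higher. Learned embeddings go further by also scoring texts as similar when they share meaning but not exact words.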

4. Vector Store and FAISS Index:

The app leverages a vector store, created using FAISS (Facebook AI Similarity Search), to efficiently store and manage the embeddings of the CSV dataset. The FAISS index enables fast and accurate similarity searches within the dataset, making retrieval-based responses seamless.
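Conceptually, the simplest kind of vector index is an exhaustive ("flat") search: compare the query vector with every stored vector and keep the closest ones. The class below is a toy sketch of that idea in plain Python; TinyFlatIndex is a hypothetical name, and FAISS itself is far more optimized and offers approximate index types as well.

```python
import math


class TinyFlatIndex:
    """Exhaustive nearest-neighbour search over in-memory vectors (toy example)."""

    def __init__(self):
        self._vectors = []
        self._payloads = []

    def add(self, vector, payload):
        """Store a vector together with the item it represents."""
        self._vectors.append(list(vector))
        self._payloads.append(payload)

    def search(self, query, k=2):
        """Return the k (payload, distance) pairs closest to query (Euclidean)."""
        scored = sorted(
            (math.dist(query, vec), i) for i, vec in enumerate(self._vectors)
        )
        return [(self._payloads[i], dist) for dist, i in scored[:k]]


index = TinyFlatIndex()
index.add([0.0, 0.0], "row 0")
index.add([1.0, 1.0], "row 1")
index.add([5.0, 5.0], "row 2")
print(index.search([0.9, 1.2], k=2))
```

In the app, the payloads would be the CSV rows and the vectors their embeddings; the retriever hands the closest rows to the language model as context.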

5. Conversational Chain:

The heart of the chat experience lies in the Conversational Retrieval Chain. This chain combines the language model, vector store, and retrieval system. It maintains context by keeping track of the ongoing conversation history. When a user submits a query, the Conversational Chain generates a response based on both the language model’s predictions and retrieval-based context.
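The flow described above can be sketched with plain functions. This is a simplified stand-in, not LangChain's implementation: the real chain asks the language model to rewrite the follow-up question into a standalone one, whereas this sketch simply concatenates the history. The names toy_chain, fake_retrieve, and fake_llm are hypothetical.

```python
def condense_question(question, history):
    """Fold earlier turns into the new question so it can stand alone."""
    if not history:
        return question
    context = " ".join(f"Q: {q} A: {a}" for q, a in history)
    return f"{context} Follow-up: {question}"


def toy_chain(question, history, retrieve, llm):
    """Condense the question, retrieve matching rows, then ask the model."""
    standalone = condense_question(question, history)
    documents = retrieve(standalone)
    prompt = "Context:\n" + "\n".join(documents) + f"\n\nQuestion: {standalone}"
    return llm(prompt)


def fake_retrieve(query):
    """Stand-in retriever: always returns the same row."""
    return ["name: Ada, age: 36"]


def fake_llm(prompt):
    """Stand-in model: reports how much prompt text it received."""
    return f"echo[{len(prompt)} chars]"


history = []
for question in ("How old is Ada?", "And her name again?"):
    answer = toy_chain(question, history, fake_retrieve, fake_llm)
    history.append((question, answer))  # the caller keeps the history
print(history)
```

Note that the caller, not the chain, stores each (question, answer) pair; the app does the same with st.session_state['history'].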

The UML

Class diagram showcasing the flow of data and control between components.

1. ConversationalApp:

  • Represents the main application class that orchestrates the entire conversational chat app.
  • Contains private attributes such as uploaded_file, loader, embeddings, db, llm, and more, which correspond to different components and functionalities of the app.
  • Encapsulates methods like conversational_chat(query: str): str, which handles the conversational interaction with the language model, and display_chat_history(), which displays the conversation history in the UI.
  • Provides public methods __init__() for initialization and run_app() for launching the Streamlit app.

2. CSVLoader:

  • Handles the loading and processing of CSV data.
  • Contains private attributes like file_path, encoding, and csv_args.
  • Offers a method load(): DataFrame in the diagram; in LangChain itself, CSVLoader.load() returns a list of Document objects, one per CSV row.

3. HuggingFaceEmbeddings:

  • Manages the creation of embeddings using Sentence Transformers.
  • Holds private attributes such as model_name and model_kwargs.
  • Provides a method create_embeddings(text: str): np.ndarray to generate embeddings for given text.

4. FAISS:

  • Represents the vector store and retrieval system using FAISS.
  • Contains a private attribute vector_store for storing embeddings.
  • Offers methods from_documents(data: DataFrame, embeddings: HuggingFaceEmbeddings): FAISS to create the vector store and save_local(path: str) to save it to a specified path.

5. CTransformers:

  • Handles the loading and utilization of language models.
  • Contains private attributes like model, model_type, max_new_tokens, and temperature.
  • Provides a method generate_response(query: str, chat_history: list): str to generate responses using the loaded language model.

6. ConversationalRetrievalChain:

  • Represents the conversational retrieval chain that combines the language model and the retrieval system.
  • Contains private attributes llm and retriever for the language model and retrieval system, respectively.
  • Offers methods like from_llm(llm: CTransformers, retriever: FAISS): ConversationalRetrievalChain to create the chain and generate_response(query: str, chat_history: list): str to generate responses in a conversation.

7. Relationships:

  • The arrows between classes represent the relationships between them. For example, ConversationalApp --> CSVLoader signifies that the ConversationalApp class uses the functionality provided by the CSVLoader class.

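This PlantUML class diagram visually represents how the different classes in the code interact and collaborate to create the conversational chat app. The attributes and method names shown for the library classes are a conceptual summary and do not always match the real APIs; for example, the code calls the chain object directly instead of calling generate_response().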

The Code

Importing Libraries

import streamlit as st
from streamlit_chat import message
import tempfile # temporary file
from langchain.document_loaders.csv_loader import CSVLoader # using CSV loaders
from langchain.embeddings import HuggingFaceEmbeddings # import hf embedding
from langchain.vectorstores import FAISS
from langchain.llms import CTransformers
from langchain.chains import ConversationalRetrievalChain

DB_FAISS_PATH = 'vectorstore/db_faiss'  # Set the path of our generated embeddings

  • streamlit is the main library for creating web applications.
  • streamlit_chat provides components for rendering chat messages in Streamlit.
  • tempfile is used to manage temporary files.
  • CSVLoader, HuggingFaceEmbeddings, FAISS, CTransformers, and ConversationalRetrievalChain are components from the LangChain library that help with data loading, embeddings, vector stores, language models, and conversational chains.

Setting File Paths and Loading Language Model

# Loading the model of your choice
def load_llm():
    # Load the locally downloaded model here; it can be replaced with
    # another GGML model such as Vicuna or Alpaca
    llm = CTransformers(
        model="llama-2-7b-chat.ggmlv3.q8_0.bin",  # name (path) of the model file
        model_type="llama",  # model architecture
        max_new_tokens=512,  # maximum number of tokens to generate
        temperature=0.5  # the creativity parameter
    )
    return llm

  • DB_FAISS_PATH is a path where the vector embeddings will be saved.
  • load_llm() is a function that loads a locally downloaded language model using the CTransformers class from LangChain.

Streamlit User Interface Setup

st.title("Llama2 Chat CSV - 🦜🦙")
uploaded_file = st.sidebar.file_uploader("Upload File", type="csv")
  • st.title() sets the title of the Streamlit application.
  • uploaded_file is a Streamlit component that allows users to upload a CSV file through the sidebar.

Uploading and Preprocessing Data

if uploaded_file:
    with tempfile.NamedTemporaryFile(delete=False) as tmp_file:
        tmp_file.write(uploaded_file.getvalue())
        tmp_file_path = tmp_file.name

    loader = CSVLoader(file_path=tmp_file_path, encoding="utf-8", csv_args={'delimiter': ','})
    data = loader.load()

  • If a file is uploaded, a temporary file is created to store the uploaded content.
  • The CSVLoader is used to load the CSV data from the uploaded file, with specified encoding and CSV delimiter.
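To see roughly what this step produces, the sketch below mimics it with the standard library only: the uploaded bytes go into a temporary file, and each CSV row becomes one text document made of "column: value" lines, which is approximately the shape of the page content CSVLoader generates. The function name rows_as_documents and the sample data are assumptions for illustration.

```python
import csv
import os
import tempfile


def rows_as_documents(csv_bytes, encoding="utf-8", delimiter=","):
    """Write uploaded bytes to a temp file, then emit one text blob per CSV row."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp_file:
        tmp_file.write(csv_bytes)
        tmp_path = tmp_file.name
    try:
        with open(tmp_path, newline="", encoding=encoding) as handle:
            reader = csv.DictReader(handle, delimiter=delimiter)
            return [
                "\n".join(f"{column}: {value}" for column, value in row.items())
                for row in reader
            ]
    finally:
        os.remove(tmp_path)  # clean up; the article's code leaves the file behind


sample = b"name,age\nAda,36\nBob,41\n"  # made-up CSV content
for doc in rows_as_documents(sample):
    print(doc, end="\n---\n")
```

Because every row is embedded as its own document, questions about a single row tend to work better than questions that need aggregation across the whole file.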

Creating Embeddings and Vector Store

embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2', model_kwargs={'device': 'cpu'})
db = FAISS.from_documents(data, embeddings)
db.save_local(DB_FAISS_PATH)
  • embeddings is created using the Hugging Face Sentence Transformers model for vector embeddings.
  • A FAISS vector store (db) is created from the data and embeddings, and it is saved to the specified path.

Context-Preserving Conversational Chain

llm = load_llm()
chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=db.as_retriever())
  • The language model is loaded.
  • A ConversationalRetrievalChain is created using the loaded language model and the vector store retriever.

User Interface Components for Chat

container = st.container()

with container:
    with st.form(key='my_form', clear_on_submit=True):
        user_input = st.text_input("Query:", placeholder="Talk to csv data 👉 (:", key='input')
        submit_button = st.form_submit_button(label='Send')

  • A container is created to hold the user interface components.
  • Within the container, a form is added with a text input for user queries and a submit button.

Handling User Input and Generating Responses

if submit_button and user_input:
    output = conversational_chat(user_input)

    st.session_state['past'].append(user_input)
    st.session_state['generated'].append(output)

  • When the submit button is clicked and there is user input, the conversational_chat function (defined in the full listing below) is called to generate a response.
  • User input and generated response are added to the chat history stored in st.session_state.

Displaying Chat History

if st.session_state['generated']:
    with response_container:
        for i in range(len(st.session_state['generated'])):
            message(st.session_state["past"][i], is_user=True, key=str(i) + '_user', avatar_style="big-smile")
            message(st.session_state["generated"][i], key=str(i), avatar_style="thumbs")

  • If chat messages have been generated, they are displayed inside response_container (a second st.container(), created in the full listing below) using the message component from streamlit_chat.
  • Chat history is retrieved from st.session_state and displayed in the appropriate format.
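The bookkeeping behind the chat history can be tried out without Streamlit. In this sketch a plain dict stands in for st.session_state, and the helper names (init_state, record_turn, transcript) and the file name sales.csv are hypothetical. It shows why the lists are only seeded when a key is missing: Streamlit reruns the script on every interaction, and unconditional assignment would wipe the conversation.

```python
def init_state(state, file_name):
    """Seed the chat lists once; later reruns leave existing values alone."""
    state.setdefault("history", [])
    state.setdefault("generated", [f"Hello! Ask me about {file_name}"])
    state.setdefault("past", ["Hey!"])


def record_turn(state, user_input, output):
    """Append one question/answer pair to all three lists."""
    state["history"].append((user_input, output))
    state["past"].append(user_input)
    state["generated"].append(output)


def transcript(state):
    """Pair up user and model messages in display order."""
    return list(zip(state["past"], state["generated"]))


state = {}  # plain dict standing in for st.session_state
init_state(state, "sales.csv")
record_turn(state, "How many rows?", "There are 3 rows.")
init_state(state, "sales.csv")  # simulating a rerun: nothing is reset
print(transcript(state))
```

The 'past' and 'generated' lists stay the same length, so index i always pairs a user message with the reply shown next to it.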

Demo

Our sample csv

Snapshot of the CSV I used for query demonstration purposes.

Upload csv here :

Streamlit sidebar allows for uploading any CSV.

Asking questions about the csv

Query section 1
Query section 2

As you can see, you can get reasonable answers to your queries, depending on how detailed your CSV is.

The whole code

You can yoink the code from here.

# Import necessary libraries
import streamlit as st
from streamlit_chat import message
import tempfile
from langchain.document_loaders.csv_loader import CSVLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import CTransformers
from langchain.chains import ConversationalRetrievalChain

# Define the path for generated embeddings
DB_FAISS_PATH = 'vectorstore/db_faiss'

# Load the model of choice
def load_llm():
    llm = CTransformers(
        model="llama-2-7b-chat.ggmlv3.q8_0.bin",
        model_type="llama",
        max_new_tokens=512,
        temperature=0.5
    )
    return llm

# Set the title for the Streamlit app
st.title("Llama2 Chat CSV - 🦜🦙")

# Create a file uploader in the sidebar
uploaded_file = st.sidebar.file_uploader("Upload File", type="csv")

# Handle file upload
if uploaded_file:
    with tempfile.NamedTemporaryFile(delete=False) as tmp_file:
        tmp_file.write(uploaded_file.getvalue())
        tmp_file_path = tmp_file.name

    # Load CSV data using CSVLoader
    loader = CSVLoader(file_path=tmp_file_path, encoding="utf-8", csv_args={'delimiter': ','})
    data = loader.load()

    # Create embeddings using Sentence Transformers
    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2', model_kwargs={'device': 'cpu'})

    # Create a FAISS vector store and save embeddings
    db = FAISS.from_documents(data, embeddings)
    db.save_local(DB_FAISS_PATH)

    # Load the language model
    llm = load_llm()

    # Create a conversational chain
    chain = ConversationalRetrievalChain.from_llm(llm=llm, retriever=db.as_retriever())

    # Function for conversational chat
    def conversational_chat(query):
        result = chain({"question": query, "chat_history": st.session_state['history']})
        st.session_state['history'].append((query, result["answer"]))
        return result["answer"]

    # Initialize chat history
    if 'history' not in st.session_state:
        st.session_state['history'] = []

    # Initialize messages
    if 'generated' not in st.session_state:
        st.session_state['generated'] = ["Hello ! Ask me(LLAMA2) about " + uploaded_file.name + " 🤗"]

    if 'past' not in st.session_state:
        st.session_state['past'] = ["Hey ! 👋"]

    # Create containers for chat history and user input
    response_container = st.container()
    container = st.container()

    # User input form
    with container:
        with st.form(key='my_form', clear_on_submit=True):
            user_input = st.text_input("Query:", placeholder="Talk to csv data 👉 (:", key='input')
            submit_button = st.form_submit_button(label='Send')

        if submit_button and user_input:
            output = conversational_chat(user_input)
            st.session_state['past'].append(user_input)
            st.session_state['generated'].append(output)

    # Display chat history
    if st.session_state['generated']:
        with response_container:
            for i in range(len(st.session_state['generated'])):
                message(st.session_state["past"][i], is_user=True, key=str(i) + '_user', avatar_style="big-smile")
                message(st.session_state["generated"][i], key=str(i), avatar_style="thumbs")

The Conclusion

In conclusion, the Conversational Chat App built using Streamlit showcases a dynamic and interactive conversational interface powered by advanced language models and retrieval systems. This app enables users to engage in meaningful conversations with a language model using a simple yet intuitive web-based platform. By combining various cutting-edge libraries and technologies, the app offers a seamless user experience and opens up possibilities for innovative applications in natural language processing.

The app’s core functionalities revolve around loading and processing CSV data, generating embeddings for text using HuggingFace’s Sentence Transformers, creating a retrieval-based system with FAISS, and integrating it with a language model through LangChain’s CTransformers and ConversationalRetrievalChain. The user can upload a CSV file, initiate conversations by entering queries, and observe the chat history unfold with responses from the language model.

Bye Bye!!

One of the standout features of the app is its ability to preserve context within conversations, ensuring that the user’s history informs subsequent interactions. This context preservation is achieved through the ConversationalRetrievalChain, which combines the language model and the retrieval system to generate coherent and relevant responses.

Through its modular design, the app offers flexibility for customization. Developers can easily adjust the language model, fine-tune retrieval parameters, and modify UI components to suit their specific needs. Additionally, the app’s integration with Streamlit allows for hassle-free deployment and sharing with a wider audience.

By leveraging the power of Streamlit, HuggingFace’s models, and LangChain’s tools, the Conversational Chat App demonstrates the potential of natural language understanding and generation. It serves as a stepping stone for building sophisticated conversational AI systems, chatbots, customer support interfaces, and more. As technology continues to advance, the app stands as a testament to the exciting possibilities that lie ahead in the field of conversational artificial intelligence.

Hope you enjoyed reading this article and learned something!!
Thanks for reading 😊👍

Anoop Johny

Masters Life Science Informatics student at Bonn University