Implementing a Local ChatGPT Using Streamlit

Moto DEI
5 min read · Jul 21, 2023


In this brief post, we’ll walk through implementing a ChatGPT-style chat interface on your local machine, using the OpenAI API and powered by Streamlit.

Only three days ago, Meta introduced a new product: Llama2. Though not technically open-source, for the purposes of this article we will refer to it as such. This new large language model (LLM) has already begun to stir the waters of the developer world, prompting a noticeable shift from the OpenAI API toward Llama2 for backend application development. What’s the appeal? Meta’s release notes claim that Llama2 offers accuracy on par with GPT-3.5. Moreover, the model weights are openly accessible and come with generous licensing terms. This kind of openness paves the way for developers to use, customize, and commercially deploy their LLM-based services worldwide, mitigating concerns about data confidentiality and API costs.

However, unlike ChatGPT, many of these models don’t ship with a ready-made chat interface; you interact with them primarily through code. That’s where this article comes in, providing a quick tutorial on setting up a ChatGPT-style user interface with Streamlit in Python. For the sake of simplicity, we’ll stick to OpenAI’s GPT-3.5/4 APIs for now — the very ones used in ChatGPT. Down the line, however, I plan to expand the scope of this guide to encompass other LLMs as potential chat backends.

What you can build by the end of this article

By the end, you will have built a chat website powered by the OpenAI API and Streamlit.

What you can create with the code in this article. Screenshot by the author

The chat functionality itself needs little explanation: as with ChatGPT, the LLM engages in conversation with you.

The interface offers selectable options for the LLM and the temperature parameter used when making an API call. It also provides a button to clear the conversation, along with a display of the accumulated cost of the API calls.

At present, only OpenAI models are supported. However, I am optimistic that I’ll be able to incorporate additional models into the LLM options in the future.

Preliminary setup

Install necessary libraries

You need to pip install the following libraries:

langchain==0.0.234
openai==0.27.8
python-dotenv==1.0.0
streamlit==1.24.1

Put them in requirements.txt and run pip install -r requirements.txt

Get the OpenAI API key

You need an OpenAI API key to call the API. Get one from the OpenAI platform if you haven’t yet, and put it in a file named .env like:

OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxx<REPLACE WITH YOUR KEY>
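If you want a quick sanity check that python-dotenv can find the key before running the app, something like this works (the script name is just illustrative):

# check_key.py - optional sanity check for the .env setup
import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())  # walks up from the current directory to find .env
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY not found"
print("API key loaded.")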

Main code

Prepare a file named app.py with the following contents:

# app.py
from dotenv import load_dotenv, find_dotenv
from langchain.callbacks import get_openai_callback
from langchain.chat_models import ChatOpenAI
from langchain.schema import (SystemMessage, HumanMessage, AIMessage)
import streamlit as st


def init_page():
    st.set_page_config(
        page_title="Personal ChatGPT"
    )
    st.header("Personal ChatGPT")
    st.sidebar.title("Options")


def init_messages():
    clear_button = st.sidebar.button("Clear Conversation", key="clear")
    if clear_button or "messages" not in st.session_state:
        st.session_state.messages = [
            SystemMessage(
                content="You are a helpful AI assistant. Respond in markdown format.")
        ]
        st.session_state.costs = []


def select_model():
    model_name = st.sidebar.radio("Choose LLM:",
                                  ("gpt-3.5-turbo-0613", "gpt-4"))
    temperature = st.sidebar.slider("Temperature:", min_value=0.0,
                                    max_value=1.0, value=0.0, step=0.01)
    return ChatOpenAI(temperature=temperature, model_name=model_name)


def get_answer(llm, messages):
    with get_openai_callback() as cb:
        answer = llm(messages)
        return answer.content, cb.total_cost


def main():
    _ = load_dotenv(find_dotenv())

    init_page()
    llm = select_model()
    init_messages()

    # Monitor user input
    if user_input := st.chat_input("Input your question!"):
        st.session_state.messages.append(HumanMessage(content=user_input))
        with st.spinner("ChatGPT is typing ..."):
            answer, cost = get_answer(llm, st.session_state.messages)
        st.session_state.messages.append(AIMessage(content=answer))
        st.session_state.costs.append(cost)

    # Display chat history
    messages = st.session_state.get("messages", [])
    for message in messages:
        if isinstance(message, AIMessage):
            with st.chat_message("assistant"):
                st.markdown(message.content)
        elif isinstance(message, HumanMessage):
            with st.chat_message("user"):
                st.markdown(message.content)

    costs = st.session_state.get("costs", [])
    st.sidebar.markdown("## Costs")
    st.sidebar.markdown(f"**Total cost: ${sum(costs):.5f}**")
    for cost in costs:
        st.sidebar.markdown(f"- ${cost:.5f}")


if __name__ == "__main__":
    main()

One point worth noting: to preserve the chat history, the app stores previous conversation turns in streamlit.session_state on the front end, rather than using langchain.memory on the back end.
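For contrast, here is a minimal sketch of the back-end alternative using langchain.memory (available in langchain 0.0.234). Note that because Streamlit re-runs the whole script on every interaction, the chain object itself would still need to be stashed in session_state to survive reruns:

# Back-end alternative: LangChain holds the history inside the chain
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(model_name="gpt-3.5-turbo-0613", temperature=0.0)
chain = ConversationChain(llm=llm, memory=ConversationBufferMemory())

print(chain.run("Hi, I am Alice."))
print(chain.run("What is my name?"))  # the buffer memory lets it recall "Alice"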

Then, run Streamlit with the following command:

streamlit run app.py

Voilà! Now you have your own ChatGPT on your local PC, served through localhost. You can put this on a web server if you want to serve it beyond your local machine.
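If you do want to expose it beyond localhost, Streamlit’s server flags can bind to all network interfaces; for example (the port number is arbitrary):

streamlit run app.py --server.address 0.0.0.0 --server.port 8501

Keep in mind that anyone who can reach that address will be spending your OpenAI API credits, so put some form of authentication or network restriction in front of it.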

Looking Towards the Future

Looking ahead, there are several improvements I’d like to explore:

  • Firstly, I’m considering enhancing the model’s ability to reference external documents when responding to questions. This feature would enable users to converse based on the information embedded in the consulted documents, which could range from text files and CSVs to PDFs and even YouTube videos. Implementing this functionality is relatively straightforward using a LangChain data connection (see the first sketch after this list). For those interested in learning more, I highly recommend the DeepLearning.AI short course “LangChain: Chat with Your Data”.
  • Secondly, I plan to expand the backend LLM repertoire to include open-source options like Llama2. By downloading the open-source model locally, users could enjoy more secure, cost-free conversations, as the chat wouldn’t need to send any data via API calls.
  • The importance of LLM fine-tuning cannot be overstated. It’s a crucial step in tailoring capabilities to specific tasks or domains, optimizing the model’s performance to improve its response accuracy and thereby increasing its effectiveness and utility. OpenAI has a nice page showing how we can fine-tune some of their models (see the second sketch after this list).
  • Let’s not forget that ChatGPT isn’t merely about chatting. ChatGPT Plus users also have access to features like web search, plugins, and more. A recent update I find particularly remarkable is the Code Interpreter feature. Not only can it generate programming code in response to our requests (e.g., producing Python code to calculate the Fibonacci series), but it can also execute the code. Furthermore, it enables users to upload a file and have ChatGPT analyze the content. By combining these capabilities, users can simply upload a file of interest and have ChatGPT analyze it, find insights, and visualize them. A subsequent extension of the Streamlit chat could potentially merge these functionalities with an open-source LLM and emulate the Code Interpreter. This way, users could run confidential data through the LLM for analysis.
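As a taste of the first item, here is a minimal, hedged sketch of document Q&A using the data-connection pieces that shipped with langchain 0.0.234. It assumes a local notes.txt (a placeholder name) and that faiss-cpu is installed:

# doc_qa.py - minimal document Q&A sketch (assumes: pip install faiss-cpu)
from dotenv import load_dotenv, find_dotenv
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS

load_dotenv(find_dotenv())

docs = TextLoader("notes.txt").load()  # placeholder file name
splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)
db = FAISS.from_documents(chunks, OpenAIEmbeddings())  # embed and index

qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo-0613", temperature=0.0),
    retriever=db.as_retriever(),
)
print(qa.run("What do my notes say about project deadlines?"))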
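And for the fine-tuning item, the legacy endpoint in openai 0.27.8 worked roughly like this; train.jsonl is a hypothetical prompt/completion file, and at the time only base models such as davinci were fine-tunable:

# finetune_sketch.py - legacy fine-tuning flow for openai==0.27.8
import openai

# Upload a JSONL file of {"prompt": ..., "completion": ...} records
upload = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")

# Kick off a fine-tune job against a base model
job = openai.FineTune.create(training_file=upload.id, model="davinci")
print(job.id)  # poll progress with openai.FineTune.retrieve(job.id)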


Moto DEI

Principal Engineer/Data Scientist and Actuary with 20 yrs exp in media, marketing, insurance, and healthcare. https://www.linkedin.com/in/moto-dei-358abaa/