Build a Custom-Trained ChatGPT in 5 Minutes
This article will teach you how to run a local ChatGPT-style chatbot trained on your custom data in 5 minutes.
A custom-trained LLM bot can be great for scanning large amounts of data and having a domain-specific conversation. For example, you can train your bot on a company knowledge base, educational materials, financial or legal documents, or even your trip ideas to plan the perfect route 🙂
This can be done in 5 minutes; let’s get to work.
Prerequisites:
- This article assumes you have Python 3.*.* and Pip installed (can be downloaded here). To verify this, please open your terminal and type:
python --version
OR
python3 --version
If Python is set up correctly, the command prints the installed version number.
(Make sure to use the same Python command, python or python3, consistently throughout this guide.)
pip --version
OR
pip3 --version
If Pip is set up correctly, the command prints the installed version number.
(Make sure to use the same Pip command, pip or pip3, consistently throughout this guide.)
- This article assumes you already have an OpenAI account (which can be created here).
- Prepare some data in English; I suggest starting with less than 100MB and slowly adding more. For example, data can be in PDF, CSV, or TXT formats.
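Before indexing, it can help to check how much data you have actually collected. The following is a minimal sketch, not part of the original setup: the helper name summarize_data_folder is hypothetical, and the supported formats and the 100MB threshold are simply the ones suggested above.

```python
from pathlib import Path

# Assumptions taken from this article, not enforced by any library:
SUPPORTED_SUFFIXES = {".pdf", ".csv", ".txt"}  # formats named above
SIZE_LIMIT_BYTES = 100 * 1024 * 1024           # suggested 100MB starting point


def summarize_data_folder(folder):
    """Return (total_bytes, unsupported_file_names) for all files under folder."""
    total_bytes = 0
    unsupported = []
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        total_bytes += path.stat().st_size
        if path.suffix.lower() not in SUPPORTED_SUFFIXES:
            unsupported.append(path.name)
    return total_bytes, unsupported


def is_within_limit(total_bytes, limit=SIZE_LIMIT_BYTES):
    """Return True if the collected data stays under the suggested size."""
    return total_bytes < limit
```

For example, calling summarize_data_folder("docs") once your data folder exists returns the total size in bytes and the names of any files in other formats, which you may want to convert or remove first.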
Let’s get started
(1) In the terminal:
- Upgrade Pip (Python package manager):
python -m pip install -U pip
- Install the following libraries:
pip install openai gpt_index==0.4.24 langchain==0.0.118 PyPDF2 PyCryptodome gradio
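OpenAI: client library for the OpenAI API, which provides the Large Language Model (LLM) behind the AI chatbot.
GPT Index: for connecting the LLM with your data.
PyPDF2: for parsing PDFs.
PyCryptodome: cryptography library that PyPDF2 relies on for reading encrypted PDFs.
Gradio: chatbot UI.
LangChain: a framework for developing applications powered by language models.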
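If you want to confirm the installation worked before moving on, a short check like the one below can help. This is a sketch rather than an official tool: find_missing_modules is a hypothetical helper, and the list uses the import names of the packages above (PyCryptodome is imported as Crypto).

```python
import importlib.util

# Import names of the packages installed above.
# Note: PyCryptodome is imported as "Crypto".
REQUIRED_MODULES = ["openai", "gpt_index", "langchain", "PyPDF2", "Crypto", "gradio"]


def find_missing_modules(module_names):
    """Return the names in module_names that Python cannot locate for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```

If anything is reported missing, rerun the pip install command with the same Pip reference you verified earlier.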
- Now let’s create your bot. Navigate to the folder you want to create the bot in:
mkdir chatbot
cd chatbot
mkdir docs
touch app.py
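The touch command is not available in every terminal (the Windows command prompt, for instance). As an alternative, the same folder layout can be created from Python. This is a minimal sketch, and create_project is a hypothetical helper name, not something the libraries provide.

```python
from pathlib import Path


def create_project(base_dir="chatbot"):
    """Create base_dir/docs and an empty base_dir/app.py; return base_dir as a Path."""
    base = Path(base_dir)
    (base / "docs").mkdir(parents=True, exist_ok=True)
    # touch() creates the file if needed and never erases existing content.
    (base / "app.py").touch(exist_ok=True)
    return base
```

Calling create_project() from the folder where you want the bot to live gives the same result as the four commands above, and running it a second time is harmless.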
(2) In your favorite code or text editor, open app.py, paste the following code into it, and save it:
from gpt_index import SimpleDirectoryReader, GPTSimpleVectorIndex, LLMPredictor, PromptHelper
from langchain.chat_models import ChatOpenAI
import gradio as gr
import os

# Replace the placeholder with your own key (see step 4).
os.environ["OPENAI_API_KEY"] = 'Your Secret API Key'


def construct_index(directory_path):
    # Prompt and chunking limits, measured in tokens.
    max_input_size = 4096
    num_outputs = 512
    max_chunk_overlap = 20
    chunk_size_limit = 600

    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
    llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo", max_tokens=num_outputs))

    # Load every file in the folder, build the vector index, and save it to disk.
    documents = SimpleDirectoryReader(directory_path).load_data()
    index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)
    index.save_to_disk('index.json')
    return index


def chatbot(input_text):
    # Reload the saved index and answer the question from it.
    index = GPTSimpleVectorIndex.load_from_disk('index.json')
    response = index.query(input_text, response_mode="compact")
    return response.response


iface = gr.Interface(fn=chatbot,
                     inputs=gr.components.Textbox(lines=7, label="Enter your text"),
                     outputs="text",
                     title="Custom-trained AI Chatbot")

# Build the index from the docs folder, then start the web UI.
index = construct_index("docs")
# share=True also asks Gradio to create a temporary public link.
iface.launch(share=True)
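The values chunk_size_limit = 600 and max_chunk_overlap = 20 control how your documents are cut into pieces before they are indexed. To make that idea concrete, here is a simplified illustration of splitting text into overlapping chunks. It is an assumption-laden sketch, not the splitter that gpt_index actually uses: it counts whitespace-separated words instead of tokens, and split_with_overlap is a hypothetical name.

```python
def split_with_overlap(text, chunk_size=600, overlap=20):
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words, so a sentence that falls on a
    boundary still appears with some context in the next chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

Smaller chunks tend to give more focused matches, while the overlap reduces the chance that an answer is cut in half at a chunk boundary.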
(3) Now move your data files to the newly created /chatbot/docs folder.
(4) In the browser:
- Get an OpenAI API key (creating a key is free, though API usage may be billed to your account) by navigating to: https://platform.openai.com/account/api-keys
- Sign in if necessary.
- Click “+ Create new secret key”.
- Give your key a name and click “Create secret key”.
- Copy the key and save it for the next steps.
- Paste the key into app.py, replacing 'Your Secret API Key'.
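Pasting the key into app.py is the quickest route, but it also makes the key easy to leak if you ever share the file. One possible alternative, sketched below under the assumption that you set an OPENAI_API_KEY environment variable in your terminal beforehand, is to read the key from the environment and fail early when it is missing. The helper get_api_key is hypothetical.

```python
import os

PLACEHOLDER = "Your Secret API Key"  # the placeholder text used in app.py


def get_api_key(env=os.environ):
    """Return the key stored in OPENAI_API_KEY, or raise if it is not set."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first.")
    return key
```

With this approach, the hardcoded assignment in app.py could be replaced by a call to get_api_key() at startup, so the script stops with a clear message instead of failing on the first request.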
(5) Back in the terminal, in the /chatbot/ folder, run: python app.py
(6) Open your favorite browser and open this URL: http://127.0.0.1:7860
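If the page does not load, the app may still be building the index or may have stopped with an error. As a rough way to tell, the sketch below checks whether anything is listening on the address used above. The helper is_port_open is hypothetical, and the default host and port are simply the ones from the URL in this step.

```python
import socket


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

If is_port_open() returns False, check the terminal where app.py is running for error messages before retrying in the browser.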
That’s it, you are done! You should see the Gradio interface titled “Custom-trained AI Chatbot” with a text box labeled “Enter your text”.
Now it’s up to you to find valuable information to teach your bot and learn how you can benefit from it.