Build Your First Generative AI Chatbot

Learn how to integrate a pretrained LLM with your database to build a chatbot for efficient domain-specific query responses.

Shravankumar Hiregoudar
Slalom Data & AI
9 min read · Aug 16, 2023



In my role as a data scientist engaged in client-facing work, I’ve observed a significant surge in requests for utilizing generative AI (GenAI) to create groundbreaking chatbots. In this piece, I will guide you through the process of constructing a chatbot tailored to a specific domain. This tutorial aims to support you in crafting a chatbot interface adept at handling queries related to a particular domain’s database. By integrating a powerful language model, you can elevate user interactions, resulting in a more captivating and user-centric encounter.

Prerequisites

1. Foundational knowledge

Large language models (LLMs):
These models are built on artificial neural networks and are (pre)trained via self-supervised and semi-supervised learning, typically incorporating tens of millions to billions of weights. LLMs are trained on vast amounts of text data, often collected from the internet, using specialized AI accelerator hardware that enables parallel processing.

Foundation models (like GPT-3 and PaLM) are LLMs pretrained on large amounts of data that can be used for a wide range of downstream tasks. Because training a foundation model from scratch is complicated, time-consuming, and extremely expensive, only a few institutions have the required training resources.

To put it into perspective, according to a 2020 study from Lambda Labs, training OpenAI’s GPT-3 (with 175 billion parameters) would take 355 years and cost $4.6 million on a single Tesla V100 cloud instance.

Generative AI versus discriminative/predictive AI:
Discriminative models (like support vector machines and tree-based algorithms) aim to estimate the likelihood of a specific class label given input data. In contrast, generative models are designed to produce new data samples that resemble the training data. In essence, discriminative models focus on learning the relationship between inputs and outputs, whereas generative models learn the underlying data distribution. Consequently, discriminative models often achieve higher accuracy on classification tasks. Nevertheless, generative models possess the unique advantage of generating new data samples, which proves valuable for tasks like data augmentation.

2. Why use an LLM for a chatbot?

Fine-tuned LLM: Power of leveraging domain-specific knowledge base alongside the general knowledge base

The key advantages of building an AI chatbot using an LLM are:

  • The power of leveraging a custom/domain-specific knowledge base (database, tables, documents, or PDFs) alongside the general knowledge base (gigantic, publicly available data).
  • Its extensive understanding of the context and nuances of language; it can understand and generate natural language.
  • In specific scenarios, an LLM’s grasp of human language surpasses that of intent-based chatbots, making it more effective at handling multi-turn inquiries from users.

3. Setup (API keys/subscription)

These links will help you set up the keys and endpoints needed to interact with the Azure OpenAI cognitive services:

4. Tools involved

We will use SQLAlchemy to set up a database, Azure OpenAI with the LangChain library to perform the LLM/AI operations, and Gradio to build the UI. In the implementation section below, we’ll explore the role each of these tools plays in setting up the AI chatbot.

Flow of the tools involved in the design of an AI chatbot (by author)

requirements.txt contains the list of required libraries that need to be installed:

# perform pip install -r requirements.txt to install the libraries.
openai
langchain
sqlalchemy
gradio
jupyter
python-dotenv
pandas
openpyxl

Implementation

We will take a demand-planning example for this use case; if actual data is unavailable or of low volume, it can be procured from open-source repos or synthetically generated using tools such as Mockaroo.
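For example, a small synthetic dataset matching the schema used later in loader.py can be generated with pandas. In the sketch below, the file name, column values, and row count are illustrative assumptions (writing Excel files also requires openpyxl, included in requirements.txt):

# Generate a synthetic demand-planning dataset for this tutorial
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 100
demand = pd.DataFrame({
    "KEY": [f"K{i:04d}" for i in range(n)],
    "DU": rng.choice(["DU-A", "DU-B", "DU-C"], n),
    "ORIGIN": rng.choice(["Atlanta", "Chicago", "Dallas"], n),
    "DESTINATION": rng.choice(["Boston", "Denver", "Seattle"], n),
    "DEMAND_PLANNED_QTY": rng.integers(10, 500, n),
    "DEMAND_PLANNED_DATE": pd.date_range("2023-01-01", periods=n, freq="D"),
})
# loader.py (below) expects a sheet named 'Demand-Planned'
demand.to_excel("demand.xlsx", sheet_name="Demand-Planned", index=False)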

The structure of the project repository appears as follows:

custom bot
├── loader.py
├── requirements.txt
├── app.py
└── .env

loader.py is a Python script that demonstrates how to set up a database using SQLAlchemy: it reads data from an Excel file, creates a table model, and inserts the data into a local database. It also shows how to interact with the database through a SQLAlchemy session and commit the changes.

from langchain import SQLDatabase
import pandas as pd
from sqlalchemy import Column, Integer, String, Date
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker


def create_demand_table(engine, table_name, excel_file):
    # Read the Excel file (one DataFrame per sheet)
    dataframes = pd.read_excel(excel_file, sheet_name=None)

    # Create a base class for the table models
    Base = declarative_base()

    # Define the table model
    class DemandPlanned(Base):
        __tablename__ = table_name

        KEY = Column(String, primary_key=True)
        DU = Column(String)
        ORIGIN = Column(String)
        DESTINATION = Column(String)
        DEMAND_PLANNED_QTY = Column(Integer)
        DEMAND_PLANNED_DATE = Column(Date)

    # Drop the existing table in the SQLite database, if it exists
    Base.metadata.drop_all(engine)

    # Create the table in the SQLite database
    Base.metadata.create_all(engine)

    # Create a session to interact with the database
    Session = sessionmaker(bind=engine)

    with Session() as session:
        # Insert data into the table (db)
        demand = dataframes.get('Demand-Planned')
        if demand is not None:
            demand['DEMAND_PLANNED_DATE'] = pd.to_datetime(
                demand['DEMAND_PLANNED_DATE']).dt.strftime('%Y-%m-%d')
            demand.to_sql(table_name, con=engine, if_exists='append', index=False)

        # Commit the changes to the database
        session.commit()

    # Wrap the engine in a LangChain SQLDatabase object
    db = SQLDatabase(engine)

    return db

When it comes to generative AI, data privacy/security is of the utmost importance, and is the fundamental principle on which this tutorial is based. Hence, in this instance, we have chosen to employ the Azure OpenAI service, which offers the required security measures. There are other reliable and secure alternatives, such as Amazon Web Services and Google Cloud Platform. (Refer to Data, privacy, and security for Azure OpenAI Service for detailed security measures.)

Screenshot from Data, privacy, and security for Azure OpenAI Service

In the same repository, create a .env file to store all your keys.

# Azure LLMs (** --> must be added)
OPENAI_API_KEY=**
OPENAI_API_BASE=**
OPENAI_API_VERSION=**
OPENAI_CHAT_API_VERSION=**
OPENAI_API_TYPE=azure

# Public LLMs (alternative: use this block instead of the Azure one,
# since duplicate keys would otherwise override each other)
# OPENAI_API_KEY=
# OPENAI_API_BASE=https://api.openai.com/v1
# OPENAI_API_VERSION=2020-11-07
# OPENAI_API_TYPE=

Now, the main file that we are going to run will be the app.py file:

import os
import gradio as gr

from langchain.llms import AzureOpenAI
from langchain.chat_models import AzureChatOpenAI
from langchain.chains import SQLDatabaseChain
from langchain.agents import Tool, AgentType, initialize_agent
from langchain.memory import ConversationBufferMemory
from langchain.prompts.prompt import PromptTemplate

from dotenv import load_dotenv
from sqlalchemy import MetaData, create_engine, inspect, select, text

# The loader file we created above
from loader import create_demand_table

# Load the .env file to access the keys
load_dotenv()

# Set up the completion llm (optional, experiment with this!)
llm = AzureOpenAI(deployment_name="your_deployment_name",
                  model_name="text-davinci-003")

# Set up the chat llm (it uses its own API version from the .env file)
os.environ["OPENAI_API_VERSION"] = os.getenv('OPENAI_CHAT_API_VERSION')
chat_llm = AzureChatOpenAI(deployment_name="gpt-35-turbo", temperature=0.1)

We will use LangChain as a framework for developing applications powered by language models. LangChain enables developers to build agents that tackle complex problems by breaking them into smaller sub-tasks, and it can introduce context and memory into completions by creating intermediate steps and chaining commands.
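To make “chaining” concrete, here is a minimal sketch (the prompt text is illustrative) that pipes a prompt template into the completion model defined above:

# A tiny illustration of chaining: a prompt template piped into the llm
from langchain import LLMChain
from langchain.prompts.prompt import PromptTemplate

kpi_prompt = PromptTemplate(
    input_variables=["product"],
    template="Suggest one KPI for tracking {product} demand.",
)
kpi_chain = LLMChain(llm=llm, prompt=kpi_prompt)
print(kpi_chain.run("bicycle"))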

The reason for having two language models is related to their specific use cases. The general-purpose language model (text-davinci-003) can be used for a wide range of tasks, including one-off text completions, text understanding, and more. On the other hand, the chatbot model (gpt-35-turbo) is specifically designed to handle conversational interactions, providing more natural and interactive responses.

By using two different models, you can leverage the strengths of each model for their respective tasks. The general-purpose model may be better for specific non-conversational tasks, while the chatbot model is better suited for interactive and dynamic conversations. This gives you more flexibility and control over how your application handles different types of language-processing tasks.
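To see the difference in practice, here is a quick sketch (assuming the Azure deployments above are live; the prompts are illustrative):

from langchain.schema import HumanMessage

# Completion model: takes a raw prompt string, returns a string
print(llm("Write a one-line definition of demand planning."))

# Chat model: takes a list of messages, returns an AIMessage
print(chat_llm([HumanMessage(content="Write a one-line definition of demand planning.")]))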

# Create the engine and call the loader function to ingest the data
# (the Excel file name is the example file generated earlier)
engine = create_engine('sqlite:///db', echo=True)
db = create_demand_table(engine, "Demand_Planned", "demand.xlsx")

# OR

# If the database already exists somewhere, point the engine at it and
# wrap it directly (from langchain import SQLDatabase):
engine = create_engine("postgresql+psycopg2://scott:tiger@localhost:5432/mydatabase")
db = SQLDatabase(engine)

The loader function returns a LangChain SQLDatabase object, which we store in db. To make sure the right tables were created, you can do the following:

# Create an inspector object to inspect the database
inspector = inspect(engine)

# Get the list of table names
table_names = inspector.get_table_names()
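For instance, you can fail fast if the expected table is missing (the table name below is whatever you passed to create_demand_table):

# Sanity check: the demand table should be present in the database
assert "Demand_Planned" in table_names, f"Tables found: {table_names}"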

In the next lines of code, a chain is created on top of the preexisting completion model llm. The chain is designed to interact with the SQL database db; SQLDatabaseChain allows you to answer questions over a SQL database.

# Create SQLDatabaseChain
sql_chain = SQLDatabaseChain.from_llm(llm, db,
                                      verbose=True, use_query_checker=True)

verbose=True: This optional parameter makes the chain log detailed output (such as the generated SQL and intermediate steps) during execution. use_query_checker=True asks the LLM to double-check and correct the generated SQL query before executing it.
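With the chain in place, you can already answer one-off questions against the database (the question below is illustrative):

# Run a single natural-language question through the chain
answer = sql_chain.run("What is the total planned demand quantity per destination?")
print(answer)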

If you have multiple databases (say, a car database and a bike database), you can create a chain per database:

# Create a SQLDatabaseChain per database
one_sql_chain = SQLDatabaseChain.from_llm(llm, car_db,
                                          verbose=True, use_query_checker=True)

two_sql_chain = SQLDatabaseChain.from_llm(llm, bike_db,
                                          verbose=True, use_query_checker=True)

To create an agent, combine these tools into a suite that the agent can access, determining which ones to use based on user input. The agent has the flexibility to employ multiple tools and can take the output of one tool as the input for the next task. For further information on the capabilities of agents, refer to the LangChain documentation on agents. Note that the agent consumes the chains wrapped as Tool objects, as sketched below.
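A minimal sketch of wrapping each chain in a Tool (the names and descriptions are illustrative):

# Wrap each chain in a Tool so the agent can route questions to it
one_sql_tool = Tool(
    name="CarDB",
    func=one_sql_chain.run,
    description="Useful for answering questions about the car database.",
)
two_sql_tool = Tool(
    name="BikeDB",
    func=two_sql_chain.run,
    description="Useful for answering questions about the bike database.",
)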

memory = ConversationBufferMemory(memory_key="chat_history",
                                  return_messages=True)

tools = [one_sql_tool, two_sql_tool]

conversational_agent = initialize_agent(
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    memory=memory,
    dialect='ansi',
    early_stopping_method="generate",
    handle_parsing_errors=True,
)

Next, we will use Gradio to build the UI.

Note: Gradio may be better suited than Streamlit for quickly prototyping chatbot applications because of its focus on simplicity and ease of use for building interactive machine learning applications. Gradio’s strength lies in its straightforward API, which simplifies the integration of pretrained models and lets developers create interactive chatbot UIs without much boilerplate code.

# Define a simple query function that runs the query agent and returns the response
def query_fn(input_text):
    response = conversational_agent.run(input=input_text)
    return response

# Build the UI
iface = gr.Interface(
    fn=query_fn,
    inputs=gr.Textbox(label="Enter your query"),
    outputs=gr.Textbox(label="Query Result"),
    title="Domain-specific chatbot",
)

# Launch the UI locally without creating a public share link
iface.launch(share=False, server_port=8080)

Now, your chatbot is running on the local URL: http://127.0.0.1:8080. The above code defines a simple UI, using the Gradio library, that takes a user’s query as input, processes the query using the query_fn function (which in turn utilizes a conversational agent), and then displays the response from the agent as the output in a text box.

Screenshot of Gradio UI hosted locally at http://127.0.0.1:8080

Upon querying the chatbot, the terminal of app.py looks like this:

Running on local URL:  http://127.0.0.1:8080

To create a public link, set `share=True` in `launch()`.

> Entering new chain...
Thought: Do I need to use a tool? Yes
Action: DB
Action Input: "<<Question asked appears here>>"

> Entering new chain...
.
.
.

<<GENERATED SQL QUERY>>

SQLResult: [(<<QUERY RESULT>>)]
Answer: Human interpretation of <<QUERY RESULT>>

> Finished chain.

To sum up, you now possess a specialized chatbot designed for your domain-specific needs, enabling you to effortlessly retrieve the information you seek through simple queries.

Note: This solution refrains from modifying off-the-shelf models with custom data. However, for applications demanding customized responses with industry- or organization-specific jargon, the models can be adapted through in-context learning (supplying relevant context and/or examples in the prompt), few-shot prompting, or fine-tuning specific parts of the model.
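As a lightweight taste of in-context learning, you can steer the SQL chain with a custom prompt that carries your domain’s jargon. The sketch below passes a PromptTemplate to SQLDatabaseChain.from_llm; the analyst persona, the gloss for the "DU" column, and the wording are all illustrative assumptions:

# A minimal sketch of in-context learning: inject domain jargon into the
# prompt the SQL chain uses (persona and jargon below are illustrative).
from langchain.prompts.prompt import PromptTemplate

_TEMPLATE = """You are a demand-planning analyst; in this domain, "DU" refers
to the demand unit column. Given an input question, create a syntactically
correct {dialect} query to run, then look at the results of the query and
return the answer. Query for at most {top_k} rows.

Use the following format:

Question: Question here
SQLQuery: SQL Query to run
SQLResult: Result of the SQLQuery
Answer: Final answer here

Only use the following tables:
{table_info}

Question: {input}"""

custom_prompt = PromptTemplate(
    input_variables=["input", "table_info", "dialect", "top_k"],
    template=_TEMPLATE,
)

sql_chain = SQLDatabaseChain.from_llm(llm, db, prompt=custom_prompt,
                                      verbose=True, use_query_checker=True)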

Additional reading

GenAI basic courses

Level of difficulty: Easy → Hard

Slalom is a global consulting firm that helps people and organizations dream bigger, move faster, and build better tomorrows for all. Learn more about Slalom’s human-centered AI approach and reach out today.
