Building a Conversational MySQL Chatbot with LangChain Agents and OpenAI GPT

Rasika Botwe
Jun 18, 2024

In this blog post, we’ll explore how to create an intelligent SQL chatbot using LangChain and OpenAI’s powerful models. This chatbot not only handles general conversations but also executes SQL queries, making it an invaluable tool for database management.

Introduction
With the advancement of AI, integrating language models with database operations has opened new possibilities for enhancing user interactions. Our goal is to create a chatbot that seamlessly interacts with a SQL database, responding to user queries with accurate data retrieval and engaging conversation.

Prerequisites
Ensure you have the following:

  • Python installed on your system.
  • Access to a MySQL database with pre-loaded data.
  • An OpenAI API key.

Libraries and Environment Setup
Install the required packages and import the necessary libraries, including openai, pandas, mysql.connector, sqlalchemy, and the LangChain packages.

! pip install langchain
! pip install sqlalchemy
! pip install openai
! pip install langchain_community
! pip install langchain_experimental
! pip install pandas
! pip install mysql-connector-python
! pip install pymysql

import openai
import os
import pandas as pd
import mysql.connector
from sqlalchemy import create_engine
import warnings
warnings.filterwarnings("ignore")

os.environ['OPENAI_API_KEY'] = "your-api-key"
openai.api_key = os.environ['OPENAI_API_KEY']
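
If you prefer not to hardcode the key in the script, you can load it from a local .env file with python-dotenv instead (an extra dependency that is not used elsewhere in this post):

! pip install python-dotenv
from dotenv import load_dotenv

# Reads OPENAI_API_KEY from a .env file in the working directory into the
# process environment, replacing the hardcoded assignment above.
load_dotenv()
openai.api_key = os.environ['OPENAI_API_KEY']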

Connecting to the MySQL Database

Establish a connection to the MySQL database using SQLAlchemy. This approach simplifies executing SQL queries directly from Python.

username = 'root'
password = 'your-password'
host = '127.0.0.1:3306'
database = 'database_name'

engine = create_engine(f'mysql+mysqlconnector://{username}:{password}@{host}/{database}')
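
As a quick sanity check, you can run a small query through the engine before building the chatbot. The table name below is only a placeholder for one of your own tables:

# Verify the connection by pulling a few rows; replace your_table with a real table.
df = pd.read_sql("SELECT * FROM your_table LIMIT 5", engine)
print(df.head())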

Creating Custom LangChain Tools
Defining Schema and Tools
We define two primary tools: SQLQueryTool and ConversationalTool.

Custom LangChain Tools Explanation

Before diving into the custom tools, it’s crucial to understand the structure that these tools will operate on. This structure is defined using Pydantic models, which enforce type checking and data validation.

from pydantic import BaseModel

class SQLQueryInput(BaseModel):
    query: str

Here, SQLQueryInput is a simple class that inherits from BaseModel, a base class provided by Pydantic. This model defines a single field, query, which is of type str. It represents the SQL query that the user will input. This model is utilized to ensure that the input received matches the expected format, enhancing the robustness and reliability of the chatbot.
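
To illustrate the validation Pydantic provides, a well-formed input is accepted while a missing field raises an error (the query text here is just an example):

from pydantic import ValidationError

# A valid input: the query field is a string, so validation passes.
valid_input = SQLQueryInput(query="SELECT COUNT(*) FROM product_sales")
print(valid_input.query)

# An invalid input: the required query field is missing, so Pydantic raises.
try:
    SQLQueryInput()
except ValidationError as error:
    print(error)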

SQLQueryTool: A Tool for SQL Database Queries
The SQLQueryTool class is designed to handle SQL queries specifically. It’s one of the primary components of our chatbot that interacts directly with the database.

from langchain.tools import BaseTool
from langchain_experimental.sql import SQLDatabaseChain
from pydantic import Field

class SQLQueryTool(BaseTool):
    name = "SQLQueryTool"
    description = "Tool to query a SQL database"
    args_schema = SQLQueryInput
    chain: SQLDatabaseChain = Field(exclude=True)

    def _run(self, query: str) -> str:
        # Hand the natural-language question to the SQLDatabaseChain, which
        # generates the SQL, executes it, and returns the answer as text.
        return self.chain.run(query)

ConversationalTool: A Tool for General Conversations
This tool is geared towards handling general conversational queries that may not necessarily involve direct database interaction.

from langchain.chat_models import ChatOpenAI

class ConversationalTool(BaseTool):
    name = "ConversationalTool"
    description = "Tool for general conversation"
    args_schema = SQLQueryInput
    llm: ChatOpenAI = Field(exclude=True)

    def _run(self, query: str) -> str:
        # Send the message straight to the chat model and return its text reply.
        return self.llm.invoke(query).content

Initializing the Conversational Agent

from langchain.agents import initialize_agent, AgentType
from langchain.memory import ConversationBufferMemory
from langchain.utilities import SQLDatabase

llm = ChatOpenAI(model="gpt-3.5-turbo")

# Point LangChain at the database (alternatively, reuse the SQLAlchemy engine
# created earlier with SQLDatabase(engine)).
db = SQLDatabase.from_uri("mysql+pymysql://username:password@127.0.0.1:3306/database_name")
sql_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

sql_tool = SQLQueryTool(chain=sql_chain)
conversational_tool = ConversationalTool(llm=llm)

tools = [sql_tool, conversational_tool]

# Buffer memory keeps the running chat history under the "chat_history" key,
# which the conversational agent's prompt expects.
memory = ConversationBufferMemory(memory_key="chat_history")

conversational_agent = initialize_agent(
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    memory=memory,
    handle_parsing_errors=True
)
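
Before wiring up the interactive loop, it can help to sanity-check the agent with a single call. The question below is only an example; replace it with something relevant to your own data:

# One-off test call; the agent decides whether to route the question to the SQL tool.
response = conversational_agent.invoke({"input": "What is the highest sales value in the product sales table?"})
print(response["output"])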

Running the chat loop

while True:
    query = input("Enter your query (type 'exit' to stop): ")
    if query.lower() == 'exit':
        break
    result = conversational_agent(query)
    print(result)

To inspect the conversation history stored in memory:

memory.load_memory_variables({})

This returns the previous conversation, for example:

{'chat_history': "Human: Hi, My name is rasika\nAI: Hi, Rasika! How can I assist you today?\nHuman: What's the highest sales in the product sales table?\nAI: The highest sales in the product sales table is 22638.48. How can I assist you further?\nHuman: What's the avg sales in product sales table?\nAI: The average sales in the product sales table is 228.23. How can I assist you further?\nHuman: What's the difference between this two?\nAI: The difference between the highest sales and the average sales in the product sales table is approximately $22,410.25. Is there anything else you would like to know or discuss?\nHuman: Bye\nAI: Goodbye! Feel free to come back if you have any more questions in the future. Have a great day!"}
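
If you want to start a fresh session without restarting the script, the buffer can be cleared at any point:

# Wipe the stored conversation so the next query starts with an empty history.
memory.clear()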

Putting it all together:

import openai
import os
os.environ['OPENAI_API_KEY'] = "your-api-key"
openai.api_key = os.environ['OPENAI_API_KEY']

import pandas as pd
import mysql.connector
from sqlalchemy import create_engine
import warnings
warnings.filterwarnings("ignore")

# MySQL Database Chain
username = 'username'
password = 'password'
host = '127.0.0.1:3306'
database = 'database_name'

engine = create_engine(f'mysql+mysqlconnector://{username}:{password}@{host}/{database}')

from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
from langchain.utilities import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain
from langchain.memory import ConversationBufferMemory
from langchain.tools import BaseTool
from pydantic import BaseModel, Field

class SQLQueryInput(BaseModel):
    query: str

class SQLQueryTool(BaseTool):
    name = "SQLQueryTool"
    description = "Tool to query a SQL database"
    args_schema = SQLQueryInput
    chain: SQLDatabaseChain = Field(exclude=True)

    def _run(self, query: str) -> str:
        return self.chain.run(query)

class ConversationalTool(BaseTool):
    name = "ConversationalTool"
    description = "Tool for general conversation"
    args_schema = SQLQueryInput
    llm: ChatOpenAI = Field(exclude=True)

    def _run(self, query: str) -> str:
        return self.llm.invoke(query).content

llm = ChatOpenAI(model="gpt-3.5-turbo")

db = SQLDatabase.from_uri("mysql+pymysql://username:password@127.0.0.1:3306/database_name")
sql_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

sql_tool = SQLQueryTool(chain=sql_chain)
conversational_tool = ConversationalTool(llm=llm)

tools = [sql_tool, conversational_tool]

memory = ConversationBufferMemory(memory_key="chat_history")

conversational_agent = initialize_agent(
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    memory=memory,
    handle_parsing_errors=True
)

while True:
    query = input("Enter your query (type 'exit' to stop): ")
    if query.lower() == 'exit':
        break
    result = conversational_agent(query)
    print(result)

memory.load_memory_variables({})

Conclusion
This Python script exemplifies how to build a versatile SQL chatbot using modern AI technologies. Such integrations are pivotal for developing intuitive interfaces for database management, making data retrieval as simple as having a conversation.

I hope it helps!
