Interacting With SQL Database Using Langchain’s SQLChain
Let’s talk about ways Q&A chain can work on SQL database. This system will allow us to ask a question about the data in an SQL database and get back a natural language answer.
At a high-level, the steps of any SQL chain and agent are:
- Convert question to SQL query: Model converts user input to a SQL query.
- Execute SQL query: Execute the SQL query.
- Answer the question: Model responds to user input using the query results.
For this blog we will focus on SQL Chain. Let’s first create a database using SQLite and we will use this database for further explanation. Below is the snapshot of the table used.
Below is the code to convert csv file->dataframe->sql database
import numpy as np
import pandas as pd
import sqlite3
df = pd.read_csv('/content/income (1).csv')
df.rename(columns={
'Name':'name',
'Age':'age',
'Income($)':'income'
})
#Function to read sql query
def read_sql_query(sql, db):
conn = sqlite3.connect(db)
cur = conn.cursor()
cur.execute(sql)
rows = cur.fetchall()
for row in rows:
print(row)
conn.close()
read_sql_query('SELECT * FROM income_details LIMIT 10;',
"income.sqlite")
Now let’s get required packages and set environment variables:
%pip install --upgrade --quiet langchain langchain-community langchain-openai
import getpass
import os
os.environ["OPENAI_API_KEY"] = 'OPENAI_API_KEY'
Now, income_details.db
is in our directory and we can interface with it using the SQLAlchemy-driven SQLDatabase
class:
from langchain_community.utilities import SQLDatabase
db = SQLDatabase.from_uri("sqlite:///income.sqlite")
#content/income.sqlite
print(db.dialect)
print(db.get_usable_table_names())
db.run("SELECT * FROM income_details;")
Let’s create a simple chain that takes a question, turns it into a SQL query, executes the query, and uses the result to answer the original question
- Convert question into a simple query
from langchain.chains import create_sql_query_chain
#This chain generates SQL queries for the given database
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
chain = create_sql_query_chain(llm, db)
response = chain.invoke({"question": "How many employees are there"})
response
2. Execute SQL query
Now that we’ve generated a SQL query, we’ll want to execute it. We can use the QuerySQLDatabaseTool to easily add query execution to our chain
from langchain_community.tools.sql_database.tool import QuerySQLDataBaseTool
execute_query = QuerySQLDataBaseTool(db=db)
write_query = create_sql_query_chain(llm, db)
chain = write_query | execute_query
chain.invoke({"question": "How many employees are there"})
3. Answer the question
Now that we’ve got a way to automatically generate and execute queries, we just need to combine the original question and SQL query result to generate a final answer. We can do this by passing question and result to the LLM once more
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser
#OutputParser that parses LLMResult into the top likely string.
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
answer_prompt = PromptTemplate.from_template(
"""Given the following user question, corresponding SQL query, and SQL result, answer the user question in a funny way.
Question: {question}
SQL Query: {query}
SQL Result: {result}
Answer: """
)
answer = answer_prompt | llm | StrOutputParser()
chain = (
RunnablePassthrough.assign(query=write_query).assign(
result=itemgetter("query") | execute_query
)
| answer
)
chain.invoke({"question": "Name top 3 youngest employess with salary more than $35,000"})
O/P: Well, it looks like Tom, Rob, and Arnold are the youngest employees making more than $35,000. Looks like they’re young, wild, and making money moves!
You can get the csv file and complete code in the link below: