Talk with Your Data using Gemini

Vaibhav Malpani
Google Cloud - Community
3 min readMay 6, 2024

We all have been in a situation with where you want to fetch some data from database (SQL or NoSQL) and display it on a Webpage or Mobile App.

What would a developer need to do in such scenario?
1. Get the requirement of what data is required by the stakeholder

2. Create query based on the requirement

3. Execute the query on database

4. Give back the response.

Now imagine the developer has to do this for 20–30 kind of scenarios. It would be just too much work. How can we use AI to solve this??

Talking is the most natural user interface, allowing people to interact with machines, computers, and other devices in a seamless way.

Problem Statement:

Get necessary data from publicly available sales data from database “bigquery-public-data.iowa_liquor_sales”

Solution:

We can get the scenario from user, based on that we can create the database query (SQL or NoSQL), hit the database with the query, get the data from database and give back the data in response.

Step 1: Install and Import necessary libraries and instantiate Gemini model

pip install google-generativeai

pip install pandas-gbq

pip install flask

Please replace API_KEY with your own API KEY

import google.generativeai as genai
import pandas_gbq as pd
genai.configure(api_key=API_KEY)
model = genai.GenerativeModel('gemini-1.5-pro-latest')

Step 2: Generate query based on prompt. Make sure to give a very descriptive context which will help the model build query accurately.

def generate_query(prompt):
context = """
you are suppose to write a SQL query.
You have a database of liquor sale.
Database Name: bigquery-public-data.iowa_liquor_sales
Table Name: sales
While mentioning FROM <TABLE_NAME> always use <Database Name>.<Table Name>
use the exact column names from the table content provided below.
Do not add any column names on your own.. be very precise with you answer
Table Content:
[{
"date": "2023-03-17",
"store_name": "FAREWAY STORES #648 / OTTUMWA",
"address": "1325 ALBIA RD",
"city": "OTTUMWA",
"zip_code": "52501.0",
"item_description": "JAGERMEISTER LIQUEUR MINI MEISTERS",
"pack": "12",
"bottle_volume_ml": "20",
"state_bottle_cost": "5.63",
"state_bottle_retail": "8.45",
"bottles_sold": "1",
"sale_dollars": "8.45",
"volume_sold_liters": "0.02"
}, {
"date": "2021-02-25",
"store_name": "FAREWAY STORES #995 / PELLA",
"address": "2010 WASHINGTON ST",
"city": "PELLA",
"zip_code": "50219.0",
"item_description": "THE BITTER TRUTH COCKTAIL BAR PACK",
"pack": "12",
"bottle_volume_ml": "20",
"state_bottle_cost": "9.5",
"state_bottle_retail": "14.25",
"bottles_sold": "2",
"sale_dollars": "28.5",
"volume_sold_liters": "0.04"
}]
get response as a raw sql query
"""

prompt = context + prompt
response = model.generate_content(prompt)
query = response.text.replace("sql", "").replace("\n", " ").replace("```", "").replace(" ", " ")
return query

Step 3: Query bigquery and get the response in dataframe format.

df = pd.read_gbq(query)
df.head()

Step 4: Reformat dataframe as required.

df.to_html() # to convert data into html format
df.to_csv() # to convert data into csv format

or send it in a Flask app as below:

@app.route('/get_data', methods=['POST'])
def get_data():
prompt = request.json['prompt']
query = generate_query(prompt)
df = pd.read_gbq(query)
return df.to_html()

Complete code can be found at below github repository.

Conclusion:

  1. We can easily extract information from data stored in BigQuery by naturally talking to it.
  2. This is not restricted to just BigQuery, this can be modified to get data from any datasource. (I’ve tried with SQL and NoSQL for now)

If you enjoyed this post, give it a clap! 👏 👏

Interested in similar content? Follow me on Medium, Twitter, LinkedIn for more!

--

--

Vaibhav Malpani
Google Cloud - Community

Google Developer Expert for Google Cloud. Python Developer. Cloud Evangelist.