Effortless Multilingual Data Querying: A Generative AI System for BigQuery NLQs

Rakeshmohandas
19 min read · Nov 13, 2023


Create a multilingual Generative AI system that lets you pose natural language queries (NLQs) about your data stored in BigQuery, making interaction and information retrieval straightforward.

In today’s data-driven world, the ability to efficiently extract meaningful insights from vast datasets is of paramount importance. BigQuery, Google’s cloud-based data warehousing and analytics platform, offers powerful capabilities for data storage and analysis. However, accessing and querying this data can often be complex, requiring knowledge of SQL and database structures.

To bridge this gap and give users a more intuitive, accessible way to query data, a solution built on Google's Generative AI stack has emerged: the "Effortless Multilingual Data Querying" system. This innovative system leverages Generative AI to let users interact with their BigQuery data using natural language queries (NLQs) in multiple languages, eliminating the need for specialized technical knowledge.

Key Features:

Generative AI-Powered NLQs: The heart of this system lies in its Generative AI models, such as Text-Bison, Duet AI/Codey, and Chirp (speech-to-text). These models understand and interpret natural language queries in multiple languages, making data querying a straightforward and user-friendly experience.

Seamless Integration with BigQuery: The system seamlessly integrates with BigQuery, offering users direct access to their data. It leverages the platform’s processing power and scalability while abstracting the complexity of SQL queries.

Multilingual Capabilities: One of the standout features of this system is its ability to support multiple languages. Whether you’re comfortable in English, Spanish, Chinese, or any other language, you can effortlessly ask questions about your BigQuery data.

Enhanced Accessibility: By eliminating the need for specialized database query languages, this system democratizes data querying, making it accessible to a broader range of users, including business analysts, data scientists, and decision-makers.

Efficiency and Productivity: With the ability to ask questions in plain language, users can rapidly retrieve insights from their data, leading to more informed decision-making and increased productivity.

from typing import Any, Mapping, List, Dict, Optional, Tuple
from io import BytesIO
from pathlib import Path
import base64
import datetime
import time

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.font_manager
from matplotlib.font_manager import FontProperties
import gradio as gr

from pydantic import BaseModel, Extra, root_validator
from sqlalchemy import create_engine, MetaData

from langchain.llms.base import LLM
from langchain.embeddings.base import Embeddings
from langchain.chat_models.base import BaseChatModel
from langchain.llms.utils import enforce_stop_tokens
from langchain.schema import Generation, LLMResult
from langchain.schema import AIMessage, BaseMessage, ChatGeneration, ChatResult, HumanMessage, SystemMessage
# Needed further down for the SQL chain (missing from the original import list);
# in recent LangChain releases SQLDatabaseChain lives in langchain_experimental.
from langchain.prompts import PromptTemplate
from langchain.sql_database import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

import vertexai
from vertexai.preview.language_models import TextGenerationModel, TextEmbeddingModel, ChatModel
from google.cloud import storage

A brief explanation of these libraries (referenced from Bard) is given below.

  • typing: The typing module provides type hints for Python code. This helps to improve the readability and maintainability of code by making it clear what types are expected for variables, functions, and classes. The specific types imported in the code are:
  • Any: The Any type represents any type. This is a useful type to use when you don't know the specific type of a variable or when you want to allow any type to be used.
  • Mapping: The Mapping type represents a collection of key-value pairs. This is a useful type to use for dictionaries and other types that map keys to values.
  • List: The List type represents a list of values. This is a useful type to use for storing ordered collections of data.
  • Dict: The Dict type represents a dictionary. This is a useful type to use for storing unordered collections of key-value pairs.
  • Optional: The Optional type represents a value that may be None. This is a useful type to use for variables that may not always have a value.
  • Tuple: The Tuple type represents a tuple. This is a useful type to use for storing immutable collections of data.
  • io: The io module provides support for working with streams of data. The specific type imported in the code is:
  • BytesIO: The BytesIO class represents a stream of bytes in memory. This is a useful type to use for storing binary data.
  • base64: The base64 module provides support for encoding and decoding data using the Base64 encoding scheme. This is a useful module to use for storing and transmitting binary data in a text format.
  • vertexai: The vertexai module provides a Python client library for Vertex AI. This is a useful module to use for deploying and managing machine learning models on Vertex AI. The specific types imported in the code are:
  • TextGenerationModel: The TextGenerationModel class represents a text generation model deployed on Vertex AI.
  • TextEmbeddingModel: The TextEmbeddingModel class represents a text embedding model deployed on Vertex AI.
  • ChatModel: The ChatModel class represents a chat model deployed on Vertex AI.
  • matplotlib.pyplot: The matplotlib.pyplot module provides a plotting library for Python. This is a useful module to use for creating visualizations of data.
  • time: The time module provides support for working with time and dates.
  • pydantic: The pydantic module provides a data validation library for Python. This is a useful module to use for ensuring that data is valid before it is used. The specific types imported in the code are:
  • BaseModel: The BaseModel class is a base class for creating data models.
  • Extra: The Extra class is used to control how extra data is handled when validating data models.
  • root_validator: The root_validator decorator is used to define custom validation logic for data models.
  • langchain.llms.base: The langchain.llms.base module provides a base class for large language models (LLMs). This base class defines common properties, methods, and interfaces that apply to all types of LLMs, regardless of their specific implementation or architecture.

Purpose of langchain.llms.base: This base class could provide a unified interface for interacting with different LLMs. This means that regardless of the underlying model (like GPT, BERT, etc.), the user can interact with them using a common set of commands or methods.

It might include methods for loading models, preprocessing input data, generating predictions or outputs, and other shared functionalities that are relevant to LLMs.

The base class could also handle common tasks such as error handling, logging, and maintaining performance metrics, which are essential for managing LLMs in production environments.

  • langchain.embeddings.base: The langchain.embeddings.base module provides a base class for text embedding models.

The base class might include methods for initializing embedding models, processing text to generate embeddings, and possibly utilities for handling these embeddings (like normalization, dimensionality reduction, etc.).

It can also provide abstract methods that must be implemented by derived classes, which would be specific to different embedding techniques (like word2vec, GloVe, BERT embeddings, etc.).

  • langchain.chat_models.base: The langchain.chat_models.base module provides a base class for chat models.

The role of this base class is to establish a foundational structure for various chat models, which are typically used in developing conversational AI systems, such as chatbots or virtual assistants.

The base class might include methods for initializing chat models, processing conversational input, generating responses, maintaining context or dialogue state, and handling different types of conversational interactions.

It can provide abstract methods that need to be implemented specifically for each chat model type, catering to different conversational AI architectures (like rule-based systems, retrieval-based models, generative models, etc.).

  • langchain.llms.utils: The langchain.llms.utils module provides utility functions for working with LLMs; the specific function imported here is enforce_stop_tokens.

The langchain.llms.utils module is a utility module specifically designed to support operations and tasks related to Large Language Models (LLMs). Utility modules typically contain a collection of functions that help carry out common, often repeated tasks in a more efficient, streamlined manner. These functions usually serve to simplify or abstract away the more complex aspects of working with specific technologies or systems, in this case LLMs.

In a codebase, these utility functions would typically be used to streamline the process of implementing and managing tasks related to LLMs. By using these utilities, developers can focus more on the higher-level aspects of their application or research, rather than getting bogged down in the intricate details of working directly with the LLMs.

  • enforce_stop_tokens: The enforce_stop_tokens function ensures that a generated text sequence is truncated at the first stop token (see the short example after this list).
  • langchain.schema: The langchain.schema module provides data structures for representing generated text and LLM results. The specific types imported in the code are:
  • Generation: The Generation class represents a generated text sequence.
  • LLMResult: The LLMResult class represents the result of calling an LLM.
  • langchain.schema (chat messages): The same module also provides data structures for representing chat messages. The specific types imported in the code are:
  • AIMessage: The AIMessage class represents a message generated by an AI assistant.
  • BaseMessage: The BaseMessage class is a base class for chat messages.
  • ChatGeneration: The ChatGeneration class represents a generated chat message.
  • ChatResult: The ChatResult class represents the full result of a chat model call.
  • HumanMessage: The HumanMessage class represents a message sent by a human user.
  • SystemMessage: The SystemMessage class represents a system message that provides context or instructions to the model.
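To make the behavior of enforce_stop_tokens concrete, here is a quick illustration (the strings are made up; the function simply truncates the text at the first occurrence of any stop sequence):

from langchain.llms.utils import enforce_stop_tokens

# Truncate generated text at the first stop sequence.
raw_output = "Answer: 42\nSQLResult: [(42,)]"
print(enforce_stop_tokens(raw_output, ["\nSQLResult:"]))  # prints "Answer: 42"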

Next we will write the code to create a comprehensive system that can handle natural language queries, generate and execute BigQuery SQL queries, interact with chat and text generation models from Vertex AI, and visualize the query results. While it is a complex integration of several advanced technologies, including AI, data warehousing, and data visualization, it is quite simple to develop.

1. Rate Limiting Function

def rate_limit(max_per_minute):
    period = 60 / max_per_minute
    while True:
        before = time.time()
        yield
        after = time.time()
        elapsed = after - before
        sleep_time = max(0, period - elapsed)
        if sleep_time > 0:
            print(f'Sleeping {sleep_time:.1f} seconds')
            time.sleep(sleep_time)

This function creates a generator to enforce rate limiting, ensuring that a certain number of requests per minute (specified by max_per_minute) are not exceeded. It's useful for interacting with APIs that have rate limits.
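A minimal sketch of how the generator is driven (this mirrors how VertexEmbeddings.embed_documents uses it further down; the loop body is a placeholder):

# Allow at most 15 requests per minute.
limiter = rate_limit(max_per_minute=15)
for doc in ["first text", "second text", "third text"]:
    # ... issue the API call for `doc` here ...
    next(limiter)  # sleeps just long enough to stay under the quota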

2. Vertex AI Model Classes

Several classes are defined to interact with Vertex AI’s models:

  • _VertexCommon: A base class providing common properties for Vertex AI models.
  • VertexLLM: A class for text generation using Vertex AI's LLM.
  • _VertexChatCommon and VertexChat: Classes for handling chat-based interactions with Vertex AI's models.
  • VertexMultiTurnChat: A class for managing multi-turn chat sessions.
  • VertexEmbeddings: A class for generating text embeddings using Vertex AI's API.

3. SQL and BigQuery Integration

The code integrates SQL and BigQuery functionalities, allowing SQL queries to be generated and executed based on user input. This is achieved through a combination of SQLAlchemy for database connections and custom functions for generating and executing queries.
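As a preview of the wiring used in bq_qna further down, here is a minimal, self-contained sketch (assuming the sqlalchemy-bigquery dialect is installed; the project, dataset, and table names are hypothetical):

from sqlalchemy import create_engine
import pandas as pd

# The BigQuery SQLAlchemy dialect accepts "bigquery://<project>/<dataset>" URLs.
engine = create_engine("bigquery://my-project/my_dataset")

# Any generated GoogleSQL can then be executed through the same engine.
df = pd.read_sql("SELECT name, value FROM `my-project.my_dataset.my_table` LIMIT 5", engine)
print(df)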

class _VertexCommon(BaseModel):
    client: Any = None  #: :meta private:
    model_name: str = "text-bison-32k"
    temperature: float = 0.2
    top_p: float = 0.8  # nucleus-sampling probability in [0, 1], so a float
    top_k: int = 40
    max_output_tokens: int = 1024

    @property
    def _default_params(self) -> Mapping[str, Any]:
        return {
            "temperature": self.temperature,
            "top_p": self.top_p,
            "top_k": self.top_k,
            "max_output_tokens": self.max_output_tokens,
        }

    def _predict(self, prompt: str, stop: Optional[List[str]]) -> str:
        res = self.client.predict(prompt, **self._default_params)
        return self._enforce_stop_words(res.text, stop)

    def _enforce_stop_words(self, text: str, stop: Optional[List[str]]) -> str:
        if stop:
            return enforce_stop_tokens(text, stop)
        return text

    @property
    def _llm_type(self) -> str:
        return "vertex_ai"


class VertexLLM(_VertexCommon, LLM):
    model_name: str = "text-bison-32k"

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that the python package exists in the environment."""
        try:
            values["client"] = TextGenerationModel.from_pretrained(values["model_name"])
        except AttributeError:
            raise ValueError("Could not set Vertex Text Model client.")
        return values

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        """Call out to Vertex AI's create endpoint.

        Args:
            prompt: The prompt to pass into the model.

        Returns:
            The string generated by the model.
        """
        return self._predict(prompt, stop)


class _VertexChatCommon(_VertexCommon):
    """Wrapper around Vertex AI Chat large language models."""

    model_name: str = "chat-bison-32k"  #: Model name to use.

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that the python package exists in the environment."""
        try:
            values["client"] = ChatModel.from_pretrained(values["model_name"])
        except AttributeError:
            raise ValueError("Could not set Vertex Chat Model client.")
        return values

    def _response_to_chat_results(
        self, response: object, stop: Optional[List[str]]
    ) -> ChatResult:
        text = self._enforce_stop_words(response.text, stop)
        return ChatResult(generations=[ChatGeneration(message=AIMessage(content=text))])


class VertexChat(_VertexChatCommon, BaseChatModel):
    """Wrapper around Vertex AI large language models."""

    def _generate(
        self, messages: List[BaseMessage], stop: Optional[List[str]] = None
    ) -> ChatResult:
        chat, prompt = self._start_chat(messages)
        response = chat.send_message(prompt)
        return self._response_to_chat_results(response, stop=stop)

    def _start_chat(
        self, messages: List[BaseMessage]
    ) -> Tuple[object, str]:
        """Start a chat.

        Args:
            messages: a list of BaseMessage.
        Returns:
            A tuple of an initialized Vertex AI chat session and the prompt to send to the model.
            Currently it expects either one HumanMessage, or two messages (a SystemMessage and a HumanMessage).
            If two messages are provided, the first one is used as context.
        """
        if len(messages) == 1:
            message = messages[0]
            if not isinstance(message, HumanMessage):
                raise ValueError("Message should be from a human if it's the first one.")
            context, prompt = None, message.content
        elif len(messages) == 2:
            first_message, second_message = messages[0], messages[1]
            if not isinstance(first_message, SystemMessage):
                raise ValueError(
                    "The first message should be a system message if there are two of them."
                )
            if not isinstance(second_message, HumanMessage):
                raise ValueError("The second message should be from a human.")
            context, prompt = first_message.content, second_message.content
        else:
            raise ValueError(f"Chat model expects only one or two messages. Received {len(messages)}")
        chat = self.client.start_chat(context=context, **self._default_params)
        return chat, prompt

    async def _agenerate(
        self, messages: List[BaseMessage], stop: Optional[List[str]] = None
    ) -> ChatResult:
        raise NotImplementedError(
            "Vertex AI doesn't support async requests at the moment."
        )


class VertexMultiTurnChat(_VertexChatCommon, BaseChatModel):
    """Wrapper around Vertex AI large language models."""

    chat: Optional[object] = None

    def clear_chat(self) -> None:
        self.chat = None

    def start_chat(self, message: Optional[SystemMessage] = None) -> None:
        if self.chat:
            raise ValueError("Chat has already been started. Please, clear it first.")
        if message and not isinstance(message, SystemMessage):
            raise ValueError("Context should be a system message")
        context = message.content if message else None
        self.chat = self.client.start_chat(context=context, **self._default_params)

    @property
    def history(self) -> List[Tuple[str]]:
        """Chat history."""
        if self.chat:
            return self.chat._history
        return []

    def _generate(
        self, messages: List[BaseMessage], stop: Optional[List[str]] = None
    ) -> ChatResult:
        if len(messages) != 1:
            raise ValueError(
                "You should send exactly one message to the chat each turn."
            )
        if not self.chat:
            raise ValueError("You should start_chat first!")
        response = self.chat.send_message(messages[0].content)
        return self._response_to_chat_results(response, stop=stop)

    async def _agenerate(
        self, messages: List[BaseMessage], stop: Optional[List[str]] = None
    ) -> ChatResult:
        raise NotImplementedError(
            "Vertex AI doesn't support async requests at the moment."
        )


class VertexEmbeddings(Embeddings, BaseModel):
    """Wrapper around the Vertex AI large language model embeddings API."""

    model_name: str = "textembedding-gecko@001"  #: Model name to use.
    model: Any
    requests_per_minute: int = 15

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that the python package exists in the environment."""
        try:
            values["model"] = TextEmbeddingModel
        except AttributeError:
            raise ValueError("Could not set Vertex Text Model client.")
        return values

    class Config:
        """Configuration for this pydantic object."""

        extra = Extra.forbid

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Call the Vertex LLM embedding endpoint to embed documents.

        Args:
            texts: The list of texts to embed.
        Returns:
            List of embeddings, one for each text.
        """
        self.model = self.model.from_pretrained(self.model_name)

        limiter = rate_limit(self.requests_per_minute)
        results = []
        docs = list(texts)

        while docs:
            # Working in batches of 2 because the API apparently won't let
            # us send more than 2 documents per request to get embeddings.
            head, docs = docs[:2], docs[2:]
            chunk = self.model.get_embeddings(head)
            results.extend(chunk)
            next(limiter)

        return [r.values for r in results]

    def embed_query(self, text: str) -> List[float]:
        """Call the Vertex LLM embedding endpoint to embed query text.

        Args:
            text: The text to embed.
        Returns:
            Embedding for the text.
        """
        single_result = self.embed_documents([text])
        return single_result[0]

# Initialize the ChatModel for chat interactions
chat_model = ChatModel.from_pretrained("chat-bison-32k")
chat_parameters = {
    "temperature": 0.2,
    "max_output_tokens": 1024,
    "top_p": 0.8,
    "top_k": 40,
}
chat = chat_model.start_chat()

# Initialize the VertexLLM for text generation
llm = VertexLLM(
    model_name='text-bison-32k',
    max_output_tokens=256,
    temperature=0,
    top_p=0.8,
    top_k=40,
    verbose=True,
)
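With the wrappers and clients in place, a quick smoke test might look like this (the prompts and outputs are illustrative only):

# Free-form text generation through the LangChain-compatible wrapper.
print(llm("Write one sentence describing BigQuery."))

# A single chat turn through the raw Vertex AI chat session started above.
response = chat.send_message("What is a natural language query?", **chat_parameters)
print(response.text)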

4. Data Visualization

A function, generate_visualization_and_upload, is provided to create visualizations from query results and upload them to Google Cloud Storage. It uses matplotlib and seaborn for plotting.
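The post does not show the function body, so here is a minimal sketch of what such a helper could look like (the signature, bucket handling, and chart-type choices are assumptions, not the author's exact implementation):

def generate_visualization_and_upload(df, chart_type, bucket_name, blob_name):
    # Assumed helper: plot the first two columns, then upload the PNG to GCS.
    plt.figure(figsize=(10, 6))
    x_col, y_col = df.columns[0], df.columns[1]
    if chart_type == "barplot":
        sns.barplot(data=df, x=x_col, y=y_col)
    elif chart_type == "lineplot":
        sns.lineplot(data=df, x=x_col, y=y_col)
    else:
        sns.scatterplot(data=df, x=x_col, y=y_col)
    plt.xticks(rotation=45)
    plt.tight_layout()

    # Serialize the figure to memory and push it to Cloud Storage.
    buffer = BytesIO()
    plt.savefig(buffer, format="png")
    buffer.seek(0)
    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    blob.upload_from_file(buffer, content_type="image/png")
    plt.close()
    return f"gs://{bucket_name}/{blob_name}"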

5. Chat and LLM Initialization

Instances of ChatModel and VertexLLM are created, configured with specific parameters like temperature, top_p, top_k, and maximum output tokens.

6. SQL Chain Setup

The code sets up an SQL chain using langchain to facilitate the generation of SQL queries from natural language input. This involves defining a prompt template for query generation and processing the output to execute and visualize the results.

7. Visualization Logic

There’s logic to handle different scenarios based on the number of columns in the query results, determining the type of visualization to create.
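A plausible sketch of that branching, assuming a two-column result maps cleanly to an x/y chart and anything else falls back to a plain table (the bucket and file names, and the helper from the previous section, are hypothetical):

def choose_visualization(df, chart_type):
    if len(df.columns) == 2:
        # Two columns: first as the category/x-axis, second as the value.
        return generate_visualization_and_upload(df, chart_type, "my-bucket", "chart.png")
    elif len(df.columns) == 1:
        # A single value (e.g. a count) reads better as plain text.
        return str(df.iloc[0, 0])
    else:
        # Wider results: skip charting and return the table itself.
        return df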

The code snippet below sets up the infrastructure to integrate an LLM with a SQL database, allowing the LLM to generate SQL queries from natural language input. The SQLQueryResult class provides a structured way to store and handle the results of these queries. This setup is particularly useful in applications where users interact with a database in conversational language, and the system translates those interactions into SQL queries to retrieve and manipulate data.

# Define the SQLQueryResult class to hold the SQL query result
class SQLQueryResult(BaseModel):
    data: List[Dict[str, Any]] = []

    def append(self, item: Dict[str, Any]) -> None:
        self.data.append(item)


def bq_qna(project_id, dataset_id, table_names_options, question, chart_type, language):
    # Convert the table_names tuple to a list
    table_names_list = list(table_names_options)

    # Print the value of table_names for debugging
    print("Table Names List:", table_names_list)

    # Create SQL engine for BigQuery
    engine = create_engine(f"bigquery://{project_id}/{dataset_id}")

    # Create SQLDatabase instance from the BigQuery engine
    db = SQLDatabase(engine=engine, metadata=MetaData(bind=engine), include_tables=table_names_list)
    # `location` (e.g. "us-central1") is assumed to be defined elsewhere in the notebook.
    vertexai.init(project=project_id, location=location)

    # SQL chain setup for the LLM
    db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True, return_intermediate_steps=True)

    # Testing all tables
    query = f"""SELECT * FROM {project_id}.{dataset_id}.{table_names_list[0]}"""
    # Similar queries for other tables in table_names_list (if needed)



    # Define prompt for BigQuery SQL
    _googlesql_prompt = """Firstly, Any Questions Asked needs to be first Translated to English.
Then pass the translated text to generate syntactically correct GoogleSQL query to run Query, believe that You are a Google BigQuery SQL expert.
As a GoogleSQL expert, formulate accurate and efficient SQL queries for each input question.
1. Use LIMIT clause for a maximum of {top_k} results, unless specified otherwise.
2. Selectively query relevant columns, using backticks (`) for column names.

Guidelines:
Data Selection:
Only query necessary columns from the provided tables in {table_info}.
Ensure column existence and correct table-column association.
You must query only the columns that are needed to answer the question.
Special Cases:
For STRING columns needing aggregation, CAST them as NUMERIC.
For month-specific queries, use a date range covering the entire month.
For column name requests:
SELECT column_name
FROM `{project_id}.{dataset_id}`.INFORMATION_SCHEMA.COLUMNS
WHERE table_name in {table_info}

Be careful to not query for columns that do not exist. Also, pay attention to which column is in which table.
Use the following format:
Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"
Only use the following tables:
{table_info}

If someone asks for column names in the table, use the following format:

Question: {input}"""

Once this prompt is in place, you can convert natural language questions into SQL queries, execute those queries against the BigQuery database, and process the results.

    GOOGLESQL_PROMPT = PromptTemplate(
        input_variables=["input", "table_info", "top_k", "project_id", "dataset_id"],
        template=_googlesql_prompt,
    )

    # Pass the question to the prompt template
    final_prompt = GOOGLESQL_PROMPT.format(
        input=question, project_id=project_id, dataset_id=dataset_id,
        table_info=table_names_list, top_k=10000,
    )

    # Pass the final prompt to the SQL chain
    output = db_chain(final_prompt)

    # Using the generated SQL (the second intermediate step), fetch the results
    query_results = engine.execute(output['intermediate_steps'][1]).fetchall()

    # Check if query_results is empty
    if not query_results:
        raise ValueError("The SQL query returned no results.")

    # Convert query_results to a DataFrame
    query_result = pd.DataFrame(query_results, columns=query_results[0].keys())

You can add code in between to auto-generate a graph or other visualization. Once you create a Gradio UI, the end result looks something like the following. The code snippet below creates a graphical user interface (GUI) using gr.Interface from the gradio library, which is often used to build simple yet effective web interfaces for machine learning models and other data-processing functions.
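Note that the gr.Interface call below references inputs and outputs lists that the post does not define. Since each example row supplies seven values, a plausible definition (the component labels and dropdown choices are assumptions) would be:

# Assumed Gradio components matching bq_qna and the example rows below.
inputs = [
    gr.Textbox(label="Project ID"),
    gr.Textbox(label="Dataset ID"),
    gr.CheckboxGroup(choices=["nifty", "election", "Transaction"], label="Tables"),
    gr.Textbox(label="Question"),
    gr.Dropdown(choices=["barplot", "lineplot", "scatterplot"], label="Chart type"),
    gr.Dropdown(choices=["English", "Hindi"], label="Answer language"),
    gr.Textbox(label="Analysis prompt"),
]
outputs = [
    gr.Dataframe(label="Query result"),
    gr.Image(label="Chart"),
    gr.Textbox(label="Interpretation"),
]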

gr.Interface(
    fn=bq_qna,
    inputs=inputs,
    outputs=outputs,
    live=False,
    include_footer=False,
    title="<center><h1>Ask BigQuery with Automated Chart</h1></center>",
    description="<center><i>Demo to Ask Questions to Bigquery Tables with Automated Chart Visualization</i></center>",
    article="By Rakesh Mohandas, CE-Data Analytics",
    examples=[
["erazuthmohandasrakesh-emr", "google_trends", ["international_top_rising_terms"], "Top 5 days when the term isro trended based on average scores.", "barplot","English", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "stock", ["nifty"], "Top 10 stocks Symbol by average previous close in 2017 ", "barplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "stock", ["nifty"], "2017 में सबसे बड़ा 5 स्टॉक सिंबल एवरेज प्रीवियस क्लोज के हिसाब से कौनसा है", "barplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "stock", ["nifty"], "2017 mein sabse bada 5 Stock Symbol average previous close ke hisaab se kaunsa hai", "barplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "stock", ["nifty"], "சராசரி முந்தைய முடிவின்படி 2017 இல் மிகப்பெரிய 5 பங்குச் சின்னங்கள் யாவை? ", "barplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "Banking", ["election"], "Top 5 candidates name with the most votes from the Bharatiya Janata Party in Assam.", "barplot","English", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "Banking", ["election"], "asam mein Bharathija janata party se sabase adhik vot paane vaale sheersh 5 ummeedavaaron ke naam.", "barplot","English", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "Banking", ["election"], "असम में भारतीय जनता पार्टी से सबसे अधिक वोट पाने वाले शीर्ष 5 उम्मीदवारों के नाम।", "barplot","English", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "Banking", ["election"], "Названы топ-5 кандидатов, набравших наибольшее количество голосов от партии Бхаратия Джаната в Ассаме.", "barplot","English", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "Banking", ["Transaction"], "what is the top 10 product number by revenue in 2015?", "barplot","English", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "Banking", ["customerdata", "campaignnba", "Transaction"], "2015 में राजस्व के हिसाब से शीर्ष 5 उत्पाद श्रृंखला कौन सी है?", "scatterplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "Banking", ["customerdata", "Transaction"], "2015 mein raajasv ke hisaab se sheersh 10 saimasang graahak kaun se hain?", "lineplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "predictive_maintenance", ["pm_field_level_service"], "what are the top 5 equipment IDs whose Average days in service is greater than 100 days?", "barplot","English", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings:"],
["erazuthmohandasrakesh-emr", "predictive_maintenance", ["pm_field_level_service"], "शीर्ष 5 उपकरण आईडी कौन से हैं जिनकी सेवा के औसत दिन 100 दिनों से अधिक हैं?", "scatterplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "predictive_maintenance", ["pm_field_level_service"], "sheersh 5 upakaran aaeedee kaun se hain jinakee seva ke ausat din 100 dinon se adhik hain?", "lineplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "predictive_maintenance", ["pm_production_line"], "what is the Average operating voltage of top 5 Failure Equipment Type?", "barplot","English", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "predictive_maintenance", ["pm_warranty"], "Show me the average claim cost by Failure Equipment type VOLTAGE_REGULATOR and WIRING_HARNESS ", "scatterplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "],
["erazuthmohandasrakesh-emr", "predictive_maintenance", ["pm_field_level_service", "pm_production_line", "pm_warranty"], "शीर्ष 5 विफलता उपकरण प्रकार का औसत ऑपरेटिंग वोल्टेज क्या है?", "lineplot","Hindi", "I would like you to analyze the tabular data passed to you and provide the best interpretation. Please consider the following questions:\n\nWhat are the most important features of the data?\nHow are the features related to each other?\nWhat are the key takeaways from the data?\nPlease provide your analysis in a clear and concise manner, and highlight any important findings: "]

    ],
).launch()

In conclusion, the "Effortless Multilingual Data Querying" system represents a significant advancement in the field of analytics. By harnessing the power of Generative AI and multilingual capabilities, it empowers users to interact with their BigQuery data effortlessly, unlocking valuable insights and driving data-driven decision-making across diverse language backgrounds. This system is poised to revolutionize the way organizations leverage their data assets, fostering a more data-inclusive and accessible future.
