ai_query on DBSQL — Enhancing your data analysis with GenAI

Databricks SQL SME
DBSQL SME Engineering
4 min readFeb 9, 2024
Using AI-functions in Databricks SQL

Authors: Maggie Wang, Sr Software Engineer at Databricks & Tim Lortz, Lead Specialist Solutions Architect at Databricks

Intro

At Databricks, a common question we hear from customers is “How can we enable our SQL analysts to leverage the power of Large Language Models (LLMs) in their workflows?” Use cases we’ve seen from our customers include classification or summarization of survey and customer feedback data, creating or enriching marketing content, and even generating new SQL code.

OpenAI’s GPT-3.5 and GPT-4 LLMs quickly became popular and effective entry points for organizations wishing to integrate LLMs with SQL workflows. More recently, other commercial model providers and open-source models have emerged that can also provide sufficient model quality for many use cases. Our customer base has made it clear that they’d like to have access to a wide variety of LLMs within Databricks SQL, not just with LLMs provided by OpenAI or Azure OpenAI. They might want to do this for any of a variety of motivations:

  • Ensure privacy and security of sensitive data
  • Reduce LLM vendor lock-in
  • Gain access to emerging highly capable open-source or commercial LLMs
  • Decrease inference costs by serving smaller LLMs
  • Increase quality by using LLMs custom-built for the task at hand

Enter ai_query…

ai_query (docs: AWS | Azure) enables invocation within Databricks SQL of:

  • LLMs in the Databricks Foundation Model API (docs: AWS | Azure). These models include off-the-shelf or custom fine-tuned variants of popular open-source LLMs; currently, Llama 2 70B Chat, Mixtral-8x7B Instruct, MPT 7B Instruct and MPT 30B Instruct
  • External Models such as OpenAI, Anthropic or PaLM
  • Custom models hosted by a model serving endpoint (LLM or otherwise)

In this blog post, we’ll highlight the use of leading open-source models in the Databricks Foundation Models API in support of several common use cases within Databricks SQL.

Dynamic content creator: retail marketing

With the power of LLM, you can generate creatives for your data in any writing style. Create fun holiday vibes for your seasonal promotion.

SELECT
sku_id,
product_name,
ai_query(
"databricks-mixtral-8x7b-instruct",
"You are a marketing expert for a Valentine's day marketing event targeting Gen-Z. Generate a promotional tweet in 20 words for product: " || product_name
) AS promo_message
FROM
retail_products
Example SQL Output

Voice of the customer: detecting sentiment

LLMs can help you analyze and categorize long texts at scale; for example, determining customers’ sentiment toward your products or services.

SELECT
review,
ai_query(
"databricks-llama-2-70b-chat",
"You are provided a customer review in backticks. Classify the customer review into one of the two sentiment categories: Positive, Negative. Review: `" || review || "`"
) AS sentiment
FROM
customer_reviews
Example SQL Output

Developer superpower: automating system log analysis

LLMs can also help accelerate triage and analysis of system logs, leading to more automation and enabling more engineers and analysts to make sense of their systems.

SELECT
raw_log,
ai_query(
'databricks-mixtral-8x7b-instruct',
"You will be provided with a log message. Please extract the log template and variables from this log message: " || raw_log
)
as log_template,

ai_query(
'databricks-mixtral-8x7b-instruct',
"Summarize the following log message in 30 words or less: " || raw_log
)
) as log_summary

FROM
system_logs
Example SQL Output

Conclusion

ai_query provides a powerful and convenient entry point to prompt engineering LLMs with your domain-specific data through Databricks SQL. LLMs have the ability to apply both automation and creativity across a wide range of language-related tasks. We have shown several examples, but the possibilities are endless. Stay tuned for more AI-related functions in Databricks SQL, and let us know how you plan to use ai_query in your workloads!

--

--

Databricks SQL SME
DBSQL SME Engineering

One stop shop for all technical how-tos, demos, and best practices for building on Databricks SQL