Unleashing the Power of Generative AI in BigQuery

Steve Loh
Google Cloud - Community
10 min read · Oct 14, 2023

Generative AI is rapidly gaining popularity among enterprises today, with the potential to transform the way businesses operate and compete. Google Cloud offers a comprehensive suite of generative AI services and tools, not only via the Vertex AI ML platform but also across many of its products. The great news is that Google Cloud is bringing this innovative technology directly into data engineering toolsets like BigQuery, allowing innovation to take place centrally and more efficiently. In this article, I will show you 4 ways to harness the power of generative AI on analytics workloads in BigQuery.

BigQuery is a fully managed, serverless enterprise data warehouse that enables scalable analysis over petabytes of data. BigQuery includes a machine learning module (BQML) that can be used to build and deploy machine learning models.

Disclaimer: I work as a Data Analytics practice lead at Google Cloud. This article is my own opinion and does not reflect the views of my employer.

1. BQML integration with generative AI

BQML allows users to execute predictive analytics without having to deal with specialized ML infrastructure. BQML now integrates with the PaLM large language model (the text-bison model), allowing users to manipulate and generate text directly from within BigQuery. All inference processing actually occurs in Vertex AI, using data from BigQuery; this facilitates inference without having to learn the more complex Vertex AI API.

In this example we will work with a public BBC news dataset table, ‘bigquery-public-data.bbc_news.fulltext’, and apply an LLM foundation model to summarize each BBC article in less than 100 words to serve as a TL;DR.

  • Create an External Connection (BQ Connection API must be enabled) and note the service account created for the connection.
bq mk --connection --location=us --project_id=my_project \
--connection_type=CLOUD_RESOURCE vertex_genai_conn

# retrieve the service account ID from the created connection
bq show --connection my_project.us.vertex_genai_conn
  • Grant the “Vertex AI User” role to the connection’s service account in order to use resources on Vertex AI.
# grant Vertex AI User role to the connection’s service account
gcloud projects add-iam-policy-binding my_project --member='serviceAccount:bqcx-458398081798-tab4@gcp-sa-bigquery-condel.iam.gserviceaccount.com' --role='roles/aiplatform.user'
  • Register the PaLM LLM as a remote model.
CREATE MODEL `my_project.temp.llm_model`
REMOTE WITH CONNECTION `my_project.us.vertex_genai_conn`
OPTIONS (remote_service_type = 'CLOUD_AI_LARGE_LANGUAGE_MODEL_V1');
  • Run inference by constructing the prompt and sending it to the LLM model. The STRUCT contains LLM parameters that can be tuned, such as temperature, top_p and top_k; see the parameter guidance in the resources below.
SELECT
  prompt,
  ml_generate_text_result['predictions'][0]['content'] AS generated_text,
  ml_generate_text_result['predictions'][0]['safetyAttributes'] AS safety_attributes
FROM
  ML.GENERATE_TEXT(
    MODEL `my_project.temp.llm_model`,
    (SELECT CONCAT("Please provide a summary of the following article in less than 100 words: ", body) AS prompt
     FROM `bigquery-public-data.bbc_news.fulltext`
     LIMIT 10),
    STRUCT(0.2 AS temperature,
           512 AS max_output_tokens,
           0.8 AS top_p,
           40 AS top_k));

  • Sample results are shown below, where the generated text is the summary provided by the large language model. We can then use the result to drive further reporting insights, or reverse ETL it to feed other operational applications; a sketch of persisting the summaries follows below.
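
For instance, here is a minimal sketch (built from the same model and query as above; the destination table name is illustrative) that materializes the generated summaries into a table for downstream reporting or reverse ETL:

-- Persist summaries for reporting or reverse ETL; table name is illustrative.
CREATE OR REPLACE TABLE `my_project.temp.bbc_news_summaries` AS
SELECT
  prompt,
  -- Extract the generated summary from the JSON result.
  ml_generate_text_result['predictions'][0]['content'] AS summary
FROM
  ML.GENERATE_TEXT(
    MODEL `my_project.temp.llm_model`,
    (SELECT CONCAT("Please provide a summary of the following article in less than 100 words: ", body) AS prompt
     FROM `bigquery-public-data.bbc_news.fulltext`
     LIMIT 10),
    STRUCT(0.2 AS temperature, 512 AS max_output_tokens, 0.8 AS top_p, 40 AS top_k));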

Resource:
- Documentation: How to generate text by using a remote model in BigQuery
- Documentation: Understand how to use LLM tuning parameters.

2. BigQuery Studio: data ingestion to generative AI

BigQuery Studio provides a unified and collaborative interface for data practitioners to accelerate data-to-AI workflows: from data ingestion and preparation to data exploration and ML inference. It's powered by Colab Enterprise, a managed notebook environment with enterprise-level security and compliance capabilities. We can use SQL, Python, other common packages/APIs, or even natural language directly (powered by Duet AI; see more below) within BigQuery Studio. This allows us to easily access generative AI models to perform tasks such as text processing, sentiment analysis, entity extraction, and many more. We can run the notebook in a serverless manner on fully managed infrastructure, or create a specific runtime to be shared by multiple notebooks.

In the example below, we have house descriptions stored as unstructured text files on Google Cloud Storage. We explore how to read the descriptions and call the LLM model to extract entities, all within BigQuery. Here we use the BQML integration to call the LLM model from the notebook, but it's also possible to use the LangChain Python SDK instead.

  • Create a new notebook in BigQuery by selecting it from the top panel. It's also possible to create a new notebook on the Colab Enterprise page, which will automatically be visible in BigQuery.
  • Connect to a runtime: either the default (fully managed) runtime, an existing runtime, or a new runtime.
  • Grant the notebook the necessary resource permissions using the logged-in user account (or implicitly via a group or domain higher in the hierarchy).
  • In BigQuery Studio, we can now create operations across the entire data lifecycle. Using BigQuery object tables, we get a structured record interface over the unstructured data stored in Google Cloud Storage (see the object table sketch after this list).
  • We can then call the LLM model using the BQML integration to extract important entities from the house descriptions.
  • Here's a sample of the inference results.
Input prompt:
Extract important entities from the phrase below, focusing on important aspects of a house including space, rooms, kitchen, state, facilities and other features interested by buyers:

Located on this popular road in residential West Dulwich and offered to market chain free is this deceptively spacious three bedroom end of terrace early 20th century family home.
The property offers huge scope to extend STPP and benefits from a generous rear garden as well as close proximity to transport links, shops, cafes and restaurants as well as highly regarded independent and state schools.

Output:
- 3 bedrooms
- end of terrace
- early 20th century
- spacious
- scope to extend
- generous rear garden
- close to transport links, shops, cafes and restaurants
- highly regarded independent and state schools
  • Note that BigQuery Studio is also powered by Duet AI assistance, so you can use the inline code completion suggestions to complete coding faster. For example, I could easily generate PyTorch code that loads the MNIST dataset and trains an image classification model, all within seconds thanks to generative AI.
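
As referenced in the list above, here is a minimal sketch of an object table over the house description files. The bucket path is hypothetical, and the connection from section 1 is reused purely for illustration:

CREATE EXTERNAL TABLE `my_project.temp.house_descriptions`
WITH CONNECTION `my_project.us.vertex_genai_conn`
OPTIONS (
  object_metadata = 'SIMPLE',
  -- Hypothetical bucket path; point this at your own text files.
  uris = ['gs://my_bucket/house_descriptions/*.txt']
);

-- The object table exposes one structured row of metadata per file.
SELECT uri, content_type, size
FROM `my_project.temp.house_descriptions`
LIMIT 10;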

Resource:
- Article: Announcing BigQuery Studio
- Documentation: Using notebooks in BigQuery
- Article: Building Generative AI applications with PaLM API and LangChain

3. Duet AI in BigQuery

Calling generative AI models to process data in BigQuery allows us to explore new use cases, but it is just one way of harnessing the power of generative AI. BigQuery also comes with Duet AI, a generative AI-powered collaborator that provides assistance and a new development experience to help boost developer productivity.

To use the Duet AI features in BigQuery, you need to sign up for the GCP Duet AI preview and then enable Duet AI from the BigQuery console.

SQL Completion

Duet AI provides intelligent, context-aware code completion that can help reduce the time spent writing repetitive code while enhancing code quality.

For example, while writing SQL code, the completion suggestion is shown in gray.

Duet AI may suggest several options, which we can browse through on mouseover.
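
As an illustrative sketch (actual suggestions depend on your context and will vary), after typing the first two lines below, Duet AI's grayed-out suggestion might continue the query along these lines:

SELECT body
FROM `bigquery-public-data.bbc_news.fulltext`
-- A plausible gray continuation, shown here as comments:
-- WHERE LENGTH(body) > 5000
-- LIMIT 10;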

SQL Generation

Developers can describe the task they have in mind using natural language (in a comment starting with the # sign), and Duet AI attempts to provide an accurate and contextually appropriate SQL statement, which can then be reviewed and modified.

The example below shows the instruction in a comment line. After pressing <Enter>, the generated SQL is shown; simply press Tab to accept the suggestion. If you mouse over the suggestion, other options may be shown.
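
For illustration, here is a hypothetical comment prompt against the BBC news table used earlier, and the kind of statement Duet AI might generate (the actual output will vary):

# Count the BBC news articles and compute the average article length
SELECT
  COUNT(*) AS article_count,
  AVG(LENGTH(body)) AS avg_body_length
FROM `bigquery-public-data.bbc_news.fulltext`;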

Chat assistance

Duet AI provides chat assistance that users can engage for real-time guidance on various topics, such as asking for conceptual guidance or detailed implementation instructions. It can also suggest architectural or coding best practices, reducing the need to go searching for relevant documents.

The example below asks for code assistance on how to write a specific SQL query.

We can also ask Duet AI to explain SQL code to get better insights in natural language; this is especially useful if we want to understand complicated code written by others. Simply highlight the SQL code and click on the Duet AI icon that appears, and the Duet AI pane will show a detailed explanation on the right side.

Resource:
- Documentation: Write queries with Duet AI assistance
- Article: Duet AI across Google Cloud

4. Calling custom generative AI models

BQML integration with foundation models makes it easy to use generative AI, but it only allows us to invoke remote models that are exposed via published APIs, such as PaLM text-bison or PaLM embeddings. We need to wait for API updates before we can use the latest supported models, and sometimes we also want to call custom models from third parties or models that we have fine-tuned ourselves. Hence we need more customized ways of calling generative AI models within BigQuery.

Option 1: Call remote models on Vertex AI endpoint

BigQuery ML has had the capability to import TensorFlow models into BigQuery for inference, but generative AI models are generally too large to import, and not all generative AI models are TensorFlow based. Instead, we can register remote models served on Vertex AI endpoints, and then serve predictions within BigQuery using the ML.PREDICT function.

  • Create an External Connection (the BQ Connection API must be enabled) and note the service account created for the connection.
bq mk --connection --location=us --project_id=my_project \
--connection_type=CLOUD_RESOURCE vertex_remote_model

# retrieve the service account ID from the created connection
bq show --connection my_project.us.vertex_remote_model
  • Grant Vertex AI User role to the connection’s service account in order to use resources on Vertex AI.
# grant Vertex AI User role to the connection’s service account
gcloud projects add-iam-policy-binding my_project --member='serviceAccount:bqcx-458398081798-tab4@gcp-sa-bigquery-condel.iam.gserviceaccount.com' --role='roles/aiplatform.user'
  • Create a remote model using the connection (replace 'ENDPOINT_URL' with the model serving endpoint URL).
CREATE OR REPLACE MODEL `my_project.bqml_tutorial.bert_sentiment`
INPUT (text STRING)
OUTPUT (scores ARRAY<FLOAT64>)
REMOTE WITH CONNECTION `my_project.us.vertex_remote_model`
OPTIONS(endpoint = 'ENDPOINT_URL');
  • Get predictions from the remote model using the ML.PREDICT function; a post-processing sketch follows the query below.
SELECT *
FROM ML.PREDICT (
  MODEL `my_project.bqml_tutorial.bert_sentiment`,
  (
    SELECT review AS text
    FROM `bigquery-public-data.imdb.reviews`
    LIMIT 10000
  )
);
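
As a follow-up sketch, the returned scores array can be post-processed directly in SQL. The index-to-label mapping below is an assumption made for illustration and must be verified against the deployed model's output schema:

SELECT
  text,
  scores,
  -- Assumption: index 0 = negative, index 1 = positive; verify for your model.
  IF(scores[OFFSET(1)] > scores[OFFSET(0)], 'positive', 'negative') AS sentiment
FROM ML.PREDICT (
  MODEL `my_project.bqml_tutorial.bert_sentiment`,
  (
    SELECT review AS text
    FROM `bigquery-public-data.imdb.reviews`
    LIMIT 100
  )
);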

Resources:
- Documentation: Make predictions with remote models on Vertex AI
- Documentation: Register the model in the Vertex AI Model Registry
- Documentation: How to tune language foundation models

Option 2: Call generative AI model via BigQuery remote function

Remote functions are user-defined functions (UDFs) written and hosted in Cloud Functions or Cloud Run that allow us to extend BigQuery SQL with custom code. That means we can perform any custom logic using Node.js, Python, Go, Java, .NET, Ruby, or PHP. This allows us to call the APIs of custom generative models on any host, as long as the API is accessible to the caller.

Assuming that we have Meta's Llama 2 model deployed on a Vertex AI endpoint, we can then call it from within a Cloud Function.

  • Create the HTTP endpoint in Cloud Functions or Cloud Run that contains the code calling the Llama 2 model's API. BigQuery will send HTTP POST requests to this endpoint with the input rows in a JSON "calls" array, and the endpoint must return a matching JSON "replies" array.
  • Create a BigQuery connection that points to the HTTP endpoint created in the previous step.
  • Assign necessary permission to the connection’s service account to call the Cloud Run or Cloud Function endpoint.
  • Create a remote function in BigQuery (replace '<HTTP_ENDPOINT_URL>' with the HTTP endpoint URL created in the first step).
CREATE FUNCTION my_project.my_dataset.remote_summarize(x STRING) RETURNS STRING
REMOTE WITH CONNECTION my_project.us.vertex_remote_model
OPTIONS (
endpoint = '<HTTP_ENDPOINT_URL>'
);
  • Use the remote function in a query just like any other user-defined function.
SELECT house_description, remote_summarize(house_description) AS summary 
FROM my_project.my_dataset.house;

Resource:
- Documentation: Work with remote functions

Summary

Generative AI is a rapidly developing space on GCP; new features and improvements are constantly being introduced, so kindly validate the content before trying it out. Generative AI helps to democratize machine learning capabilities in almost every vertical and horizontal use case: developers and data engineers can now harness the power of generative AI without being data scientists. By automating tasks, improving productivity, and creating new insights, generative AI has the potential to revolutionize the way data analytics is performed today, and to help us get more value from data, easier and faster.


Steve Loh
Google Cloud - Community

I lead the data analytics customer engineering team of Google Cloud in the Benelux region. I enjoy learning and helping customers, drawing on my 22 years of experience.