Olejniczak Lukasz
Google Cloud - Community
5 min read · Jun 29, 2024


Just used BigQuery and Gemini 1.5 to analyze call center recordings with a single SQL SELECT statement. 🤯 It’s that simple:

STEP 0:

Upload your audio recordings to Cloud Storage. For this example, let's assume you've placed your audio files in the bucket

gs://speech2textrecordings/

STEP 1:

Create an Object Table in BigQuery. Object tables provide a metadata index over the unstructured files in a specified Cloud Storage bucket. Each row of the table corresponds to an object, and the table columns correspond to the object metadata generated by Cloud Storage. Thanks to object tables, you can run a SELECT * FROM and list all objects in the corresponding bucket:


```sql
CREATE OR REPLACE EXTERNAL TABLE `genai-app-builder.speech2textdataset.recordings`
WITH CONNECTION `genai-app-builder.us-central1.speech2text`
OPTIONS(
  object_metadata = 'SIMPLE',
  uris = ['gs://speech2textrecordings/*'],
  max_staleness = INTERVAL 1 DAY,
  -- Set to 'AUTOMATIC' for the metadata cache to be refreshed at a
  -- system-defined interval, usually somewhere between 30 and 60 minutes.
  metadata_cache_mode = 'MANUAL');
```
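Because the table above uses metadata_cache_mode = 'MANUAL', the metadata cache must be refreshed explicitly after new files land in the bucket. A minimal sketch, reusing the table name from this article (the metadata columns shown are the standard ones object tables expose):

```sql
-- Refresh the metadata cache after uploading new recordings:
CALL BQ.REFRESH_EXTERNAL_METADATA_CACHE(
  'genai-app-builder.speech2textdataset.recordings');

-- List the indexed objects together with their Cloud Storage metadata:
SELECT uri, content_type, size, updated
FROM `genai-app-builder.speech2textdataset.recordings`;
```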

If you’re curious about the mysterious “CONNECTION” keyword: object tables use access delegation to decouple access to the object table from access to the underlying Cloud Storage objects.

Why does this matter? An external connection in BigQuery acts as a bridge between your data warehouse and external data sources like Cloud Storage. It’s a crucial component that simplifies data access and management, providing several key benefits. First of all, it lets you specify a service account (a technical user) that represents this object table when it connects to Cloud Storage. As a result, you only have to grant your users access to the table and NOT the underlying storage bucket. This also helps you enforce row-level security and manage which rows (files from the storage bucket) can be accessed by which user. A very powerful mechanism.

How to create such a CONNECTION object:

In the next step, select [Vertex AI remote models, remote functions and BigLake] and provide a friendly name for this CONNECTION object:

You will see all your connections listed in BigQuery, and you can also check the service account created to represent each connection:
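If you prefer the command line over the console, the same connection can be created with the bq tool. A sketch using the project, region, and bucket names from this article; the service account placeholder must be replaced with the one returned by the bq show command:

```shell
# Create a Cloud Resource connection in us-central1
bq mk --connection --location=us-central1 --project_id=genai-app-builder \
    --connection_type=CLOUD_RESOURCE speech2text

# Look up the service account BigQuery created for this connection
bq show --connection genai-app-builder.us-central1.speech2text

# Grant that service account read access to the recordings bucket
gcloud storage buckets add-iam-policy-binding gs://speech2textrecordings \
    --member="serviceAccount:YOUR_CONNECTION_SERVICE_ACCOUNT" \
    --role="roles/storage.objectViewer"
```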

STEP 2:

Next, create a BigQuery model object to represent Gemini 1.5 Flash. Be sure to include the CONNECTION object, which specifies the service account that will make calls to Gemini 1.5 Flash. This is the same mechanism we explained for object tables in STEP 1: thanks to it, you only need to grant users access to the BigQuery model object.

```sql
CREATE OR REPLACE MODEL
  `genai-app-builder.speech2textdataset.vertexaigemini15flash`
REMOTE WITH CONNECTION `genai-app-builder.us-central1.speech2text`
OPTIONS (ENDPOINT = 'gemini-1.5-flash');
```

The generative AI model linked to a BigQuery model object is determined by its ENDPOINT attribute. While gemini-1.5-flash is one option, it’s not the only one; numerous other models are available for use:
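Before pointing the model at a whole bucket, you can sanity-check it with an ad-hoc prompt: ML.GENERATE_TEXT also accepts a query that produces a prompt column instead of an object table. A minimal sketch, reusing the model name from this article (with flatten_json_output, the generated text lands in the ml_generate_text_llm_result column):

```sql
SELECT ml_generate_text_llm_result
FROM ML.GENERATE_TEXT(
  MODEL `genai-app-builder.speech2textdataset.vertexaigemini15flash`,
  (SELECT 'Summarize in one sentence what an object table is.' AS prompt),
  STRUCT(0.2 AS temperature, 128 AS max_output_tokens,
         TRUE AS flatten_json_output)
);
```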

STEP 3:

You are ready to analyze your recordings. It is as simple as calling the ML.GENERATE_TEXT function, which needs to know:

  • which model you want to use
  • which table you want to process (this can also be a SQL query)
  • the prompt

When it comes to the prompt…

We will instruct Gemini to play the role of a contact center manager who analyzes audio recordings of customer calls. We will ask it to extract positive and negative feedback from these calls and to classify the overall sentiment of the conversation. Finally, we want it to present the results in JSON format:

```sql
SELECT *
FROM ML.GENERATE_TEXT(
  MODEL `genai-app-builder.speech2textdataset.vertexaigemini15flash`,
  TABLE `genai-app-builder.speech2textdataset.recordings`,
  STRUCT(
    0.2 AS temperature, 2048 AS max_output_tokens, 0.2 AS top_p,
    15 AS top_k, TRUE AS flatten_json_output,
    "Role: You are a contact center manager. \n Task: Extract elements with positive and negative feedback from the audio file. \n Respond in Polish. \n Output MUST BE JSON with keys {positive_feedback, negative_feedback, sentiment}. \n"
      AS prompt
  )
);
```

When you execute this code, you get the desired result. To scale this to thousands of recordings, simply upload more audio files to Cloud Storage!

```json
{
  "positive_feedback": [
    "Kierowca przybył na miejsce punktualnie o umówionej godzinie.",
    "Samochód był czysty i zadbany.",
    "Kierowca był bardzo uprzejmy i pomocny.",
    "Kierowca był bardzo rozmowny i opowiadał mi o różnych miejscach, które mijaliśmy.",
    "Kierowca był bardzo ostrożny i przestrzegał przepisów ruchu drogowego.",
    "Czułem się bardzo bezpiecznie i komfortowo podczas jazdy.",
    "Kierowca pomógł mi z bagażem i odprowadził mnie do drzwi.",
    "Byłem bardzo zadowolony z jego usług."
  ],
  "negative_feedback": [],
  "sentiment": "positive"
}
```
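With flatten_json_output enabled, the generated JSON arrives as a plain string in the ml_generate_text_llm_result column, and models often wrap such answers in markdown fences. A minimal, hypothetical client-side sketch of how you might parse it downstream (the function name is mine, not part of any Google API):

```python
import json

def parse_feedback(llm_result: str) -> dict:
    """Parse the JSON string returned in ml_generate_text_llm_result.

    Strips an optional markdown code fence before calling json.loads.
    """
    text = llm_result.strip()
    if text.startswith("```"):
        # Drop the opening fence (with its optional language tag) ...
        text = text.split("\n", 1)[1]
        # ... and the closing fence.
        text = text.rsplit("```", 1)[0]
    return json.loads(text)

raw = '```json\n{"positive_feedback": ["On time"], "negative_feedback": [], "sentiment": "positive"}\n```'
result = parse_feedback(raw)
print(result["sentiment"])  # positive
```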

With the integration of BigQuery and Gemini 1.5, all BigQuery users now have GenAI superpowers to use their SQL skills to understand and extract insights from unstructured data like documents, audio, video, and images. Let me know in comments how it worked for your use case!

This article is authored by Lukasz Olejniczak — Customer Engineer at Google Cloud. The views expressed are those of the authors and don’t necessarily reflect those of Google.

Please clap for this article if you enjoyed reading it. For more about Google Cloud, data science, data engineering, and AI/ML, follow me on LinkedIn.
