This is Multimodal Analytics
BigQuery’s ObjectRef Explained (For Humans)
You know the problem already. You’ve seen it in meetings, in dashboards, in the quiet despair of data engineers at 2 a.m.: structured data lives over here — tidy rows and columns, relational perfection. And unstructured data lives over there — a swirling mess of images, audio files, PDFs, and God-knows-what-else dumped into cloud buckets.
And the two? They don’t talk. They don’t even make eye contact. It’s like some estranged couple at a dinner party, pretending the other doesn’t exist.
But then someone asks a seemingly simple question:
“Can we pull up all support cases for the brick phone where customer service call recordings suggest the user is exasperated about the battery no longer holding a charge, and photos clearly show the severely cracked screen with multiple lines of breakage radiating across its surface, suggesting it has been dropped or impacted?”
Simple? Sure. If you enjoy exporting CSVs, running a speech-to-text service, passing that into a Python pipeline to score sentiment, then cross-referencing with an image classifier for cracks… and finally praying it all comes back together in time for your boss’s Friday status update.
This entire, convoluted process — this attempt to force a conversation between your tables and your files — actually has a formal name.
Multimodal Analytics is the practice of integrating and analyzing data from multiple, distinct modalities to derive a single, holistic understanding.
The core objective is to bring structured data — the clean, organized rows and columns from relational databases — into direct conversation with the vast and varied world of unstructured data, which includes formats like images, audio files, video, and free-form text documents. By analyzing these disparate sources in concert, organizations can uncover complex patterns and contextual insights that are impossible to see when viewing each data type in isolation. It represents a fundamental shift from analyzing data about an event to analyzing the event itself.
1. Enter BigQuery’s ObjectRef
This isn’t a revolution — it’s more subtle than that. It’s like discovering a trapdoor in the floor of your tidy, structured database that opens directly into the chaotic cellar of your object storage. You don’t move the unstructured data up into your table (why would you?), but you create a reference — a lightweight STRUCT in BigQuery that says:
“The file’s in Cloud Storage. Here’s the URI. Here’s the secure connection (the authorizer) to access it. Here’s its version, so I don’t accidentally pull a different file tomorrow. Oh, and here’s some metadata if you’re curious.”
{
"uri": "gs://classic-phones-support/calls/ticket-7285.mp3",
"version": 2742590139195811,
"authorizer": "customer-support.us-central1.conn",
"details": {
"gcs_metadata": {
"content_type": "audio/mp3",
"md5_hash": "a3b2cd5g1f67190a112c3d1e5e41891",
"size": 2100000,
"updated": 124237905999403200
}
}
}This means you can now have a multimodal table — support ticket rows with structured fields like ticket_id and customer_name, sitting alongside an ObjectRef column pointing to the angry customer’s phone call recording or the blurry photo of a shattered screen.
And with ObjectRef, BigQuery suddenly becomes aware of these files. You’re no longer blind to the “other half” of your data.
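To see just how little an ObjectRef actually carries, here's a stdlib-only Python sketch (no BigQuery involved) that parses a payload shaped like the one above and pulls out its fields. The values mirror the illustrative JSON from this article, not a real object.

```python
import json

# An ObjectRef is just a lightweight pointer: URI + version + authorizer + metadata.
# The values below mirror the illustrative payload above (not a real object).
object_ref = json.loads("""
{
  "uri": "gs://classic-phones-support/calls/ticket-7285.mp3",
  "version": "2742590139195811",
  "authorizer": "customer-support.us-central1.conn",
  "details": {
    "gcs_metadata": {
      "content_type": "audio/mp3",
      "size": 2100000
    }
  }
}
""")

# The table row only carries this pointer; the bytes stay in Cloud Storage.
bucket_and_path = object_ref["uri"].removeprefix("gs://")
bucket, _, path = bucket_and_path.partition("/")
print(bucket)  # classic-phones-support
print(path)    # calls/ticket-7285.mp3
print(object_ref["details"]["gcs_metadata"]["content_type"])  # audio/mp3
```

That's the whole trick: the row stores an address and credentials, never the file itself.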
2. The ObjectRef Toolbox
Now, let’s talk tools. BigQuery gives you a set of functions that make ObjectRef more than just a glorified pointer.
2.1 OBJ.MAKE_REF(uri, authorizer)
You use this to create an ObjectRef for any file in GCS. It’s like sticking a label on a box in the cellar so you can find it later without opening every box.
The uri argument is a STRING value containing the URI of the Cloud Storage object, for example, gs://classic-phones-support/tickets/7285.jpg.
The authorizer argument is a STRING value containing the Cloud resource connection used to access the Cloud Storage object. It must be in the format location.connection_id, for example, us-central1.myconnection.
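A malformed authorizer string is an easy mistake, so here's a quick hedged sketch for sanity-checking it. Note that looks_like_authorizer is a hypothetical helper written for this article, not part of any Google client library; it only checks the location.connection_id shape described above (allowing an optional leading project segment, as in the earlier example payload).

```python
import re

# Hypothetical helper (not a BigQuery API): checks that an authorizer string
# matches the location.connection_id shape, e.g. "us-central1.myconnection".
# Fully qualified project.location.connection_id forms also appear in practice,
# so an optional leading project segment is allowed.
_AUTHORIZER_RE = re.compile(r"^(?:[\w-]+\.)?[\w-]+\.[\w-]+$")

def looks_like_authorizer(value: str) -> bool:
    return bool(_AUTHORIZER_RE.fullmatch(value))

print(looks_like_authorizer("us-central1.myconnection"))           # True
print(looks_like_authorizer("customer-support.us-central1.conn"))  # True
print(looks_like_authorizer("no-dots-here"))                       # False
```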
The output is an ObjectRef value that is a struct in the following format:
struct {
uri string, // Cloud Storage object URI
version string, // Cloud Storage object version
authorizer string, // Cloud resource connection to use for object access
details json { // Cloud Storage managed object metadata
gcs_metadata json {
"content_type": string, // for example, "image/png"
"md5_hash": string, // for example, "d9c38814e44028bf7a012131941d5631"
"size": number, // for example, 23000
"updated": number // for example, 1741374857000000
}
}
}

Example:

SELECT OBJ.MAKE_REF("gs://classic-phones-support/tickets/7285.jpg", "us-central1.myconnection");

2.2 OBJ.FETCH_METADATA(objectref)
Basically, it fetches the file’s metadata—content type, size, MD5 hash, last updated—so you can work with it intelligently.
The objectref argument is a partially populated ObjectRef value, in which the uri and authorizer fields are populated and the details field isn't.
The output is a fully populated ObjectRef value, with the metadata provided in its details field.
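To make "fully populated" concrete, here's a stdlib-only Python stand-in that fills a details.gcs_metadata block from raw bytes, mimicking the kinds of fields GCS metadata carries. fake_fetch_metadata is purely illustrative; nothing here talks to Cloud Storage, and the real function reads this information from the object itself.

```python
import hashlib

# Illustrative stand-in for OBJ.FETCH_METADATA (not a real API): given the
# object's bytes, derive the same kinds of fields GCS metadata carries and
# attach them as the missing "details" field.
def fake_fetch_metadata(ref: dict, data: bytes, content_type: str) -> dict:
    populated = dict(ref)  # keep uri/authorizer untouched
    populated["details"] = {
        "gcs_metadata": {
            "content_type": content_type,
            "md5_hash": hashlib.md5(data).hexdigest(),  # 32 hex characters
            "size": len(data),
        }
    }
    return populated

# A partially populated ObjectRef: uri and authorizer only, no details yet.
ref = {"uri": "gs://classic-phones-support/tickets/7285.jpg",
       "authorizer": "us-central1.myconnection"}
full = fake_fetch_metadata(ref, b"\xff\xd8\xff\xe0 fake jpeg bytes", "image/jpeg")
print(full["details"]["gcs_metadata"]["size"])  # 20
```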
Example:
SELECT OBJ.FETCH_METADATA(OBJ.MAKE_REF("gs://classic-phones-support/tickets/7285.jpg", "us-central1.myconnection"));

2.3 OBJ.GET_ACCESS_URL(objectref, mode [, duration])
Generates signed URLs (read-only or read/write) so you can securely access or modify files without leaving BigQuery. Think of it as handing out house keys that expire in X units of time.
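The expiring-keys idea is easy to sketch. The snippet below (plain Python, no actual URL signing, purely illustrative) shows how an expiry like the function's expiry_time output is derived: the current time plus the requested INTERVAL, rendered in YYYY-MM-DD'T'HH:MM:SS'Z' format.

```python
from datetime import datetime, timedelta, timezone

# Illustration only: the real URL signing happens inside BigQuery / Cloud Storage.
# This just derives an expiry timestamp the way the expiry_time field reads:
# "now" plus the requested duration, formatted as YYYY-MM-DD'T'HH:MM:SS'Z'.
def expiry_time(duration, now=None):
    now = now or datetime.now(timezone.utc)
    return (now + duration).strftime("%Y-%m-%dT%H:%M:%SZ")

# Pin "now" so the example is reproducible.
fixed_now = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(expiry_time(timedelta(minutes=60), fixed_now))  # 2025-01-01T13:00:00Z
```

Once the clock passes that timestamp, the signed URL stops working and anyone holding it is locked out again.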
The objectref argument is an ObjectRef value that represents a Cloud Storage object.
The mode argument accepts two possible values: r for read-only access to the object, and rw for modifying the object.
The duration argument is optional and expects an INTERVAL value that specifies how long the generated access URLs remain valid. The output looks like this:
obj_ref_runtime json {
obj_ref json {
uri string, // Cloud Storage object URI
version string, // Cloud Storage object version
authorizer string, // Cloud resource connection to use for object access
details json { // Cloud Storage managed object metadata
gcs_metadata json {
}
}
}
access_urls json {
read_url string, // read-only signed url
write_url string, // writeable signed url
expiry_time string // the URL expiration time in YYYY-MM-DD'T'HH:MM:SS'Z' format
}
}

Example:

SELECT OBJ.GET_ACCESS_URL(OBJ.MAKE_REF("gs://classic-phones-support/tickets/7285.jpg", "us-central1.myconnection"), 'r', INTERVAL 60 MINUTE) AS read_url;

3. Where AI Walks In
And here’s where things get even more interesting. Once your structured and unstructured data are connected, you can bring in AI like it’s the third act twist. You can do it in two ways: using SQL and using Python.
3.1 Multimodal AI Using SQL
You can pass ObjectRefs to generative models with AI.GENERATE_TABLE(). Say you have a support ticket with audio from a customer, you can pass your audio_ref directly to a generative model. So now, instead of just knowing a call happened, you can ask Gemini to listen to it, summarize the complaint, and even gauge the customer’s sentiment:
SELECT
ticket_id,
complaint_summary,
customer_sentiment
FROM AI.GENERATE_TABLE(
MODEL `mydataset.gemini`,
(
SELECT
(
'Listen to this customer call. Provide a one-sentence summary of the issue and classify the customer sentiment as "Calm", "Annoyed", or "Exasperated".',
OBJ.GET_ACCESS_URL(audio_ref, 'r')
) AS prompt,
ticket_id
FROM `mydataset.support_tickets`
),
STRUCT('complaint_summary STRING, customer_sentiment STRING' AS output_schema)
);

3.2 Multimodal AI Using Python
Over in the Python world, the same principles apply using BigQuery DataFrames. This is where the data scientist, who lives and breathes in notebooks, can shine. Instead of ObjectRefs, you work with blob columns — their conceptual cousin. Let’s say you want to analyze all those photos of cracked screens. You can use the bigframes library to ask a vision model to look at every single image and classify the damage, all without ever pulling the files to your laptop.
The key is that you pass both a text prompt and the image blob column to the model’s .predict() method. BigQuery handles the rest, running the analysis at scale in the background.
import bigframes.pandas as bpd
from bigframes.ml import llm
# 1. Create a DataFrame from all support images in a GCS bucket
df = bpd.from_glob_path(
"gs://classic-phones-support/images/ticket-*.jpg",
name="phone_image"
)
# 2. Load a multimodal model from Vertex AI
model = llm.GeminiTextGenerator(model_name="gemini-2.0-flash-001")
# 3. Ask the model to analyze the image from each row
prompt = """
Look at this image of a phone screen.
Describe the damage and classify its severity as 'Minor', 'Moderate', or 'Severe'.
"""
predictions = model.predict(df, prompt=[prompt, df["phone_image"]])
# The result is a new DataFrame with the AI-generated analysis
predictions[["phone_image", "ml_generate_text_llm_result"]].peek()

4. So What Does This All Mean?
It means the old wall between structured and unstructured data isn’t a wall anymore — it’s a window. You can join, filter, analyze, and even generate insights across both worlds, securely and at scale.
ObjectRef isn’t flashy. It doesn’t strut into the room wearing a superhero cape. But it changes the entire dynamic between your clean relational data and the messy, high-volume reality of unstructured files. It’s the moment your data stops being a house divided and finally starts a coherent conversation.
5. Recap: Why ObjectRef Matters
✅ Breaks data silos: Seamlessly join structured and unstructured data for unified analysis.
✅ AI-ready: Enables generative and predictive AI workflows directly within BigQuery.
✅ Governed & Secure: Leverages BigQuery connections and permissions for controlled access.
✅ Developer-Friendly: Works with familiar SQL for analysts and Python for data scientists.
Whether you’re enriching datasets with AI-generated content, enabling multimodal search, or scaling analytics across millions of files, ObjectRef makes it possible to treat structured and unstructured data as one cohesive dataset.
Suddenly, the question “Can we analyze all those images and audio files in BigQuery?” doesn’t inspire existential dread. It inspires… curiosity.
6. Here’s Something For That Curious Mind of Yours
- Analyze multimodal data in BigQuery
- BigFrames Multimodal DataFrame Colab Notebook Example
- Get insights from structured and unstructured data using the AI-capable BigQuery DataFrames package
- Analyze multimodal data in Python with BigQuery DataFrames
- Analyze multimodal data with SQL and Python UDFs
- ObjectRef functions
- Find more of what I do, here (because… why not?)
Thank You For Reading. How About Another Article?
If You’re Still Here, Follow Along Before the Robots Replace Me
If, by some miracle, you enjoyed this — first of all, are you feeling okay? And second, you could support the author’s fragile ego by enthusiastically clicking that little 👏 button. Not once, not twice — fifty times. Unless you’re reading this in your inbox, in which case you’ll have to drag yourself over to the original post on Medium and do it there. I know, it’s a whole thing. But think of it as cardio for your index finger.
And, uh… don’t forget to follow for more educational content, insights, and — well, whatever else he comes up with. I mean, it can’t hurt, right? Worst case, you learn something. Best case, you impress people at parties.

