Query Multiple Google Cloud Databases using Federated Queries (Part -2 : Cloud Spanner)

Sudharma Mokashi
Google Cloud - Community
4 min readMar 17, 2023

In this blog we will see how we can query multiple Google Cloud Databases using Federated Queries.

This is the second part of the blog where we will look into how we can run federated queries using BigQuery and Cloud Spanner.

If you are not aware about federated queries or checking federated queries with Cloud SQL, I will highly recommend you to go through the first part of this blog.

Part 1 : Google Cloud Federated Queries with Cloud SQL

Part 2 : Google Cloud Federated Queries with Cloud Spanner

Let’s look into some basic steps which are performed when we query from BigQuery to Cloud Spanner.

Steps

  • Send a query statement using EXTERNAL_QUERY function to Cloud Spanner.
  • The query statement is performed using BigQuery Connection API.
  • Cloud Spanner returns result back as a temporary table.
BigQuery to Cloud Spanner Federated Query

Pre-requisites to execute Federated Queries in GCP

  1. GCP CloudSpanner instance — Cloud Spanner instance would be required to fetch data from one of the table which exists on the instance.
  2. Cloud Spanner Database Admin — Cloud Spanner Database Admin access would be required on your project.
  3. GCP BigQuery Admin User Access — bigquery.admin access would be required to create a connection resource in BigQuery.
  4. GCP BigQuery additional User (optional) — Admin user will need to provide access of the connection resource to additional user/s who might be using the connection resource.

And That’s it, These are the accesses and resources which are required to perform federated query.

BigQuery Spanner federation enables BigQuery to query data residing in Spanner in real-time, without copying or moving data.

Before we jump into the demo, please check the list of supported regions for performing the federated query operation.

Federated queries are only supported in regions that support both the external data source and BigQuery.

Let’s look into the DEMO now,

  • Step — 1 : Let’s check the data which is available in Cloud Spanner instance by querying the the passenger_dtl table. Currently we have 10 rows available in the passenger_dtl table and we will try to query this data from BigQuery.
Querying Spanner Data
  • Step — 2 : Enable the BigQuery Connection API, click here to enable.
Enable BigQuery API
  • Step 3 : — Go to BigQuery and click on Add data to add the external data source.
Click on ADD DATA

Step 4 : — Copy the project id, instance id and the database name from Cloud Spanner console and create a string in the following format.

projects/PROJECT_ID/instances/INSTANCE/databases/DATABASE

add the details of external data source(Cloud Spanner).

Add Details

Step 5 : — Once the external source starts appearing in the datasets click on query.

Query External Data Source
  • Step 6: — Let’s query our Cloud Spanner data from passenger detail table and see the results in BigQuery.
Querying Spanner Data from BigQuery

and here we got all the records from our table.

And that’s it. We successfully executed federated queries on Cloud Spanner Postgres environment from BigQuery.

Thanks for reading !

Additional References :

--

--