Query Multiple Google Cloud Databases using Federated Queries (Part -1 : Cloud SQL)

Sudharma Mokashi
Google Cloud - Community
4 min readFeb 24, 2023

In this blog we will see how we can query multiple Google Cloud Databases using Federated Queries. This is the first part of the blog where we will look into basics of Federated Queries and then how we can run these queries using BigQuery and Cloud SQL.

Part 1 : Google Cloud Federated Queries with Cloud SQL

Part 2 : Google Cloud Federated Queries with Cloud Spanner

Before we begin, let’s understand what is Federated Queries and where it can be useful.

What are Federated Queries ?

  • Federated means — to form or be formed into a single centralised unit, within which each state or organisation keeps some internal autonomy.
  • Federated queries are useful when you need to join data from multiple sources that are not physically located in the same place or within the same database.

Now we know the meaning and use of federated queries, let’s quickly see how Google Cloud Supports this federated query option and what are the pre-requisites to perform federated queries on GCP.

Federated Query from BigQuery to CloudSQL

BigQuery to CloudSQL Federated Query

Steps

  • Send a query statement using EXTERNAL_QUERY function to Cloud SQL.
  • The query statement is performed using BigQuery Connection API.
  • Cloud SQL returns result back as a temporary table.

Pre-requisites to execute Federated Queries in GCP

  1. GCP CloudSQL instance — Cloud SQL instance would be required to fetch data from one of the table which exists on the instance.
  2. GCP BigQuery Admin User Access — bigquery.admin access would be required to create a connection resource in BigQuery.
  3. GCP BigQuery additional User (optional) — Admin user will need to provide access of the connection resource to additional user/s who might be using the connection resource.

And That’s it, These are the accesses and resources which are required to perform federated query.

Before we jump into the demo, please check the list of supported regions for performing the federated query operation.

Federated queries are only supported in regions that support both the external data source and BigQuery.

Now, It’s DEMO time

YAY !
  • Step — 1 : Let’s check the data which is available in Cloud SQL instance by querying the the passenger table. Currently we have 10 rows available in the passenger table and we try to query this data from BigQuery.
Data From CloudSQL
  • Step — 2: Enable the BigQuery Connection API, click here to enable.
Enable BigQuery Connection API
  • Step 3 :— Go to BigQuery and click on Add data to add the external data source.
Click on ADD DATA
  • Step 4 : — Copy the Cloud SQL connection name, database name, database region, username and password from the Cloud SQL overview page.
  • Step 5 : — Fill in all the details for external data sources, for this demo we are using Cloud SQL’s postgres flavour. Click on create connection.
Add External Data Source
  • Step 6 : — Once the external source starts appearing in the datasets click on query.
Click on Query to query Cloud SQL database
  • Step 7 : — Let’s query our Cloud SQL data from passengers table and see the results in BigQuery.
Federated Query execution on Cloud SQL

and here we got all the records from our table.

  • Step 8 : — We can also check the number of tables which are available in the particular database by executing below shown query.
Query to check available tables from Postgres

And that’s it. We successfully executed federated queries on Cloud SQL Postgres environment from BigQuery. Stay tuned for the upcoming blog where we will see how the federated queries can be executed on Spanner. Thanks for reading !

Additional References :

--

--