Creating Cloud Spanner change stream and exporting change log to BigQuery

Vishal Bulbule
Google Cloud - Community
3 min readJan 27, 2023

Cloud Spanner

Cloud Spanner is one of the Relational database service in Google cloud.Cloud Spanner is native google cloud product. Cloud Spanner is a fully managed, mission-critical, relational database service that offers transactional consistency at global scale, automatic, synchronous replication for high availability, and support for two SQL dialects: Google Standard SQL (ANSI 2011 with extensions) and PostgreSQL.

A change stream watches and streams out a Cloud Spanner database’s data changes — inserts, updates, and deletes — in near real-time.

Purpose of Change Stream

Replicating Spanner data changes to a data warehouse, such as BigQuery, for analytics.
Triggering application logic based on data changes sent to a message queue, such as Pub/Sub.
Storing data changes in Cloud Storage, for compliance or archival purposes.

prerequisites

  1. Create Spanner instance
  2. Create Spanner Database for which change data needs to be captured.
  3. Create Tables for testing ( I created Singers , Albums Table for testing)

How to create Change Stream

  1. Open Spanner Instance > Database > Query
  2. To create a change stream, you need to supply its name and the schema objects it will watch: either the entire database, or a list of specific tables and columns

3. To watch changes for complete database.

CREATE CHANGE STREAM <changestream_name> FOR ALL;

4. Now change stream is created and ready to capture changes. We need to create dataflow streaming job to stream Spanner changes to BigQuery table.

Create Dataflow Job

  1. Go to the Dataflow Create job from template page.
  2. In the Job name field, enter a unique job name.
  3. From the Dataflow template drop-down menu, select the Cloud Spanner change streams to Google Cloud Storage template.
  4. In the provided parameter fields, enter your parameter values.

5. Create Job and make sure Job reach to running state.

Test change stream

  1. Update,Insert or delete any value from table available in database for which change stream is created.

2. Validate table is created in BigQuery dataset and change records is inserted in table.

We successfully created change stream and streamed out data in Bigquery using Dataflow BigQuery Job.

Please refer below video for Demo.

Reference Link

https://cloud.google.com/spanner/docs/change-streams/manage

--

--