GCP PostgreSQL — Compute Engine Snapshot-Based Replication (e.g., Production to Development)

Quick Summary (TL;DR)

In this post, we will cover a real-life walkthrough of, and recommendations for, GCE snapshot-based replication on Google Cloud Platform. This solution is relevant for database administrators and application teams that rely on databases hosted on GCE VMs. It works by cloning disks via snapshots, and the same concept can be applied to other databases too.

  1. Create snapshot of production database disk
  2. Create disk image from snapshot
  3. Create instance template
  4. Perform a rolling update on the managed instance group (in the development project) with the new instance template.

Introduction

Sometimes, due to security and compliance requirements, development teams are not allowed direct access to the production database. However, in order to debug specific production issues or create ad-hoc reports, development teams may still need access to the data.

This often results in raising tickets to operations to take database backups and restore them in the development environment. The process is slow and frustrating, and taking a logical backup often requires downtime.

Experience and productivity can be greatly improved if a scheduled script or pipeline is created that refreshes the data every few hours. While there are other solutions, such as streaming replication or logical backup and restore, this post presents an alternative.

Advantages

  1. Improved security by giving developers a separate copy of production data
  2. Time saved by automation, no more waiting for tickets
  3. No production database downtime required
  4. Faster bug fixes and development cycles

Deployment Layout

The above deployment diagram shows a typical setup, which is laid out in the following way:

  1. Production database VMs are in a separate prod project.
  2. Prod and non-prod may or may not have separate VPC networks.
  3. The production PostgreSQL data disk is periodically snapshotted.
  4. In the non-prod environment, a Managed Instance Group (MIG) is created, and new data rollouts happen via the rolling update feature.

Example Implementation

For simplicity, I will be using:

  1. A single project containing both the source database (assume it is the prod db) and the replica (for non-production use).
  2. A PostgreSQL database VM with only one disk for boot, data and logs. In a real-world scenario it is better to replicate only the data disk to non-prod, so that database users, logs and configuration files (pg_hba.conf, postgresql.conf etc.) are not brought into the non-production environment.
  3. A Managed Instance Group to do a rolling update with each new database snapshot.
  4. A load balancer for the Managed Instance Group created in the above step.
  5. An orchestrator VM with a scheduled cron job to trigger a pipeline, which executes the following sequence of commands:
    a. Create a source disk snapshot.
    b. Create a new disk image from the snapshot.
    c. Create a new instance template with the new disk image.
    d. Do a rolling update on the MIG with the new template.

Let’s get started

If you have not used Google Cloud, you can head on over to https://console.cloud.google.com and register for a free account starting with $300 in credit.

Setup Source Database and Managed Instance Group

1. Open Cloud Shell.

2. Create a firewall rule for the Postgres instance and create a new VM instance.

Create Firewall (optional with the default VPC network)
Execute the command below to create a firewall rule that allows incoming connections on port 5432. Note that the 0.0.0.0/0 source range opens the port to all IPs; restrict it to your own networks in a real deployment.

gcloud compute firewall-rules create pg-firewall-open --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp:5432 --source-ranges=0.0.0.0/0

Create VM
The VM instance can be created either through the UI or with the gcloud command below, executed from Cloud Shell.

gcloud compute instances create postgres-prod --zone=us-west1-b --machine-type=n1-standard-1 --subnet=default  --scopes=https://www.googleapis.com/auth/cloud-platform --image-family=ubuntu-1604-lts --image-project=ubuntu-os-cloud --boot-disk-size=100GB --boot-disk-type=pd-standard --boot-disk-device-name=postgres-prod

3. SSH into the VM, then install and configure the PostgreSQL database.

Install PostgreSQL

gcloud compute ssh postgres-prod --zone=us-west1-b
sudo apt update
sudo apt install postgresql postgresql-contrib

Configure PostgreSQL to allow remote connection

By default, PostgreSQL is configured to listen only on “localhost”.
To locate postgresql.conf, I usually do a search:

$ sudo find / -name "postgresql.conf"
/etc/postgresql/9.5/main/postgresql.conf

Open the postgresql.conf file and replace the line

listen_addresses = 'localhost'

with

listen_addresses = '*'

Configuring pg_hba.conf

Similarly, find and open pg_hba.conf and add the following entries at the very end.

host    all             all             0.0.0.0/0               md5
host    all             all             ::/0                    md5
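Note that these changes only take effect after PostgreSQL is restarted (a reload is not enough for listen_addresses). On the Ubuntu image used here, a minimal sketch:

```shell
# Restart PostgreSQL so the listen_addresses and pg_hba.conf changes take effect
sudo systemctl restart postgresql
```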

4. Create a sample database.

sudo -u postgres psql

Once you are in the psql shell, create a new database.

create database prod_test_1;

Exit psql shell

\q

Exit postgres-prod VM

exit

5. Create a disk snapshot.

Name the snapshot postgres-prod-initial-snapshot.
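If you prefer the command line, the same snapshot can be created with gcloud, mirroring the manual command used later in this post:

```shell
# Snapshot the prod data disk (here the boot disk, since this
# walkthrough uses a single disk for boot, data and logs)
gcloud compute disks snapshot postgres-prod \
    --snapshot-names=postgres-prod-initial-snapshot \
    --zone=us-west1-b
```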

6. Create a disk image from the snapshot.
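The gcloud equivalent, assuming the image is named postgres-prod-initial-image (the name is my choice; the --family flag matches the one used in the automation script later):

```shell
# Turn the snapshot into a reusable disk image
gcloud compute images create postgres-prod-initial-image \
    --family=postgres-prod \
    --source-snapshot=postgres-prod-initial-snapshot
```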

7. Create an instance template named postgres-prd-replica-initial-template.
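This too can be done with gcloud, mirroring the manual template command used later; the image name postgres-prod-initial-image is assumed from the previous step:

```shell
# Build an instance template whose boot disk comes from the prod image
gcloud compute instance-templates create postgres-prd-replica-initial-template \
    --machine-type=n1-standard-1 \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --image=postgres-prod-initial-image \
    --boot-disk-size=100GB \
    --boot-disk-type=pd-standard
```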

8. Create a managed instance group named postgres-prd-replica-mig using the above template. Compute Engine -> Instance Groups -> Create Instance Group.
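A gcloud sketch of the same step; a single-replica zonal MIG is assumed here:

```shell
# Create a zonal MIG with one replica from the initial template
gcloud compute instance-groups managed create postgres-prd-replica-mig \
    --zone=us-west1-b \
    --template=postgres-prd-replica-initial-template \
    --size=1
```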

9. Now we shall create a load balancer, so that it provides a single non-changing IP which can be used to connect to the database. Network Services -> Load Balancing -> Create Load Balancer -> TCP Load Balancing -> Start Configuration

Here we will configure the load balancer. Name it postgres-prod-replica-lb and configure it as shown in the screenshot below.

Configure Health Check as below.

Configure front end as below and click Create button.

This should create a load balancer with an internal IP which developers/applications can use to connect.

You will also need a firewall rule that allows Google's health-check probes to reach the instances.

gcloud compute firewall-rules create allow-hc --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp --source-ranges=130.211.0.0/22,35.191.0.0/16

Users can connect to this load balancer to access the prod replica database.
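For example, a developer could point psql at the load balancer (the IP below is a placeholder; use the internal IP shown in the console for your load balancer):

```shell
# Connect to the replica through the load balancer's non-changing IP
psql -h LB_INTERNAL_IP -p 5432 -U postgres -d prod_test_1
```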

Manually updating database snapshot replica

To refresh the data, all we need to do is repeat the chain below:
disk snapshot -> image -> instance template -> rolling update

  1. Create disk snapshot (using UI as before or gcloud as below)
gcloud compute disks snapshot postgres-prod --snapshot-names=postgres-prod-manual1-snapshot --zone=us-west1-b

2. Create disk image (using UI as before or gcloud as below)

gcloud compute images create postgres-prod-manual1-image --family=postgres-prod --source-snapshot=postgres-prod-manual1-snapshot

3. Create instance template (using UI as before or gcloud as below)

gcloud compute instance-templates create postgres-prd-replica-manual1-template --machine-type=n1-standard-1 --scopes=https://www.googleapis.com/auth/cloud-platform --image=postgres-prod-manual1-image --boot-disk-size=100GB --boot-disk-type=pd-standard

4. Rolling update to managed instance group

Go to Compute Engine -> Instance Groups -> postgres-prd-replica-mig -> Rolling Update

Then select the new template, set the maximum unavailable to 0 and click Update. This creates a new replica instance first and then removes the old one.

Alternatively this can be achieved with following gcloud command.

gcloud beta compute instance-groups managed rolling-action start-update postgres-prd-replica-mig --version template=postgres-prd-replica-manual1-template --max-unavailable 0 --zone us-west1-b

Orchestrating periodic automatic updates

This can easily be done with any of various third-party deployment tools (such as Jenkins). However, here I will simply create a shell script and schedule it via crontab.

I will generate a UUID (using the uuidgen command) to give every resource a unique name. I have also shortened the resource names, as they were exceeding GCP character limits.
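As a quick sanity check on naming, the sketch below builds the same resource names the script will use and verifies each stays within GCP's 63-character resource-name limit (the hard-coded fallback UUID is only for environments without uuidgen):

```shell
# Build the resource names used by the update script and check their length.
rnd=$(uuidgen 2>/dev/null || echo "123e4567-e89b-12d3-a456-426614174000")
snap="pg-prd-auto-$rnd-snap"
image="pg-prd-auto-$rnd-image"
template="pg-prd-auto-$rnd-template"

# GCP resource names must be at most 63 characters.
ok=yes
for name in "$snap" "$image" "$template"; do
  [ "${#name}" -le 63 ] || ok=no
done
echo "name length check: $ok"
```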

Create a VM named prd-replica-updater:

gcloud compute instances create prd-replica-updater --zone=us-west1-b --machine-type=n1-standard-1 --subnet=default  --scopes=https://www.googleapis.com/auth/cloud-platform --image-family=ubuntu-1604-lts --image-project=ubuntu-os-cloud --boot-disk-size=10GB --boot-disk-type=pd-standard --boot-disk-device-name=prd-replica-updater

SSH into the VM and create a shell script named prd-replica-update.sh as below.

#!/bin/bash
# Refresh the prod replica: snapshot -> image -> instance template -> rolling update.
# Stop on the first failed command so a broken snapshot never gets rolled out.
set -e

# uuid string used to give each run's resources a unique name
rnd=$(uuidgen)
# create snapshot
gcloud compute disks snapshot postgres-prod --snapshot-names=pg-prd-auto-$rnd-snap --zone=us-west1-b
# create image
gcloud compute images create pg-prd-auto-$rnd-image --family=postgres-prod --source-snapshot=pg-prd-auto-$rnd-snap
# create instance template
gcloud compute instance-templates create pg-prd-auto-$rnd-template --machine-type=n1-standard-1 --scopes=https://www.googleapis.com/auth/cloud-platform --image=pg-prd-auto-$rnd-image --boot-disk-size=100GB --boot-disk-type=pd-standard
# do rolling update
gcloud beta compute instance-groups managed rolling-action start-update postgres-prd-replica-mig --version template=pg-prd-auto-$rnd-template --max-unavailable 0 --zone us-west1-b

The above script can be scheduled using crontab, thus automating the database refreshes.
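For example, to refresh the replica every 4 hours, a crontab entry along these lines could be used (the schedule, script path and log path here are my assumptions, not prescriptions):

```shell
# Edit the crontab of the user that owns the script
crontab -e
# Then add a line like the following:
# 0 */4 * * * /home/ubuntu/prd-replica-update.sh >> /home/ubuntu/prd-replica-update.log 2>&1
```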

Shashank Agarwal

Database Migrations Engineer, Google Cloud