Jack Senechal
Oct 10, 2018 · 4 min read

How we migrated our Postgres database to Google Cloud SQL with minimal downtime.

In a recent post we described the migration of Mavenlink’s integrations platform from Heroku to Kubernetes running on Google Kubernetes Engine. Part of that process involved migrating 35G of data from Heroku’s hosted Postgres service to Google’s Cloud SQL service.

We were able to complete the majority of the migration of our production app to Kubernetes without tangible downtime. We did this by spinning up new worker and web pods on the Kubernetes cluster and pointing them to the original database. We could then point the DNS to the Google Cloud load balancer, and process incoming traffic seamlessly on either system while that propagated.

Finding the Right Approach

Option 1: Replication with Seamless Cutover

Our first thought for migrating the Postgres database was replication. In Heroku we had a primary database and several followers, giving us both redundancy and the ability to keep the app up while the infrastructure was rotated or upgraded. In theory we could create a replica in the destination system following the primary in the source system, and then switch our app to use the replica as a primary with no interruption. Unfortunately, Cloud SQL doesn’t support replicating from external databases so we had to table that option.

Option 2: Dump, Transfer, and Import

Next, we shifted our focus to transferring the data quickly so that the downtime required would be minimal. We considered a few options. Google has some helpful documentation for exporting and importing data to Cloud SQL, including a particularly useful command to format a dump correctly for use with gcloud sql import.

pg_dump -U [USERNAME] --format=plain --no-owner --no-acl [DATABASE_NAME] | sed -E 's/(DROP|CREATE|COMMENT ON) EXTENSION/-- \1 EXTENSION/g' > [SQL_FILE].sql

The rub there for a production migration is that we’re still stuck with a three-step operation, each with its own transfer.

  1. First, we would export the data using pg_dump. This could be optimized by running the export as close to the database as possible, like in a Heroku dyno in our case.
  2. Then we have to transfer the dump to Google Cloud Storage, which is the only storage device from which Cloud SQL can import. If you controlled the source system you could use GCS FUSE to mount your GCS bucket into the filesystem, export directly there, and eliminate this step.
  3. Finally, we would run the import. This is presumably quite optimized because Google owns both the storage bucket and the SQL server. There may be some delay here before the import is kicked off as well. They don’t specify import performance in their SLA, but there is a ‘pending’ status. This is probably minimal, but in a production migration we want to control as many variables as possible.

This process would have worked, but the multiple steps provide more points of failure, and redundant transfers.

Option 3: Streaming Dump and Import

The third option, and the one we chose to employ, was to do a streaming export / import by piping pg_dump into pg_restore. We actually used the Heroku CLI's pg:pull command, which is a nice wrapper around pg_dump, essentially boiling down to:

pg_dump --verbose --format=custom --compress=0 | pg_restore --verbose --no-acl --no-owner

This approach does the transfer in one step, using Postgres’ custom archive format. Compression is turned off because with plenty of bandwidth on the source and target systems, and the intermediary machine running the command, the overhead of compression would become the bottleneck.

We wanted to run this command on a high availability machine in the GKE cluster, keeping it as close to the database instance as possible. This makes the longest leg of our trip (and likely the main bottleneck to speed) the one-time transfer over the wire between the source database and the intermediary machine.

This achieved a transfer of our production data set, 35G, in around 50 minutes, or about 12MBps.

Dialing In the Details

To automate the process as much as possible we prepared a docker image based on ubuntu:18.04 with the psql command and heroku CLI.

FROM ubuntu:18.04
RUN apt update
RUN apt install -y curl postgresql-client gnupg tmux screen
RUN curl https://cli-assets.heroku.com/install-ubuntu.sh | sh
COPY db-migrator.sh /usr/local/bin/db-migrator

We injected the database credentials into the container as environment variables by setting them as Kubernetes secrets, and then created a simple bash script to use those secrets with heroku pg:pull.

#!/bin/bashheroku pg:pull $SOURCE_DB postgres://$PG_USERNAME:$PG_PASSWORD@$PG_HOST/$PG_DBNAME -a $SOURCE_APP

This allowed us to set up everything and be ready to go. When doing a mission-critical operation it’s a good idea to keep it simple and remove as many points of human error as possible. This way, when the time came to perform the actual migration, we could just kubectl exec into the container running on the cluster, and type db-migrator.

Mavenlink Product Development

Musings from the Engineering and Product Design team at Mavenlink

Jack Senechal

Written by

Mavenlink Product Development

Musings from the Engineering and Product Design team at Mavenlink

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade