Migrating Data From MongoDB to PostgreSQL with No Downtime

John Gerhardt
Published in Compass True North · Oct 10, 2017 · 4 min read

Why Migrate?

In the early days of Contactually, before the product was fully fleshed out and requirements were clear, MongoDB was chosen to house certain portions of our application’s data.

MongoDB has many valid use cases, but ours did not fit, and a change needed to be made. In this case, that meant migrating data from our MongoDB cluster into our relational datastore, PostgreSQL. While the migration is easy to understand conceptually, our transition would have failed without the three-step deployment plan described below.

Don't Disrupt Regular Product Usage

As we discussed the primary goals for the migration, our number one priority was to make sure that users would suffer neither downtime nor degradation of the product.

Additionally, we wanted to make sure that no changes made during the migration would be missed. Over a few whiteboarding sessions, we came up with a strategy that accomplished both goals.

High-Level Strategy — Duplicate Writes with a Series of Deployments

So — how did we do it? Our high-level strategy was to write to both systems simultaneously, then use a series of deployments to eventually start reading from the new system.

Create New PostgreSQL Schema

The first step was to examine the schema (or, more accurately, the most recent shape of the documents) behind the Mongoid class we intended to migrate, and to design an appropriate PostgreSQL table schema. In some cases this was straightforward: we were on PostgreSQL v9.6.x, so JSON columns were an option when string, integer, datetime, or text columns wouldn't suffice.
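For illustration, here is a minimal sketch of what the original Mongoid class might have looked like; the exact fields are assumptions, chosen to mirror the PostgreSQL table shown further below:

class DataSource
  include Mongoid::Document
  include Mongoid::Timestamps

  # A free-form Hash field like raw_response is the natural
  # candidate for a PostgreSQL json column
  field :remote_id, type: String
  field :status, type: Integer
  field :status_set_at, type: DateTime
  field :raw_response, type: Hash
end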

In many cases, this was also a chance for us to clean up the older schema, removing data or attributes that were no longer needed. Projects like these are a great chance to pay down some technical debt while you’re under the hood.

Deployment 1: Write to Both Systems, Read from Mongo

In our initial deployment, we wrote the new ActiveRecord class, a migration to create new tables, and a migration job to write to both databases. In order to make sure we didn’t miss any new data, we added callbacks to the Mongoid class that would enqueue the migration job.

One important point here: the migration job needs to be both performant and idempotent. That is a best practice for any background job, but it is especially important for a migration. We achieved this using the activerecord-import gem's bulk import, which uses PostgreSQL's UPSERT syntax (INSERT ... ON CONFLICT).

class CreateDataSources < ActiveRecord::Migration
  def up
    create_table :data_sources do |t|
      t.string :remote_id
      t.integer :status
      t.datetime :status_set_at
      t.json :raw_response
      t.string :mongo_data_source_id
      t.timestamps
    end

    add_index :data_sources, :remote_id
    # Unique so the ON CONFLICT (mongo_data_source_id) upsert has a
    # conflict target to work against
    add_index :data_sources, :mongo_data_source_id, unique: true
  end

  def down
    drop_table :data_sources
  end
end

class NewDataSource < ApplicationRecord
  self.table_name = "data_sources"

  validates :remote_id, presence: true
end

class MigrateDataSource
  include Sidekiq::Worker
  sidekiq_options queue: :one_off, retry: 3

  def perform(mongo_data_source_ids)
    # Find the Mongoid records to migrate (accepts a single id or a batch)
    mongo_records = DataSource.where(:_id.in => Array(mongo_data_source_ids))

    # Build the ActiveRecord objects in memory (field names are illustrative)
    rows = mongo_records.map do |r|
      NewDataSource.new(remote_id: r.remote_id, status: r.status,
                        status_set_at: r.status_set_at, raw_response: r.raw_response,
                        mongo_data_source_id: r.id.to_s)
    end

    # Bulk upsert via activerecord-import; ON CONFLICT keeps the job idempotent
    NewDataSource.import(rows, validate: false, on_duplicate_key_update: {
      conflict_target: [:mongo_data_source_id],
      columns: [:remote_id, :status, :status_set_at, :raw_response] })
  end
end

class DataSource
  include Mongoid::Document
  include Mongoid::Timestamps

  # Keep the new table in sync with any writes that happen mid-migration
  after_save :enqueue_migration

  def enqueue_migration
    Sidekiq::Client.enqueue(MigrateDataSource, id.to_s)
  end
end

Deployment 2: Read/Write from New ActiveRecord Class

After this initial deployment, we enqueued around 150,000 jobs to migrate all existing records in batches of 1,000. We knew this might take 12–36 hours depending on the size of the collection, the load on each database, and the resources allocated to each. We used this time to write the new code that would read from and write to the new ActiveRecord class.
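The backfill itself was a simple one-off script. A minimal sketch, assuming the batch sizes above (the pluck and each_slice usage here are illustrative, not the exact code we ran):

# One-off backfill: enqueue one migration job per 1,000 Mongo ids
DataSource.all.pluck(:_id).each_slice(1_000) do |batch|
  Sidekiq::Client.enqueue(MigrateDataSource, batch.map(&:to_s))
end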

Once the migration was complete, we waited for an off-peak window and deployed the new code. One important detail: we left the old collection and class in place in case we caught any issues and quickly needed access to the old data. Naturally, you want to avoid a disaster-recovery scenario that forces a rollback. We avoided any disasters with a solid plan, thorough pull-request reviews, dry runs in a staging environment, and triple-checking every move before we made it.
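The read-path change itself can be small. A hedged sketch of what a call site might look like after the cutover (the data_source_for helper is hypothetical):

# Before this deployment, reads hit Mongo: DataSource.find_by(remote_id: ...)
# After it, reads come from PostgreSQL, while the dual-write callback
# keeps both stores in sync and Mongo remains a safety net
def data_source_for(remote_id)
  NewDataSource.find_by(remote_id: remote_id)
end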

Having tools in place to monitor the performance and error rates of both the migration code and the new code is critical to determining whether the migration succeeded. At Compass, we use Datadog to monitor application error rates.
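As an illustration of the kind of monitoring that helps here, a minimal sketch using the dogstatsd-ruby client; the metric names and the rescue-and-reraise pattern are assumptions, not our actual setup:

require "datadog/statsd"

STATSD = Datadog::Statsd.new("localhost", 8125)

class MigrateDataSource
  include Sidekiq::Worker

  def perform(mongo_data_source_ids)
    STATSD.increment("migration.data_sources.attempted")
    # ... migration work from the job shown earlier ...
  rescue StandardError => e
    # Count failures so a dashboard or alert catches regressions early
    STATSD.increment("migration.data_sources.failed")
    raise e
  end
end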

Deployment 3: Remove Old Mongoid Class and Mongo Collection

The last deployment is arguably the most fun step of the migration. Once the dust has settled, performance looks good, there are no application issues, and error rates are reasonable, it's time to clean up.

We backed up every collection before dropping it, then removed all references to OldDataSource throughout the application, along with the migration code itself.

In Summary — Plan Thoroughly!


In some ways, these projects can feel like moving cargo between two ships with neither one slowing down. User interaction with the application must remain unaffected throughout the migration, which can be quite difficult if the object being migrated is read or written with high frequency.

Large-scale data migrations can be challenging, but with proper and thorough planning, they can be accomplished without any users or other stakeholders noticing. That’s the #1 sign of a well-executed migration.

Compass is looking for experienced software engineers who are passionate about solving complex problems with code. We've taken a novel approach to building business software (focus on the end user) and it's been working! Our users love us. Come help us build a product that makes the real estate process easy and rescue tens of thousands of people from the jaws of clunky, outdated software.
