Scheduled Postgres to Redshift copy with a simple Heroku App

Zeneixe
Carwow Product, Design & Engineering
2 min readAug 8, 2016

Previously I told you why we use Redshift at carwow for faster business intelligence. I’m now going to show you how to achieve that with little effort.

Our solution runs a very simple Rake Task every night using Heroku’s Scheduler.

If you need something more elaborate you can try (paid) services like XPlenty and Flydata. For us it didn’t work out because we figured the maintenance required was higher than the time saved writing our own very simple application.

You will need

  • A verified Heroku account
  • An Amazon S3 bucket
  • A Postgres database
  • A Redshift database

Let’s go

First of all create a new folder on your terminal:

mkdir sample_heroku_to_redshift

Inside the new folder create a Gemfile with this content (NB: right now the latest version of postgres_to_redshift is 0.1.2):

source 'https://rubygems.org'ruby "2.3.1"gem 'rake'
gem 'postgres_to_redshift'

You’ll also need to create a simple Rakefile:

require 'postgres_to_redshift'desc "Imports all tables to Redshift"
task :import do
PostgresToRedshift.update_tables
end

Now let’s create and deploy a new Heroku app:

$ git init
$ heroku apps:create sample-heroku-to-redshift
$ git add .
$ git commit -am "First Commit"
$ git push heroku master

Now we’ll have to configure the App:

$ heroku config:set POSTGRES_TO_REDSHIFT_SOURCE_URI='postgres://username:password@host:port/database-name' \
POSTGRES_TO_REDSHIFT_TARGET_URI='postgres://username:password@host:port/database-name' \
S3_DATABASE_EXPORT_ID='yourid' \
S3_DATABASE_EXPORT_KEY='yourkey' \
S3_DATABASE_EXPORT_BUCKET='some-bucket-to-use'

Remember SOURCE_URI is your Postgres DB, TARGET_URI is for your Redshift DB.

Now you can test run the Postgres to Redshift copy with this command:

$ heroku run rake import

Obviously we don’t want to do this by hand so we can configure Heroku’s Scheduler to run it for us. First create the add-on:

$ heroku addons:create scheduler:standard

Then configure it via web to run this command at your chosen interval (I run mine nightly):

rake import

Easy peasy lemon squeezy!

Final notes

We’ve actually got our own fork of the postgres_to_redshift gem, because we needed to copy from different Postgres databases to the same Redshift database with named target schemas. Our gem is undocumented and available only on this git repository. If I get enough requests I’ll add some documentation and examples.

Interested in making an Impact? Join the carwow-team!
Feeling social? Connect with us on Twitter and LinkedIn :-)

--

--