Merging PostgreSQL and RethinkDB with GraphQL

Published in

Collaborizm Blog

3 min readOct 24, 2017

The following example can be downloaded and run from The GraphQL-Merge GitHub repo.

At Collaborizm we’re happy RethinkDB users. But when it came time to build out our financial processing systems, we needed a database with transactions and aggregate function support. If you couldn’t tell from the title, we went with Postgres!

Migrating our app from RethinkDB to Postgres was not a viable option as our RethinkDB dependent code base is extensive and working well. So we decided to merge.

Slonik contemplating cross database merging strategies

The challenge

How do we connect the field user_id in Postgres to our Users table in RethinkDB?

Potential merging strategies

Syncing data

When every time a user is created or updated in RethinkDB we could perform the equivalent update to a users table in Postgres. We do this currently for our Algolia integration. The downside of this solution is the need to monitor that the sync is in place and fix the edge cases where something goes haywire.

Foreign data wrappers

FDWs implement the SQL/MED specification to bridge external data source. There is a foreign data wrapper for RethinkDB, which would let us treat our RethinkDB users table as a users table in Postgres. Unfortunately this solution has memory and performance implications.

API layer

Using a standard REST API, this could get tricky, but luckily we use GraphQL.

Merging data using GraphQL

For this example our data model is as follows:

Finance table — Postgres

idamountuser_idmessage

Users table — RethinkDB

idname… many more fields

We want to be able to select a list of financial transactions from Postgres and join our user data from RethinkDB.

GraphQL is a very explicit protocol, your client specifies which data it requires and your API describes what is returned.

Our GraphQL query is as follows:

{
  payments {
    id
    message
    amount
    user {
      id
      name
    }
  }
}

Our query asks for a list of payments and the associated user who made the payment.

Where the magic happens

GraphQL requires that you provide a schema for data being sent to the client. The schema can include how objects nested in records should be resolved via resolve functions.

const Payment = new GraphQLObjectType({
  name: 'Payment',
  fields: {
    id: { type: GraphQLID },
    user_id: { type: GraphQLString },
    message: { type: GraphQLString },
    amount: { type: GraphQLFloat },
    user: {
      type: User,
      resolve: ({ user_id }, _, ctx) => ctx.userLoader.load(user_id)
    }
  },
  date_created: { type: GraphQLString }
})

The resolve function for the Payment object uses the DataLoader library from Facebook. The library makes it very easy for us to batch operations. Every time there’s a row with a user_id, it defers the fetching of the users’ objects until all the users are ready to be fetched at once.

Complete code example

This code shows how to setup GraphQL with Express and and hook up the DataLoader library using a context. For brevity it uses mock data in object literals.

Merging RethinkDB and Postgres with GraphQL

Conclusion

Building large scale productions systems is going to be messy no mater what. I highly recommend tools like GraphQL which make standardizing interfaces to data easy and predictable.

If you enjoy talking tech and hacking, checkout Collaborizm; where you can work on projects with our community for fun and profit.