Sync Gateway: Scalable, not Modular

Production Server for PouchDB & Sync Gateway

The second half of December and the greater part of January were spent overhauling the architecture of the company’s web service to use PouchDB and Sync Gateway, with the goal of a better offline-first PWA experience. Among many other things, the process involved building a Python class to automate server-side view registration and user creation and to pass access credentials to the client upon user authentication. The client also needed a JavaScript class to handle a sign-out sequence that flushes the local db properly while guarding against every event that might cause data loss.
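As a rough illustration of the user-creation step, the sketch below builds (but does not send) a request against Sync Gateway’s admin REST API. The class name `GatewayAdmin`, the database name `appdb` and the channel name are assumptions made up for this example and are not the class described above; the `/{db}/_user/{name}` path and port 4985 are Sync Gateway’s admin user endpoint and default admin port.

```python
import json
import urllib.request
from urllib.parse import quote


class GatewayAdmin:
    """Hypothetical helper; every name here is an illustrative assumption."""

    def __init__(self, db, admin_url="http://localhost:4985"):
        self.db = db
        self.admin_url = admin_url.rstrip("/")

    def user_request(self, name, password, channels=()):
        """Build a PUT request that creates or updates a Sync Gateway user."""
        body = {"password": password, "admin_channels": list(channels)}
        return urllib.request.Request(
            url=f"{self.admin_url}/{self.db}/_user/{quote(name, safe='')}",
            data=json.dumps(body).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="PUT",
        )


admin = GatewayAdmin("appdb")
request = admin.user_request("alice", "s3cret", channels=["user-alice"])
# urllib.request.urlopen(request) would send it, but only from a host that
# can reach the admin port.
```

Keeping request construction separate from sending makes the class easy to test without a running Sync Gateway.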

The Problem

However, this morning I found myself melancholy about the decision to use Sync Gateway after it dawned on me how many issues exist with its deployment. The main problem is making Sync Gateway accessible to servers that are not running on the same instance or in the same VPC. The admin port, through which users, buckets and views are created, offers no authentication protocol, so Sync Gateway is not intended to be used outside its firewall. See issue on StackOverflow. At the bare minimum, Sync Gateway needs to run next to a server that handles admin authentication if it is going to be used by any external service.
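A minimal sketch of the check such a companion server might perform before forwarding anything to the admin port is shown below. The bearer-token scheme and the function names `is_authorized` and `route_admin_request` are assumptions for illustration; Sync Gateway itself provides none of this.

```python
import hmac

ADMIN_UPSTREAM = "http://127.0.0.1:4985"  # admin port, reachable only locally


def is_authorized(headers, expected_token):
    """Return True when the Authorization header carries the expected bearer token."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        return False
    # compare_digest avoids leaking the token through timing differences
    return hmac.compare_digest(token.encode(), expected_token.encode())


def route_admin_request(headers, path, expected_token):
    """Return the upstream admin URL for an authorized request, else None."""
    if not is_authorized(headers, expected_token):
        return None
    return ADMIN_UPSTREAM + "/" + path.lstrip("/")
```

A real deployment would also need TLS and some form of token rotation, which this sketch leaves out.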

But as a standalone service that is essentially just a document store, Sync Gateway responds too slowly to handle real-time data. A performance benchmark timed a single request to Sync Gateway at roughly 50ms. So, to get real-time responsiveness, requests would need to be made directly to Sync Gateway’s production server: Couchbase. But the official Python driver for Couchbase apparently doesn’t support Python 3.5+, and libcouchbase will not compile on Ubuntu Xenial+. Couchbase also hogs a lot of system resources, and Couchbase, Inc. removed its Community Edition AMI from AWS. As a result, there is neither a convenient nor an affordable way to run Sync Gateway as a remote service for an app that might be running on a different platform, unless I wish to build that service myself.

Judging by the timestamps of the open GitHub issues that address these problems, as well as the commit history of PouchDB, Sync Gateway and Couchbase, it looks as if the developers behind the Couchbase ecosystem simply stopped working on it at the beginning of 2016.

The Sync Function

I have also been having misgivings about how the Sync Function is used in Sync Gateway. The Sync Function is essentially the only way to ensure that users have private documents and that malicious users cannot dump documents of unusual size or strange structure into Sync Gateway. But the process of registering a Sync Function to prevent this is neither automated nor friendly when dealing with complex documents. There is also a strange bug in how previously registered Sync Functions are reported, which requires some nasty regex kung-fu on the part of any program that seeks to update a Sync Function.

So even though I was able to automate the addition of unique user roles for Sync Functions in the Python class, there has not yet been enough time to build the recursive parser required to add json-schema style validation to documents. In the long term, its absence only increases the likelihood of bugs and CI snags. In the short term, it means that I’m simply not validating documents thoroughly.
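To give a sense of the recursive walk such a parser needs, here is a minimal sketch in Python. It handles only a small, assumed subset of json-schema (`type`, `required`, `properties`, `items`) and is not the parser discussed above; for use inside a Sync Function, the same walk would have to be ported to JavaScript or used to generate it.

```python
def validate(doc, schema, path="doc"):
    """Return a list of error strings; an empty list means the document passed."""
    types = {"object": dict, "array": list, "string": str,
             "number": (int, float), "boolean": bool}
    expected = schema.get("type")
    if expected:
        ok = isinstance(doc, types[expected])
        # bool is a subclass of int in Python, so reject it for "number"
        if expected == "number" and isinstance(doc, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(doc, dict):
        for key in schema.get("required", []):
            if key not in doc:
                errors.append(f"{path}.{key}: missing required field")
        # Recurse into each declared property that is present
        for key, sub in schema.get("properties", {}).items():
            if key in doc:
                errors.extend(validate(doc[key], sub, f"{path}.{key}"))
    elif isinstance(doc, list) and "items" in schema:
        # Recurse into every element of the array
        for i, item in enumerate(doc):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


note_schema = {
    "type": "object",
    "required": ["title", "tags"],
    "properties": {
        "title": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
}

print(validate({"title": "note", "tags": ["a", 3]}, note_schema))
# → ['doc.tags[1]: expected string']
```

Returning a list of errors rather than raising on the first one makes it easier to report everything that is wrong with a document in one pass.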

Note to Hackers

Not coincidentally, I also have some ideas about how one could exploit the setup complexity of any app that’s running PouchDB and/or Sync Gateway in production.

  1. Make an unauthenticated request to PouchDB’s backend endpoint in case Sync Gateway is not being used.
  2. Try to access data that isn’t removed on the client between user sessions.
  3. Use a Sync Gateway account to store large third-party files.
  4. Send poorly formatted documents to Sync Gateway to try to crash any server side program that reads it.
  5. Try to connect to Sync Gateway’s endpoint on port 4985.

RemoteStorage.io

But really, the problem with the Couchbase ecosystem lies in its decreasing usefulness compared to the expense of dealing with it. Given Sync Gateway’s slow response time and its lack of a true SQL-style upsert, allowing the server to update documents in it would introduce race conditions and possible data loss. So the PouchDB/Sync Gateway system is really just acting as a repository for files that are updated by the client, and therefore isn’t much different from a file store. If that is the case, then RemoteStorage offers an alternative that places the burden of paying for that file storage onto the party that creates it, and it also has a robust json-schema style document validation system.
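The race in question is the classic lost update: the server and a client both read a document, and whichever writes last silently overwrites the other. The sketch below uses an in-memory stand-in (the hypothetical `RevStore`, not Sync Gateway) to show the revision check and the re-read-and-retry loop a server-side writer would need in the absence of an atomic upsert.

```python
class Conflict(Exception):
    """Raised when a write carries a stale revision."""


class RevStore:
    """In-memory stand-in for a revisioned document store (an assumption)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (rev, body)

    def get(self, doc_id):
        return self._docs.get(doc_id, (0, {}))

    def put(self, doc_id, body, rev):
        current_rev, _ = self.get(doc_id)
        if rev != current_rev:
            raise Conflict(doc_id)
        self._docs[doc_id] = (current_rev + 1, body)
        return current_rev + 1


def update_with_retry(store, doc_id, change, attempts=3):
    """Re-read and re-apply `change` until the write lands or attempts run out."""
    for _ in range(attempts):
        rev, body = store.get(doc_id)
        try:
            return store.put(doc_id, change(dict(body)), rev)
        except Conflict:
            continue
    raise Conflict(doc_id)


store = RevStore()
store.put("note", {"count": 1}, rev=0)
stale_rev, stale_body = store.get("note")  # a client reads the document
update_with_retry(store, "note", lambda d: {**d, "server": True})  # server writes
try:
    store.put("note", {"count": 2}, stale_rev)  # client writes with a stale revision
    client_write_landed = True
except Conflict:
    client_write_landed = False
```

Every such retry costs at least one more round trip, which is why the slow response time makes server-side updates unattractive.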

Synopsis

In the end, Couchbase might be scalable, but it isn’t very modular. From a DevOps perspective, it is nothing like Firebase, or even AWS’s provisioned resources such as RDS, DynamoDB or S3. It forces all services that use it to run on the same vendor, and it will likely require additional employees just to maintain it. So there needs to be a compelling offline-first experience to justify its only real purpose: cloud-based user data backup.