Why Postgres RDS didn’t work for us (and why it won’t work for you if you’re implementing a big data solution)

Max Kremer
4 min read · Jan 15, 2024

--

Background

We were building a startup that measured marketing effectiveness via a statistical attribution model. For this to work we had to collect a ton of online user behavior: every click, every field input, every action. Data was stored in Postgres, and end users, via our UI, ran dashboards and reports that crunched these (sometimes very large) data sets. It’s important to note that we were building a multi-tenant SaaS application, so data was being ingested and stored per tenant across hundreds of tenants. It needed to be scalable and performant. We started with AWS RDS managed PostgreSQL.

The big problem with EBS: throughput opacity

It’s almost impossible to accurately measure throughput on EBS. The metrics AWS gives you are latency and IOPS; neither is a measure of throughput. When you’re storing a large time-series table (say 20 million rows, 10GB on disk) and you want to scan it (for whatever reason, say a sequential scan), you need an accurate measure of throughput to understand how long reading that data from disk will take.
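
The arithmetic you want to do is trivial: divide the table size by sustained read throughput. The problem is that one of those two inputs is hidden. A quick sketch, with made-up throughput figures:

```bash
# Back-of-the-envelope scan time for a 10 GB table (throughput figures are made up):
TABLE_MB=10240
echo "at 250 MB/s: $((TABLE_MB / 250)) s"   # ~40 s if the volume sustains SSD-like throughput
echo "at 60 MB/s:  $((TABLE_MB / 60)) s"    # ~170 s if reads are effectively throttled
# Without a real throughput number from AWS, you can't tell which of these to plan for.
```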

Bursting and throttled IOPS — the slippery slope

Queries like the one above quickly burn through I/O burst credits. We want more IOPS, so what do we do? Naturally, we increase the storage volume so we can get a higher baseline of 10,000 IOPS instead of the measly 3,000 you get with smaller volumes. But then backup costs increase, because you’re paying for the entire provisioned volume, not just what’s used. Costs go up, performance doesn’t get much better, and it snowballs from there.
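
For context, gp2 volumes earn a baseline of roughly 3 IOPS per GiB (smaller volumes can only burst to 3,000), so the math behind "just add storage" looks something like this (verify the exact limits against current AWS docs):

```bash
# gp2 baseline is ~3 IOPS per GiB of provisioned storage, so a 10,000 IOPS baseline needs:
TARGET_IOPS=10000
echo "$((TARGET_IOPS / 3)) GiB"   # ~3,334 GiB provisioned just to buy IOPS headroom
# ...and, as noted above, backup costs then track that whole provisioned volume.
```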

We also kept upping our instance size to get more memory for caching; it’s easy to make the DB fast if everything fits in memory. There was also the promise of greater parallelization that came with PostgreSQL 10, but sadly that brought us very little benefit due to the I/O throttling.
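
For reference, these are the parallel-query knobs involved (the values are illustrative, and on RDS you would set them through a parameter group rather than ALTER SYSTEM); none of them help much when every worker ends up waiting on the same throttled volume:

```bash
# Parallel query settings (max_parallel_workers arrived in PostgreSQL 10); values illustrative.
psql -c "ALTER SYSTEM SET max_parallel_workers_per_gather = 4;"
psql -c "ALTER SYSTEM SET max_parallel_workers = 8;"
psql -c "ALTER SYSTEM SET max_worker_processes = 8;"   # needs a server restart to take effect
psql -c "SELECT pg_reload_conf();"
```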

The problems with measuring throughput are described in more detail here:

https://www.datadoghq.com/blog/aws-ebs-latency-and-iops-the-surprising-truth/

Aurora

So then we migrated to Aurora. Promises of 5–7x performance increases over standard Postgres are hard to ignore. The caveat, however, is that these performance numbers are based on pgbench (in the case of PostgreSQL). That’s great if you have an OLTP system and want to measure the speed (or latency) of transactions with a large number of parallel clients, but it is not a good performance indicator for large data sets and data-mining operations.
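
For context, pgbench’s default workload is a TPC-B-style stream of many small transactions, something like the commands below (the database name and flag values are illustrative, not Amazon’s actual benchmark setup):

```bash
# What an OLTP benchmark exercises: lots of tiny transactions from many clients.
pgbench -i -s 100 mydb            # initialize with scale factor 100 ('mydb' is a placeholder)
pgbench -c 64 -j 8 -T 300 mydb    # 64 clients, 8 threads, 5 minutes; reports TPS and latency
# High TPS on tiny transactions says little about how fast one query can scan a 10 GB table.
```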

Costs went up instead of down.

This thread covers IOPS and billable IOPS https://forums.aws.amazon.com/message.jspa?messageID=835303#835303

Again, the biggest frustration was getting accurate measurements of throughput. Simply dividing the amount of data scanned by the time it took (using track_io_timing), we saw that performance for a large dataset was nowhere near SSD speeds; it was closer to magnetic disks.
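
Concretely, the measurement looked something like the sketch below, where the table name and the sample output numbers are illustrative rather than our actual figures:

```bash
# Effective read throughput = bytes read from disk / time spent reading them.
# 'events' is a placeholder table name; the sample output numbers are illustrative.
psql -c "SET track_io_timing = on;
         EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM events;"
# Look for lines like:
#   Buffers: shared read=1310720        (1,310,720 x 8 kB pages = ~10 GB read from disk)
#   I/O Timings: read=170000.000        (milliseconds, i.e. ~170 s waiting on reads)
# => roughly 60 MB/s effective throughput: magnetic-disk territory, not SSD.
```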

Adding to the frustration were the endless hours spent dealing with support. While the support staff were always knowledgeable and eager to assist, the answer always came down to tuning the query or tuning the DB. Again, the real issue was throughput. I would have loved to hear something along the lines of “this service is not a good choice given your workloads,” but eventually we figured that out for ourselves.

The Solution: Roll your own

For us the solution was fairly simple: don’t use a managed database service; roll our own infrastructure on EC2.

We did this using relatively low-grade EC2 instances: xl and 2xl sizes, compared to the 4xl we had been running on RDS.

The key was to separate READ and WRITE workloads between a master and a standby replica kept up to date using WAL-G (https://github.com/wal-g/wal-g).
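
A rough sketch of that topology is below; the bucket name and data directory are placeholders, and the exact recovery configuration depends on your Postgres version:

```bash
# Shared WAL-G config (placeholder bucket):
export WALG_S3_PREFIX='s3://my-wal-bucket/pg'

# WRITE (master) node, postgresql.conf -- ship every WAL segment to object storage:
#   archive_mode = on
#   archive_command = 'wal-g wal-push %p'
wal-g backup-push "$PGDATA"            # periodic base backup, taken on the master

# READ (standby) node -- restore the base backup, then keep replaying WAL:
wal-g backup-fetch "$PGDATA" LATEST
#   restore_command = 'wal-g wal-fetch %f %p'   # in the standby's recovery configuration
```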

The WRITE node could be slow, since it’s only ingesting data, so EBS was good enough; even so, we were able to DOUBLE effective EBS throughput by using ZFS with compression.

For the READ node we had a setup with ephemeral NVMe drives, and again ZFS to boost performance.
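
The ZFS side of both nodes amounts to a handful of commands. A rough sketch follows; device names, pool names, and property values such as recordsize are illustrative and worth benchmarking against your own workload:

```bash
# WRITE node: ZFS on the EBS volume with LZ4 compression -- compressed blocks mean
# fewer bytes cross the throttled EBS pipe, which is where the throughput gain comes from.
zpool create -o ashift=12 pgwrite /dev/nvme1n1     # device and pool names are illustrative
zfs set compression=lz4 pgwrite
zfs set atime=off pgwrite
zfs set recordsize=16k pgwrite                     # tune for your workload

# READ node: stripe the instance's ephemeral NVMe drives into one fast pool.
# Ephemeral storage is fine here because the standby can always be rebuilt from WAL-G.
zpool create -o ashift=12 pgread /dev/nvme1n1 /dev/nvme2n1
zfs set compression=lz4 pgread
zfs set atime=off pgread
zfs set recordsize=16k pgread
```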

The results speak for themselves:

An $11K bill down to a projected $2,100 for September, which will be the first full month we’re clear of any RDS baggage.

Queries that used to take hours or time out altogether now run in seconds.

We are able to measure actual throughput using command-line tools directly on the instances (zpool iostat).
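
For example:

```bash
# Per-pool and per-device bandwidth, sampled every 5 seconds:
zpool iostat -v 5
# The read/write "bandwidth" columns are the real MB/s numbers that EBS metrics never gave us.
```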

Conclusion

Sure, it’s easy to hit a button, spin up an instance, and never worry about backup and recovery. But that comes at a cost: in our case, both monetary and in performance. If you need REAL speed for crunching GBs or TBs of data, use bare metal; if you can’t use bare metal, the next best thing is EC2 with NVMe drives. Read part two, where I provide the recipe for spinning up your own Postgres cluster on AWS using instances with NVMe drives.

--

Max Kremer

I’m a programmer and software entrepreneur, currently working as CTO for Lassoo.io.