Scaling Our Analytical Processing Service: Sharding a PostgreSQL Database with Citus

Markus Sintonen
Smartly.io Tech Blog
7 min read · Jun 26, 2019

Online advertisers are making more and more decisions based on performance data. Whether it’s choosing which audiences or creatives to invest in, or enabling algorithmic optimization of campaign budgets, decision-making relies increasingly on having this data readily available. Our development team has built powerful tools to help our customers analyze performance data and make better decisions.

Our solution consists of highly customizable reports, including drill-down tables and graphs powered by our own, extremely flexible query language. The data service that powers the query language handles multiple terabytes of data. Besides serving as the backend for our user-facing analysis tools, it also powers all our automated optimization features and some of our internal BI systems. In this blog post, I’ll walk you through how we solved our scaling issues by sharding the database that the backend system uses.

Massive database equals scaling trouble

Our analytical data processing service, dubbed Distillery, uses a PostgreSQL database. The service handles the safe transformation of JSON-formatted queries into SQL queries, which are then run against the database. Most data processing happens right in the database, so the Distillery backend mostly transforms our own query language into SQL queries. The original API queries are complex, which makes some of the resulting SQL queries complex as well and highly demanding on the database level. We weren’t surprised, therefore, when we encountered scaling issues during the development of the reporting system.

In the past, we had scaled our master-replica database architecture vertically, but it later became obvious that we were hitting the limits of this approach. Our database had accumulated almost 5TB of data during its three years in operation, and it had become unmanageable. The large size made update-heavy application writes slower and maintenance tasks difficult to perform. In the end, the biggest problem was that our data center wasn’t able to provide larger servers.

The solution: sharding the PostgreSQL database with Citus

When vertical scaling failed, we had to start scaling our reporting database horizontally. This meant that we needed to split up the data and processing between multiple database servers. It also meant shrinking the huge database tables containing statistical data into smaller tables within each of these separate database instances.

This method of slicing the database data into smaller units is called database sharding. Our team decided to use the PostgreSQL Citus plugin to handle the sharding. It wasn’t the only option: we also considered custom application-level sharding, but decided to go with the Citus plugin instead because:

  • We have a large number of complex queries that would require multiple different shards at the same time. The Citus plugin handles these complex queries automatically and distributes processing between the shards.
  • It also has wide support for PostgreSQL features that we need to run the complex reporting queries.
  • The extension makes shard management relatively easy so we don’t have to put much effort into managing the shard tables within separate database instances.

Citus is based on coordinator and worker PostgreSQL database instances. Workers hold the database table shards, and the coordinator plans the SQL queries so that they can be run across the multiple shard tables on the workers. This allows large tables to be distributed across multiple servers and split into smaller, more manageable database tables. Writing to smaller tables is more efficient, since database index maintenance costs decrease. Also, the write load is parallelized and shared between the database instances. Citus solved two of our largest pain points: writes becoming inefficient and vertical scaling coming to an end.
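
As a rough sketch of how tables end up on the workers (the table name here is illustrative, not our actual schema), you enable the extension on the coordinator and tell Citus which column each table should be distributed by:

-- Run on the coordinator: enable the Citus extension.
CREATE EXTENSION citus;

-- Distribute the (illustrative) ad_stats table by its account_id column.
-- Citus splits the table into shards and spreads them across the workers.
SELECT create_distributed_table('ad_stats', 'account_id');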

Database sharding with Citus comes with additional benefits, as the new architecture accelerates our reporting queries. Some of our queries hit multiple worker instances and shards, and the Citus extension can optimize them to run in parallel across the different database instances. Queries hitting only a single worker shard also speed up, thanks to smaller table indexes and more resources being available for query processing within each worker.

Migrating the large database and the complex reporting queries into this type of sharded database architecture was by no means a small feat. It involved careful preparation and planning, which we’ll look into next.

Migrating to the new database

In the past, we ran reporting queries via our old PHP monolith. Even before the database scaling issues emerged, we had started to build a newer reporting backend using Ruby on Rails. After deciding to handle the SQL query migration only in the new backend, we started to phase out the old one. This allowed us to optimize the new reporting queries specifically for Citus, and it made the migration easier on the application level, because we only had to migrate this one service to work with the Citus-sharded PostgreSQL. The migration also involved a lot of prevalidation of performance and data correctness, but I won’t go into those details in this blog post. Let’s instead focus on the database schema migration.

Sharded databases have certain requirements for the database schema. The schema has to contain a value that works as a sharding key. This value is used by the sharding logic to determine on which shard the data is located. In our Citus-sharded PostgreSQL, the sharding is controlled through the table primary keys: the composite primary key contains one or more columns, and the first defined column acts as the shard value. Such a table definition looks roughly like this (table and column names below are illustrative):
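
-- Illustrative example of a sharded statistics table; not our actual schema.
CREATE TABLE ad_stats (
    account_id  bigint  NOT NULL,  -- sharding key: the first primary key column
    ad_id       bigint  NOT NULL,
    date        date    NOT NULL,
    impressions bigint  NOT NULL DEFAULT 0,
    spend       numeric NOT NULL DEFAULT 0,
    PRIMARY KEY (account_id, ad_id, date)
);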

Here the account ID column is used as the sharding key, which means that we are distributing the data based on our customer accounts (a single customer can also have multiple accounts). This means that a single account’s data is located within a single table shard. We had to make sure all our primary keys were in this format and that the tables contained the account ID information. We also had to change some of our foreign key and uniqueness constraints, since those must also contain the sharding column. Luckily, all these changes were safe to apply to the running production database without any performance or data integrity issues, although we had to do some rather extensive database index rebuilding.
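
For example (again with illustrative table, column, and constraint names), a uniqueness constraint that didn’t include the sharding column had to be rebuilt to cover it:

-- Uniqueness on the ads table used to be enforced on the external ad ID alone;
-- on a sharded table the constraint must also include the sharding column.
ALTER TABLE ads DROP CONSTRAINT ads_external_id_key;
ALTER TABLE ads ADD CONSTRAINT ads_account_id_external_id_key
    UNIQUE (account_id, external_id);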

The second step was to make the SQL queries generated by our reporting backend compatible with the sharded database. First, the queries must contain the shard values in their SQL WHERE clauses. This means, for example, that filters must take roughly the following form (table and column names are illustrative):
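
-- Illustrative example: the sharding column (account_id) is constrained in the WHERE clause.
SELECT ad_id, sum(impressions) AS impressions
FROM ad_stats
WHERE account_id IN (1, 2, 3)
  AND date BETWEEN '2019-01-01' AND '2019-01-31'
GROUP BY ad_id;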

If we didn’t have the account_id condition in place, Citus distributed query planner would have no information from which shard it would need to find the relevant rows. Reading from all the possible shards would not be as efficient as reading from a single shard.

Furthermore, Citus has certain restrictions on the types of JOINs you can perform between sharded tables. Generally, JOINs require the sharding column to be present in the JOIN condition. For example, a query like this (again with illustrative names) would not work:
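
-- Illustrative example: outer join between two sharded tables without
-- the sharding column in the join condition.
SELECT s.ad_id, sum(s.impressions) AS impressions, a.name
FROM ad_stats s
LEFT JOIN ads a ON a.id = s.ad_id
WHERE s.account_id IN (1, 2, 3)
GROUP BY s.ad_id, a.name;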

It will result in the error:

ERROR: cannot run outer join query if join is not on the partition column

This means that an SQL outer join requires a one-to-one match between the table shards, which Citus can’t determine from the query above. Therefore, the query needs to include the sharding column in the JOIN condition, from which Citus is able to detect that the scope of the ads table join stays within one shard:
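
-- The same illustrative join, now with the sharding column in the join condition.
SELECT s.ad_id, sum(s.impressions) AS impressions, a.name
FROM ad_stats s
LEFT JOIN ads a ON a.id = s.ad_id AND a.account_id = s.account_id
WHERE s.account_id IN (1, 2, 3)
GROUP BY s.ad_id, a.name;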

We did various other SQL query optimizations that allowed the Citus query planner to run our complex statistical reporting queries efficiently. For example, we organized our queries using Common Table Expressions (CTEs), which allowed the Citus query planner to choose the best possible plans for heavy queries that read from multiple shards at the same time. These queries hitting multiple accounts are also highly parallelized across the Citus worker cluster, making the data processing more efficient. In addition, we contributed a change to the Citus extension that adds support for PostgreSQL JSON(B) aggregates, which our reporting queries use for certain data pre-aggregation steps. You can check out the pull request on GitHub.
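
A rough sketch of that CTE structure (illustrative names and aggregates, not our actual reporting queries): the heavy per-account aggregation happens inside the CTE, which Citus can distribute across the workers, and the outer query only combines the pre-aggregated rows:

-- Heavy aggregation stays inside the CTE, where it can be parallelized
-- across the worker shards; the outer query works on much smaller data.
WITH per_account AS (
    SELECT account_id, ad_id,
           sum(impressions) AS impressions,
           sum(spend)       AS spend
    FROM ad_stats
    WHERE account_id IN (1, 2, 3)
      AND date BETWEEN '2019-01-01' AND '2019-01-31'
    GROUP BY account_id, ad_id
)
SELECT ad_id, sum(impressions) AS impressions, sum(spend) AS spend
FROM per_account
GROUP BY ad_id;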

The new database system in operation

Our database system was fully migrated from the single master-replica configuration to a coordinator plus 4 worker servers, each replicated for high availability. This means that our old database containing 5TB of data got sliced up into a cluster where each database server holds about 1TB of data. Citus allows us to fairly easily add more worker servers to slice the data up even further as the company continues to grow. We can also isolate the most demanding customers, who have tremendous amounts of statistical data, onto their own database servers.
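
The exact administration functions vary by Citus version and edition, but growing the cluster and isolating a heavy tenant look roughly like this (the hostname and tenant ID are placeholders):

-- Add a new worker to the cluster (master_add_node in older Citus versions).
SELECT citus_add_node('citus-worker-5', 5432);

-- Spread existing shards onto the new worker.
SELECT rebalance_table_shards();

-- Move a single demanding account into its own shard, which can then be
-- placed on a dedicated worker.
SELECT isolate_tenant_to_new_shard('ad_stats', 12345, 'CASCADE');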

Database architecture before the migration.
Database architecture after the migration.

The diagrams above depict our database architecture before and after the migration. In total, we now have 10 database servers, compared to the previous state where we had 2 massive database servers. These smaller database instances are easier to manage, as most of the data lives in the separate worker servers. The coordinators hold smaller amounts of data, like metadata and data that doesn’t make sense to shard. The second diagram also shows the database replicas, which we use to ensure quick recovery if one database instance goes down. The failover from a primary server to its replica is handled by the pgpool component. The replicas also share some of the read load of the primary servers.

Finally, our Pivot Tables reporting queries, which are the most demanding in terms of data processing, got a 2–10x performance boost from the new database system. Database queries generated by this feature are quite complex, as we allow users to freely define how data is grouped, filtered, and aggregated. The feature also allows queries to run freely across shards, since users can pick any combination of accounts. The benefits of the Citus-sharded database really show with these particular queries. The database migration was highly necessary, as our old database infrastructure was almost overwhelmed by the complex queries this feature generated.

This graph shows the 90th-percentile durations for certain types of queries that got a performance boost during the database migration project.

See what else our Engineering team is working on at smartly.io/developer.

Originally published at https://www.smartly.io.
