Logical replication with PostgreSQL 10

A primer with a simple example

Torbjørn Kristoffersen
HackerNoon.com
Published in
3 min readAug 6, 2017

--

In PostgreSQL 10 (soon to come out of beta) we see the introduction of logical replication, a long sought after feature.

Why logical replication?

Currently with 9.x, we have built-in streaming replication which works on physical WAL data.

By physical we mean that the replication operates on a block level instead of on a row level.

In other words, streaming replication copies raw block level changes to its standby servers, but with little understanding of what these blocks actually do to the database.

This now changes with logical replication, in which the data replicates on a logical level. This means the streaming replication now actually understands what these blocks do in terms of database behavior.

Instead of having standby servers that need to be an exact binary copy, we can now decide which tables we want to replicate and we also have the means to write data on the standby. These things weren’t possible before, unless one used third party replication software (all with their pros and cons).

This feature originates from the talented developers at 2ndQuadrant and their amazing work on pglogical. While not all of pglogical’s features made it to PostgreSQL 10, the fact that we now have official support for logical replication in PostgreSQL is a huge leap forward.

Architecture

Logical replication is based on publications and subscriptions. This is a common approach that is also used by other popular RDBMS such as Oracle and Microsoft SQL.

Pub / Sub Architecture of Logical Replication. A publication may be limited to just a few tables or all tables. One can achieve per-table sharding this way with multiple standby servers.

Basic Example

In this example we’ll set up two PostgreSQL 10 clusters on localhost. One runs on port 5432 (master) and the other on port 5433 (standby). The only modification to postgresql.conf is wal_level=logical. Do this on both clusters.

Assuming these clusters are now both running, we create some tables that we want to replicate from the master to the standby.

Let’s do this first on the master:

master=# create table foo (foo text PRIMARY KEY, quux text); 
create table bar (bar text PRIMARY KEY, quux text);
CREATE TABLE
CREATE TABLE

Now let’s create a publication for these 2 tables (again on the master). We also need a user with replication privileges, which we will call standby_user.

master=# create publication my_pub for table foo, bar;
CREATE PUBLICATION
master=# create role standby_user with login password '12345'
replication;
CREATE ROLE

And before we do anything, let’s add a few rows on the master.

master=# insert into foo (foo, quux) values ('hi', 'there');
INSERT 1
master=# insert into bar (bar, quux) values ('and', 'here');
INSERT 1

On the standby, we now set up the subscription.

standby=# create subscription my_sub 
connection 'host=localhost dbname=my_db
user=standby_user password=12345 port=5432'
publication my_pub;
CREATE SUBSCRIPTION

Once the subscription has been activated, we’ll see this in the standby’s log:

LOG:  logical replication apply worker for subscription "my_sub" has started
LOG: logical replication table synchronization worker for subscription "my_sub", table "foo" has started
...

On the master we’ll see that a replication slot has automatically been created and that initial sync is happening. On the standby we’ll see that both tables now contain 1 row each.

Note that logical replication still leverages the replication slots that you have been using in the old streaming replication, so the pg_replication_slots table should be used for monitoring.

Conclusion

We’ve seen that logical replication is fairly easy to set up, as long as there is a way to connect directly to the master from the standby via TCP.

For production systems replicating over the Internet, be sure to use ssl=on in Postgres. Not just due to security reasons, but also because you get compression which is enabled by default when SSL mode is used.

As with any functionality, it’s important dive deeper into the official documentation in order to identify restrictions and potential problems that may be relevant to your needs. See:

--

--