MySQL Slave Scaling (and more)

We operate very wide replication topologies: it is not uncommon to have more than fifty (and sometimes more than a hundred) slaves replicating from the same master. At this scale, one must be careful not to saturate the network interface of the master. A solution exists, but it has its weaknesses. We came up with an alternative approach that better fits our needs: the Binlog Server. We think that the Binlog Server can also be used to simplify disaster recovery and to ease promoting a slave to master after a failure. Read on for more details.

When many slaves replicate from the same master, serving binary logs can saturate the network interface of the master, as every change is requested by every slave. Some operations generate large amounts of binary logs; two examples are:

  1. deleting lots of records in a single statement when using row based replication
  2. executing an online schema change on a large table

In the replication topology described in Figure # 1, producing one megabyte of binary logs per second on M will generate replication traffic of a hundred megabytes per second if we deploy a hundred slaves. This is very close to the limit of a 1 Gbit/s network interface and this is something we see happening in our replication chains.
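The arithmetic above can be checked with a quick back-of-the-envelope calculation (the rates are the illustrative figures from the text, not measurements):

```python
# Illustrative figures from the text: 1 MB/s of binary logs on M,
# one hundred slaves replicating directly from it.
binlog_rate_mb_s = 1
slave_count = 100

# Every slave requests every change, so outbound traffic on M scales
# linearly with the number of slaves.
traffic_mb_s = binlog_rate_mb_s * slave_count

# A 1 Gbit/s network interface tops out at roughly 125 MB/s.
nic_limit_mb_s = 1000 / 8

print(f"traffic: {traffic_mb_s} MB/s, NIC limit: {nic_limit_mb_s:.0f} MB/s")
print(f"utilisation: {traffic_mb_s / nic_limit_mb_s:.0%}")
```

At 80% of the theoretical limit of the interface, there is no headroom left for bursts, other traffic, or a few more slaves.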

Figure # 1: Replication Tree with many Slaves.

The traditional solution to this problem is to place intermediate masters between M and its slaves. In the deployment of Figure # 2, instead of having slaves directly connected to M, we have intermediate masters replicating from M, with slaves replicating from each intermediate master. With a hundred slaves and ten intermediate masters, M serves only ten replication streams, so it can produce ten times more binary logs before saturating its network interface (ten MB/s instead of one MB/s).
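The gain can be sketched the same way (again using the text's illustrative numbers, not measurements):

```python
# With ten intermediate masters, M serves ten replication streams
# instead of one hundred, so the sustainable binlog production rate
# on M grows by the same factor of ten.
slave_count = 100
intermediate_count = 10
nic_limit_mb_s = 125  # ~1 Gbit/s

max_rate_direct_mb_s = nic_limit_mb_s / slave_count          # ~1.25 MB/s
max_rate_layered_mb_s = nic_limit_mb_s / intermediate_count  # ~12.5 MB/s

print(max_rate_layered_mb_s / max_rate_direct_mb_s)
```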

Figure # 2: Replication Tree with Intermediate Masters.

However, using intermediate masters has its problems:

  1. replication lag on an intermediate master will generate delay on all of its slaves
  2. if an intermediate master fails, all of its slaves stop replicating and they need to be reinitialized [1]

Diving deeper into the second problem in the context of Figure # 2, one could think that, in case of a failure of M1, its slaves could be repointed to the other intermediate masters, but this is not that simple: without GTIDs, the binary log file names and positions on M2 differ from those on M1, so the slaves of M1 cannot easily resume replication from M2.

GTIDs can help us in repointing slaves, but they will not solve the first problem above.

Note that we do not need the databases in the intermediate layer at all: we only need to serve binary logs. And if the binary logs served by M1 and M2 were identical, we could easily swap their respective slaves. From those two observations, we built the idea of the Binlog Server.

Binlog Servers replace the intermediate masters of Figure # 2. Each Binlog Server downloads the binary logs from M and serves them to its slaves exactly as M would, without applying any change to a local dataset.

And of course, if a Binlog Server fails, we simply repoint its slaves to the other Binlog Servers. Even better, since these hosts do not apply changes to a local dataset before serving them downstream, they introduce far less replication delay than actual database servers.

We are working in collaboration with SkySQL to implement the Binlog Server as a module to the MaxScale pluggable framework. You can read this blog post by SkySQL for an introduction on MySQL replication, MaxScale and the Binlog Server.

Other use-case: avoiding deep nested replication on remote sites

The Binlog Server can also be used to reduce the problems with deep nested replication on remote sites.

If someone needs four database servers per site on two sites, the topology from Figure # 3 is a typical deployment when WAN bandwidth is a concern (E, F, G and H are on the remote site).

Figure # 3: Remote Site Deployment with an Intermediate Master.

But this topology suffers from the problems explained above (replication delay on E slows down F, G and H, and a failure of E stops replication to F, G and H). It would be better to set it up as in Figure # 4, but that needs more WAN bandwidth, and, in case of a disaster on the main site, the slaves of the remote site need to be reorganized into a new tree.

Figure # 4: Remote Site Deployment without Intermediate Master.

Using a Binlog Server on the remote site, we can combine the best of both solutions (low bandwidth usage and no delay introduced by an intermediate database). The topology becomes the following:

Figure # 5: Remote Site Deployment with a Binlog Server.

The Binlog Server (X) looks like a single point of failure in the topology from Figure # 5 but if it fails, it is trivial to restart another one. It is also possible to run two Binlog Servers on the remote site as illustrated in Figure # 6. In this deployment, if Y fails, G and H are repointed to X. If X fails, E and F are repointed to Y and Y is repointed to A.
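The repointing rules just described can be written down as a small simulation (host names are those of Figure # 6; the function is an illustration of the rules, not an actual failover tool):

```python
# Topology of Figure # 6, as a slave -> master mapping: X replicates
# from A on the main site, Y from X, E and F from X, G and H from Y.
topology = {"X": "A", "Y": "X", "E": "X", "F": "X", "G": "Y", "H": "Y"}

def repoint_after_failure(topology, failed):
    """Return a new topology with `failed` removed and its slaves
    repointed, following the rules from the text."""
    new = {host: master for host, master in topology.items() if host != failed}
    survivor = "X" if failed == "Y" else "Y"
    for host, master in new.items():
        if master == failed:
            # Slaves of the failed Binlog Server move to the survivor;
            # the surviving Binlog Server itself is repointed to A.
            new[host] = "A" if host == survivor else survivor
    return new

print(repoint_after_failure(topology, "Y"))  # G and H now replicate from X
print(repoint_after_failure(topology, "X"))  # E, F from Y; Y from A
```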

Figure # 6: Remote Site Deployment with two Binlog Servers.

Note that running Binlog Servers does not necessarily mean more hardware. In Figure # 6, X can be installed on the same server as E and Y on the same server as G.

Finally, those deployments (with one or two Binlog Servers) have an interesting property: if a disaster happens on the main site, the slaves on the remote site will converge to a common state (from the binary logs available on X). This makes reorganizing the slaves into a new replication tree easy.
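This convergence property can be sketched with a toy model (the positions below are made up for illustration; real binary log coordinates are file/offset pairs or GTID sets):

```python
# Hypothetical state after a disaster on the main site: the Binlog
# Server X still holds events up to position 1500, and each remote
# slave has replayed a different prefix of them.
binlog_server_end = 1500
slave_positions = {"E": 1200, "F": 1500, "G": 900, "H": 1100}

# Each slave keeps replicating from X until it has consumed everything
# X holds, so all slaves end up at the same position.
converged = {slave: binlog_server_end for slave in slave_positions}

# With identical positions, any slave can be promoted as the new
# master and the others repointed to it without losing transactions.
assert len(set(converged.values())) == 1
print(converged)
```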

Other use-case: easy high availability

The Binlog Server can also be used as a building block for high availability. If we want to be able to elect a new master after a failure of A in Figure # 7, we can deploy GTIDs or use MHA, but both have downsides.

Figure # 7: Replication Tree with 6 Slaves.

If we deploy a Binlog Server between the master and the slaves as illustrated in Figure # 8, the failure of A becomes easier to handle: all slaves converge to the same position by replaying the binary logs available on the Binlog Server, and any of them can then be promoted as the new master.

Figure # 8: Replication Tree with a Binlog Server.

If we want to combine extreme slave scaling and high availability, we can use the topology of Figure # 9.

Figure # 9: Replication Tree with many Binlog Servers.

If a Binlog Server fails, its slaves are repointed to the other Binlog Servers. If A fails, the slaves converge to a common state from the binary logs available on the Binlog Servers, as illustrated in Figure # 10, and a new master can then be chosen.

Figure # 10: Converging Slaves after the Failure of the Master.


We have presented a new component that we are introducing in our replication topologies: the Binlog Server. It allows us to horizontally scale our slaves without fear of overloading the network interface of the master and without the downsides of the traditional intermediate master solution.

We think that the Binlog Server can also be used in at least two other use-cases: remote site replication and easy topology reorganization after a master failure. In a follow-up post, we will describe another use-case of the Binlog Server. Stay tuned for more details.

[1] Slave reinitialization can be avoided by using GTIDs or by hosting intermediate masters on highly available storage (DRBD or filers), but each of those two solutions brings its own problems.
