Zero downtime MySQL master/slave replication with Percona
I was recently approached by a provider of analytical and real-time fraud management solutions for the telecommunication industry after the fire in the data centre wiped out their database cluster. The incident left the analytical side of the business inoperable for days but luckily the master node survived.
At this point, I was presented with a challenge to re-establish a Master -> Slave setup without taking the master node offline.
The problem
To understand why this is a challenge, we need to look at how MySQL handles its replication on a high level.
At the gist of it, there is a binary log a special log file to which the Master node writes to through a dump thread and contains information on all data modifications made to a database instance.
When the slave node connects to the master node it will open two connections:
IO Thread — a process that receives a binary log sent by the master which is then copied over to a local log file called a relay log.
SQL Thread — a process that continuously reads the relay log and applies the changes to the slave node.
At the beginning of replication, the slave needs to be told from what binary log position it should start replicating the events from. This point is called a master log position and serves as coordinates at which the slave I/O thread should begin reading from the next time the thread starts.
Each change to the database also changes the master log position and since we can’t reliably extract the log position we cannot proceed with the slave setup. If you are coming from the enterprise-grade relational databases like Oracle or SQL server, etc. you are probably used to a hot backup.
A hot backup, also known as dynamic backup is a backup performed on a server while the database is actively running and accessible to users and is a standard way of doing most backups on enterprise-grade solutions.
The solution
Percona XtraBackup was a perfect candidate for the job. It provides an enterprise type backup feature to a non-enterprise RDS.
The Percona XtraBackup performs online non-blocking, tightly compressed, highly secure full backups on transactional systems so that applications remain fully available during planned maintenance windows.
The process
Setting up a slave for replication with Percona XtraBackup is a very straightforward procedure.
For the sake of simplicity, I will summarize the steps it took us to create a hot backup and properly re-establish replication on the slave.
Step #1 — Preparing data
At first we need to create a backup on the master node. We do so by executing the following command into a shell:
$ xtrabackup --backup --user={USERNAME} --password={PASSWORD} --target-dir={PATH_TO_BACKUP}
This will create a copy of your MySQL data directory to the directory we specified in the --target-dir
option. XtraBackup knows where the current MySQL data directory is by reading the my.cnf
. If your configuration file is located in non-standard location you may pass the proper location through --defaults-file=/path/to/my.cnf
.
Once the process finishes with the xtrabackup: completed OK! we need to prepare the data for the snapshot to be consistent. We achieve this by executing the following command:
$ xtrabackup --user={USERNAME} --password={PASSWORD} --prepare --target-dir={PATH_TO_BACKUP}
The process will apply transaction logs to the data files which are now ready to be consumed by the MySQL server.
Data preparation step ( — prepare) does not need to be executed on a master host you can run it on a slave as well.
Step #2 — Moving data to the slave
Now that we have our backup files let’s copy them to the slave.
$ rsync -avp --progress /path/to/backup {SLAVE_HOST}:/tmp/backup
Once the data is copied to the slave we need to move it to a proper data directory. Before doing so it is advised to backup the current data directory.
$ mv /var/lib/mysql /var/lib/mysql-bak
Now, let us move the copied back up to the MySQL data directory. You can do this by manually moving the files or if you have Percona XtraBackup installed on the server you may execute the following command:
$ xtrabackup --move-back --target-dir={PATH_TO_BACKUP}
If you are moving data to a different directory than the system default, make sure you set MySQL as a directory owner.
$ chown mysql:mysql /path/to/new/dir
Step #3 — Getting binary log file name and master position
To retrieve a binary log file name and position open the file xtrabackup_binlog_info
which is present in the backup directory.
$ cat /var/lib/mysql/xtrabackup_binlog_info// Output
master-bin.000001 481
In this post, I will not cover the additional setup of replication as I primarily wanted to focus on the solution to the problem. If you wish to learn the basic setup of replication you can follow the guide at DigitalOcean.
I would like to finish this post with quote “Data is the new gold.” as such we could argue that the core of the modern business is not the product but rather the data it can gather, and keeping it safe should be everyone’s top priority.