MySQL — Ongoing Replication using Data Migration Service from AWS

Jaque
6 min read · Apr 7, 2022

When we have an application that constantly inserts data into the database, it is hard to consider moving to a cloud service because we fear that our data will be lost during the migration. In this article, we want to show you a simple way to do this migration using AWS services.

Looking at the architecture, we must keep in mind that we want to capture ongoing changes to the source data store; to do this, we capture changes while we are migrating our data. So we have something like this:

Ongoing replication architecture

But how can we do this? Well, AWS gives us two amazing services. The first makes it easy to create a database in the cloud (because we want instant access to our data wherever we are), and the second lets us perform ongoing replication to migrate the data of our live application.

Requirements:

  • MySQL Workbench 5.6
  • MySQL Server 5.2
  • Amazon Relational Database Service (RDS)
  • AWS Database Migration Service (DMS)

MySQL on Amazon RDS

First, we have to stand up our relational database in the cloud. For this test, we are going to use the free tier options that AWS gives us.

We have to set the specifications for the instance we're creating. In the end, we will have something like this:

Configuring MASTER-SLAVE replication to use Change Data Capture (CDC) with MySQL

First, we have to configure our replication using the binary log position of replicated transactions. The permissions required to start replication on an Amazon RDS DB instance are restricted and aren't available to the master user, so we must use the stored procedures mysql.rds_set_external_master and mysql.rds_start_replication that Amazon provides to set up replication between our live database and the Amazon RDS database.

We have to set the binary logging format for the MySQL database: update the binlog_format parameter to ROW and then reboot the DB instance for the change to take effect. On a self-managed server, the binary logging configuration lives in the mysqld.cnf file.
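On the self-managed source, the relevant mysqld.cnf section looks roughly like this (the server-id and log path are assumptions for illustration):

```ini
[mysqld]
# Unique ID for this server within the replication topology
server-id        = 1
# Enable the binary log; DMS and RDS replication both read from it
log_bin          = /var/log/mysql/mysql-bin.log
# Row-based logging is required for CDC
binlog_format    = ROW
# Keep binlogs long enough for the replica to catch up
expire_logs_days = 10
```

After editing the file, restart the MySQL service so the new binlog settings take effect.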

Configuring binary log file replication with a local master instance

  • Make the source MySQL instance read-only.
mysql> FLUSH TABLES WITH READ LOCK;
mysql> SET GLOBAL read_only = ON;
  • Determine the binlog location.
  • Copy the DB from the external instance to the Amazon RDS DB instance with mysqldump.
mysqldump --databases dms_sample --single-transaction --compress \
  --order-by-primary -u root -p | mysql --host=<rds-endpoint> --port=3306 \
  -u northwindlocal -p
  • Make the source writable again.
mysql> SET GLOBAL read_only = OFF; 
mysql> UNLOCK TABLES;
  • Connect to the external instance and create a user to use for replication. We use a wildcard host here so the Amazon RDS instance can connect remotely.
mysql> CREATE USER 'repl_user'@'%' IDENTIFIED BY '<password>';
  • On the external instance, grant the REPLICATION CLIENT and REPLICATION SLAVE privileges to the replication user.
mysql> GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
  • Now we are going to replicate the instance. To do this, we connect to the Amazon RDS DB instance as the master user and identify the external database as the replication master by using the mysql.rds_set_external_master command.
CALL mysql.rds_set_external_master ('<source-host>', 3306, 'repl_user', '<password>', 'mysql-bin-changelog.000010', 154, 0);
  • On the Amazon RDS DB instance, we start the replication by calling this command.
CALL mysql.rds_start_replication;
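The "determine the binlog location" step above is done on the source with SHOW MASTER STATUS; its File and Position columns are the values passed to mysql.rds_set_external_master. A sketch of the output (the file name and position shown are illustrative):

```sql
mysql> SHOW MASTER STATUS;
+----------------------------+----------+--------------+------------------+
| File                       | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+----------------------------+----------+--------------+------------------+
| mysql-bin-changelog.000010 |      154 |              |                  |
+----------------------------+----------+--------------+------------------+
```

Once replication has started, SHOW SLAVE STATUS on the RDS instance can be used to confirm that the replica is connected and catching up.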

Change Data Capture (CDC) with MySQL

Now we can set up replication between an Amazon RDS MySQL instance and a MySQL instance that is external to Amazon RDS.

As we saw earlier, we need to create the master-slave connection so that we can replicate our database instance. To get this, we take the binlog file and position from the source, and then use them in the following commands on our slave instance.

We're going to configure our slave from MySQL; in this case, our slave is the Amazon RDS DB instance.

mysql> CALL mysql.rds_set_external_master ('<source-host>', 3306, 'repl_user', '<password>', 'mysql-bin-changelog.000010', 154, 0);
mysql> CALL mysql.rds_start_replication;
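If you script this step, the CALL statement can be assembled from the SHOW MASTER STATUS values. A minimal sketch in Python (this helper is hypothetical, not part of any AWS SDK; the values match the ones used in this article):

```python
def build_set_external_master(host, port, user, password,
                              binlog_file, binlog_pos, ssl=0):
    """Return the CALL statement for mysql.rds_set_external_master.

    binlog_file and binlog_pos are the File and Position columns
    reported by SHOW MASTER STATUS on the source instance.
    """
    return (
        "CALL mysql.rds_set_external_master "
        f"('{host}', {port}, '{user}', '{password}', "
        f"'{binlog_file}', {binlog_pos}, {ssl});"
    )

stmt = build_set_external_master(
    "<source-host>", 3306, "repl_user", "<password>",
    "mysql-bin-changelog.000010", 154,
)
print(stmt)
```

The generated statement is then run on the RDS instance as the master user, followed by CALL mysql.rds_start_replication.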

Data Migration Service — Homogeneous migration

Creating the migration task

Full load plus CDC — the task migrates existing data and then updates the target database based on changes to the source database.

In this part, we are configuring the task to take all the tables of the source database.
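Taking all the tables of the source schema is expressed in the task's table mappings. A minimal sketch for the dms_sample schema used here (the rule name is an arbitrary label):

```json
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-all-tables",
      "object-locator": {
        "schema-name": "dms_sample",
        "table-name": "%"
      },
      "rule-action": "include"
    }
  ]
}
```

The % wildcard in table-name selects every table in the schema; narrower patterns can be used to migrate only a subset.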

If you get the error "Error: 1227 SQLSTATE: 42000 (ER_SPECIFIC_ACCESS_DENIED_ERROR) Access denied; you need (at least one of) the %s privilege(s) for this operation.", you can follow these short steps:

  • Open the Amazon RDS console, and choose Parameter groups from the navigation pane.
  • Choose the custom parameter group name that is associated with your instance.
  • Enter log_bin_trust_function_creators in the Filter parameters field, and then choose Edit Parameters.
  • Change the Values for log_bin_trust_function_creators to 1.
  • Choose Save changes.
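The same parameter change can be applied from the AWS CLI; a sketch, assuming a custom parameter group named my-custom-params:

```shell
aws rds modify-db-parameter-group \
  --db-parameter-group-name my-custom-params \
  --parameters "ParameterName=log_bin_trust_function_creators,ParameterValue=1,ApplyMethod=immediate"
```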

If everything goes right, all the tables will be migrated.

Once all the tables have been migrated successfully, AWS Database Migration Service starts replicating the data in real time. Here we can see that the person table is empty, so we're going to insert 249,755 rows of data in real time.

We're going to run a script that inserts 13.5 MB of random data. Here you can see how the data is inserted in real time into the person table at both the source and the target.
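The load script itself is not shown in the article; as an illustration, a script along these lines can generate bulk INSERT statements for the person table (the column name, batch size, and row counts here are assumptions, not the article's actual script):

```python
import random
import string

def random_name(rng, length=8):
    """Build a random lowercase name of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def generate_inserts(n_rows, batch_size=500, seed=42):
    """Yield multi-row INSERT statements for dms_sample.person."""
    rng = random.Random(seed)
    for start in range(0, n_rows, batch_size):
        rows = ", ".join(
            f"('{random_name(rng)}')"
            for _ in range(min(batch_size, n_rows - start))
        )
        yield f"INSERT INTO dms_sample.person (full_name) VALUES {rows};"

# Write the statements to a file that can be piped into mysql
with open("load_person.sql", "w") as f:
    for stmt in generate_inserts(2000):
        f.write(stmt + "\n")
```

The resulting file can be loaded with `mysql -u root -p dms_sample < load_person.sql`, and DMS picks up the inserts from the binlog as they are committed.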

Source:

Target:

Rows inserted: 31,768.
Rows inserted: 77,722.
Rows inserted: 204,399.
Rows inserted: 231,088.

Our source table finished with 249,755 rows, as we can see below.

MySQL Workbench

Once the CDC finishes, we can check whether all the rows were inserted. To do this, we're going to check from the target terminal or the RDS instance.
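A quick way to compare both sides is to run the same count on the source and the target; a sketch (the count shown is the expected final value from this test):

```sql
mysql> SELECT COUNT(*) FROM dms_sample.person;
+----------+
| COUNT(*) |
+----------+
|   249755 |
+----------+
```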

DDL and DML operations with CDC are also possible.

CREATE

CREATE TABLE `Prueba` (
  `id` INT(7) NOT NULL AUTO_INCREMENT,
  `nombreempleado` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

ALTER

ALTER TABLE `dms_sample`.`Prueba`
CHANGE COLUMN `nombreempleado` `nombre` VARCHAR(45) NOT NULL;

DROP

The DROP statement is not replicated by DMS, but we can DROP the table directly in our AWS instance.

INSERT

INSERT INTO `dms_sample`.`Prueba`
(`id`, `nombre`) VALUES (1, 'Jaqueline'), (2, 'Mike');

DELETE

DELETE FROM `dms_sample`.`Prueba`
WHERE id = 2;

DMS with DDL operations

Conclusion

We have seen the advantages that AWS provides for a relational database. If we set this up in our enterprise, there would be many benefits, such as freedom from infrastructure management, instantaneous provisioning, scaling, cost-effectiveness, application compatibility, high availability, and security.

Note that ongoing replication using the AWS Database Migration Service can be extended to other databases such as PostgreSQL, Oracle, and DB2, among others.
