How to encrypt AWS RDS MySQL replica set with zero downtime and zero data loss

Melik Karapetyan · Published in The Startup · Apr 23, 2020

Encrypting a production database can be a very challenging task. It is a time-consuming operation, and things may go wrong along the way, which is the last thing you want in a production environment. First of all, let’s look into the risks of such a procedure.

The Risks

There are two major risks when encrypting a database: potential downtime, and data loss during that downtime.

To encrypt a database, you must take a snapshot of it, copy that snapshot with encryption enabled, and then restore a new instance from the encrypted copy, which can take several hours depending on the size of the database. During that time your production database keeps receiving new data that will not be present in the snapshot, so you must also have a plan to migrate that new data after the encryption succeeds.

AWS does not offer a one-click way to encrypt an existing RDS instance; the process always involves creating a new RDS instance.
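
For reference, here is a rough sketch of that snapshot path using boto3. The instance and snapshot identifiers, region, and KMS key are placeholders, and the rest of this guide uses the console rather than scripts, so treat this purely as an illustration of the copy-with-encryption idea.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")  # placeholder region

# 1. Take a snapshot of the (unencrypted) production instance.
rds.create_db_snapshot(
    DBInstanceIdentifier="prod-db",            # placeholder instance name
    DBSnapshotIdentifier="prod-db-snapshot",
)
rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier="prod-db-snapshot")

# 2. Copy the snapshot with encryption enabled (this is where the KMS key is applied).
rds.copy_db_snapshot(
    SourceDBSnapshotIdentifier="prod-db-snapshot",
    TargetDBSnapshotIdentifier="prod-db-snapshot-encrypted",
    KmsKeyId="alias/aws/rds",                  # or your own customer-managed key
)
rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier="prod-db-snapshot-encrypted")

# 3. Restore a new, encrypted instance from the encrypted copy.
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="prod-db-encrypted",
    DBSnapshotIdentifier="prod-db-snapshot-encrypted",
)
```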

Let’s dig into our initial setup…

Initial Setup

Initial setup

In our initial setup, we have one production database that replicates all of its data to a replica running with the read_only option disabled.

The replica is used as an analytics database. It has some analytics-related tables into which data is inserted through AWS Lambda and API Gateway. Because production data is replicated into that database, complex analytics reports can be built by joining the analytics tables with the production tables.

The main production database is an API database that serves all client applications.

We will break the whole encryption process into three steps.

Step 1: Encryption of the database

Encrypting the databases is pretty straightforward. There is an awesome tutorial here on how to do it, but before following it, read this section to the end, because there are some IMPORTANT notes:

  • Complete all steps up to step 13 (we will return to step 13 in Step 3 of this guide).
  • In step 12 the slave may have trouble connecting to the master even if the security group setup is done correctly. It turns out that if the RDS hostname is used when adding the external master connection, the slave will not be able to connect to the master. To overcome this, take either of the following actions:

Action 1 (my favorite)

Create an EC2 instance in the same VPC, use the dig command to find the private IP address of the RDS instance, and use that IP instead of the hostname. (Note that the IP address of an RDS instance may change at any time, which is why it is not shown in the RDS console.) Although the private IP address is not very reliable, it is fine to use here, because we will get rid of this replication in the next steps anyway.
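
As an illustration of Action 1, the sketch below resolves the RDS endpoint to its private IP (run it from an EC2 instance inside the same VPC, just as you would run dig there) and then points the encrypted replica at that IP through the mysql.rds_set_external_master procedure. All hostnames, credentials, and binlog coordinates are placeholders; use the coordinates you captured when preparing the snapshot.

```python
import socket

import pymysql  # assumes a MySQL client library is available

# Resolve the source RDS endpoint to its private IP. Run this from an EC2
# instance in the same VPC so the name resolves to the private address
# (the same address dig would show you there).
master_ip = socket.gethostbyname("prod-db.xxxxxxxx.us-east-1.rds.amazonaws.com")

# Connect to the *encrypted* replica and configure it to replicate from the
# private IP instead of the RDS hostname.
conn = pymysql.connect(
    host="prod-db-encrypted.xxxxxxxx.us-east-1.rds.amazonaws.com",
    user="admin", password="secret",  # placeholder credentials
)
with conn.cursor() as cur:
    cur.execute(
        "CALL mysql.rds_set_external_master(%s, %s, %s, %s, %s, %s, %s)",
        (master_ip, 3306, "repl_user", "repl_password",
         "mysql-bin-changelog.000042", 154, 0),  # placeholder binlog file/position, SSL off
    )
    cur.execute("CALL mysql.rds_start_replication")
conn.close()
```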

Action 2 (the easy path)

Allow the MySQL port from 0.0.0.0/0, opening your database to the internet. This approach is a lot easier, and in this case the RDS hostname can be used without a problem. Keep in mind, though, that exposing database access to the internet is not ideal from a security perspective, so use this method at your own risk.

After creating the encrypted replica, go to the AWS RDS console and create a read replica for that instance.

Note: If the Create read replica option is disabled, it means you need to enable automated backups on your instance first.

Disable the read_only option for the created replicas. (This can be done using RDS DB parameter groups.)
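
If you prefer to script these console actions, a boto3 sketch might look like the following; the instance identifiers and parameter group name are placeholders, and the custom parameter group is assumed to already be attached to the replicas.

```python
import boto3

rds = boto3.client("rds")

# Automated backups must be on, otherwise "Create read replica" is greyed out.
rds.modify_db_instance(
    DBInstanceIdentifier="prod-db-encrypted",   # placeholder
    BackupRetentionPeriod=7,
    ApplyImmediately=True,
)

# Create the second encrypted replica (the future analytics database),
# once the backup change has been applied.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="analytics-db-encrypted",
    SourceDBInstanceIdentifier="prod-db-encrypted",
)

# Disable read_only through the custom DB parameter group attached to the replicas.
rds.modify_db_parameter_group(
    DBParameterGroupName="custom-mysql-writable",  # placeholder parameter group
    Parameters=[{
        "ParameterName": "read_only",
        "ParameterValue": "0",
        "ApplyMethod": "immediate",
    }],
)
```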

The first replica is going to be the new production database and the second one analytics.

At the end of this step we will have the following setup:

Setup after completing Step 1

With this setup you have real-time replication from your production database to the encrypted replicas, so when you switch from the production database to the encrypted one you will not lose any data.

The next step will be the most challenging: we will replicate certain tables from the analytics database to the second encrypted replica.

Step 2: Dealing with analytics

The only unique data the analytics database holds is the data generated by AWS Lambda, which is stored in specific analytics tables; all other tables are production tables used only for joins in SQL queries. Our main goal in this section is to establish real-time replication between the analytics database and the encrypted analytics database (the second encrypted replica) for the analytics tables only.

AWS RDS does not support replicating only selected tables from one RDS instance to another. To overcome this we will use AWS DMS (Database Migration Service).
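
DMS decides which tables to copy through its table mappings. A selection rule like the sketch below includes only the analytics tables and leaves everything else out (the schema and table names are examples, not from the original setup); we will pass this JSON to the migration task later in this step.

```python
import json

# Include only the analytics tables; anything not matched by a selection
# rule is simply not migrated. Schema and table names are examples.
table_mappings = json.dumps({
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-analytics-tables",
            "object-locator": {
                "schema-name": "analytics",
                "table-name": "events_%",   # wildcard over the analytics tables
            },
            "rule-action": "include",
        }
    ]
})
```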

AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.

Navigate to the DMS console, then complete the following steps.

Note: Don’t forget to turn off the read_only option on the encrypted read replicas.

Before we begin the actual replication process, it is important to know that AWS DMS will transfer the data, but it will not transfer table indexes, auto-increment settings, and so on. More detailed documentation on what is and is not transferred can be found here.

To avoid such problems, we will export the table structure from the analytics database and create empty tables on the target with all of the indexes already set up. This can be done using MySQL clients such as Workbench or Navicat.
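
If you would rather script this than use a GUI client, one option (a minimal sketch; hosts, credentials, and table names are assumptions) is to copy the output of SHOW CREATE TABLE from the source to the target, which carries the index and AUTO_INCREMENT definitions with it:

```python
import pymysql

ANALYTICS_TABLES = ["events", "page_views"]  # example analytics table names

source = pymysql.connect(host="analytics-db.example.rds.amazonaws.com",
                         user="admin", password="secret", database="analytics")
target = pymysql.connect(host="analytics-db-encrypted.example.rds.amazonaws.com",
                         user="admin", password="secret", database="analytics")

with source.cursor() as src, target.cursor() as dst:
    for table in ANALYTICS_TABLES:
        # SHOW CREATE TABLE returns the full DDL, including indexes and
        # the AUTO_INCREMENT setting that DMS would otherwise not carry over.
        src.execute(f"SHOW CREATE TABLE `{table}`")
        create_statement = src.fetchone()[1]
        dst.execute(create_statement)

source.close()
target.close()
```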

Hint: This is a unique opportunity to add extra indexes to your tables if they are very big. In general, adding an index to an existing big table takes a lot of time, but here all data will be inserted from scratch, so there will never be a better time to do it.

Now that all empty tables are created on the target instance it is time to start the replication.

From the AWS DMS console, go to the dashboard and create a database migration task. All the fields are pretty straightforward to fill in. The most important things to know are the following:

  • Select Migrate existing data and replicate ongoing changes as the Migration type. (This ensures that your encrypted analytics database stays in sync with the current analytics database.)
  • Select Do nothing as the Target table preparation mode. (This preserves the tables we created on the target database and inserts the data into them.)
  • Enable CloudWatch logs (in case any errors occur during the replication).
  • Enable validation (this increases the overall migration time, but ensures that the data inserted into the target is identical to the data in the source).

Now click Create task, and the replication process will start.
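
For completeness, here is a sketch of what the equivalent task creation could look like through boto3. The endpoint and replication instance ARNs are placeholders, and the partial task settings mirror the console choices above (do-nothing table prep, CloudWatch logging, validation).

```python
import json

import boto3

dms = boto3.client("dms")

# Same idea as the selection rule sketched earlier: only the analytics schema.
table_mappings = json.dumps({"rules": [{
    "rule-type": "selection", "rule-id": "1", "rule-name": "analytics-only",
    "object-locator": {"schema-name": "analytics", "table-name": "%"},
    "rule-action": "include",
}]})

task = dms.create_replication_task(
    ReplicationTaskIdentifier="analytics-to-encrypted",
    SourceEndpointArn="arn:aws:dms:...:endpoint:SOURCE",    # placeholder ARNs
    TargetEndpointArn="arn:aws:dms:...:endpoint:TARGET",
    ReplicationInstanceArn="arn:aws:dms:...:rep:INSTANCE",
    MigrationType="full-load-and-cdc",   # "Migrate existing data and replicate ongoing changes"
    TableMappings=table_mappings,
    ReplicationTaskSettings=json.dumps({
        "FullLoadSettings": {"TargetTablePrepMode": "DO_NOTHING"},  # keep the pre-created tables
        "Logging": {"EnableLogging": True},                         # CloudWatch logs
        "ValidationSettings": {"EnableValidation": True},           # compare source and target rows
    }),
)

# Wait until the task is ready, then start the full load plus ongoing replication.
task_arn = task["ReplicationTask"]["ReplicationTaskArn"]
dms.get_waiter("replication_task_ready").wait(
    Filters=[{"Name": "replication-task-arn", "Values": [task_arn]}],
)
dms.start_replication_task(
    ReplicationTaskArn=task_arn,
    StartReplicationTaskType="start-replication",
)
```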

Hint: CloudWatch logs will provide useful information in case of errors during the replication process

Hint: If somehow CloudWatch logs are not present in the CloudWatch console then this article will help you a lot.

At this point, we will have the following setup:

Setup after completing Step 2

We have now recreated the exact same initial setup with encrypted database instances, which are in sync with the original databases. It is time to switch the databases.

Step 3: Switching the databases

To complete the final setup we need to point our API to the encrypted production database. Continue the tutorial from step 13, where we stopped in Step 1.

The last thing to do is point the analytics Lambda function at the encrypted analytics database so that it writes there. Optionally, after this step you can remove the replication task and replication instance from AWS DMS.
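
Assuming the Lambda function reads its database host from an environment variable (both the function name and the variable below are hypothetical), the switch can be a single configuration update:

```python
import boto3

lam = boto3.client("lambda")

# Point the analytics Lambda at the encrypted analytics database.
# Note: Environment replaces the function's entire set of variables,
# so include any other variables the function needs.
lam.update_function_configuration(
    FunctionName="analytics-ingest",          # hypothetical function name
    Environment={
        "Variables": {
            "DB_HOST": "analytics-db-encrypted.example.rds.amazonaws.com",
        }
    },
)
```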

At the end of this final step we will have the following setup:

As you can see, this setup does not cause any database downtime, and no data is lost during the migration process.

If you have read this far, congratulations, and I hope this guide helps you in your projects.
