MariaDB cluster in hybrid cloud

David Gurevich
Feb 13 · 5 min read

I have been looking for a more robust distributed relational database where one instance is in an on-premises data center and the other is in AWS. Initially, the approach I tried was using MySQL replication between an on-premises server and an AWS RDS instance. While this works, there are limitations, like difficulty in supporting more than two nodes and the overall fragility of a database acting as both master and slave. In my case, the two instances are a long distance apart, so the solution had to be robust enough to deal with the replication latency. The solution also had to be open-source and run on Linux.

I narrowed down my choices to MariaDB Galera Cluster and MySQL NDB Cluster. Unfortunately neither one (cluster) is supported by AWS RDS, so I had to abandon it and run the AWS database in an EC2 instance. The NDB Cluster has its own database engine, has been around longer and has telecom background which is attractive from perspective of robustness. MariaDB cluster is based on the Galera implementation. It is a replication layer on top of the standard InnoDB engine and comes as a part of MariaDB server, so there is nothing extra to install. My application relies on some InnoDB features which are not supported by the NDB engine. This aspect convinced me to use MariaDB cluster.

Here is a quick overview of the steps involved in the setup:

  • Set up the on-premises and AWS servers
  • Install MariaDB in both
  • Configure cluster
  • Test cluster functionality and performance

I installed MariaDB version 10.2. It is tempting to use the latest version, but if you want to do database encryption and leverage the AWS KMS plugin, this is the version to get — later versions do not seem to have the plugin at this moment.

I used Ubuntu Server 18.04 as the base OS on both servers. My on-premises server is hosted by VMware. I downloaded the ISO file from the official Ubuntu download host and installed it. The AWS server is a standard Ubuntu 18.04 AMI.

Let’s assume that the on-premises server “node-1” IP address is 10.0.0.10 and the AWS server “node-2” IP address is 172.0.0.10.

Install MariaDB

Don’t forget that version 10.2 is needed in order to use the AWS KMS Encryption Plugin if you plan on using database encryption and want to store encryption keys in KMS. The installation will prompt you to set the database root user password. Don’t forget to set it and save it. If you don’t set it, you’ll have to run the mysql_secure_installation script later.

Here I am doing the installation on node 1. Repeat the same steps on node 2.

Check the installation.

Enter root password. You should see something like this:

Optionally increase maximum column size. Via mysql command line:

Or add the following to the /etc/mysql/my.cnf configuration file:

Check network configuration on both nodes. The following ports must be open and reachable:

  • TCP 3306 — MySQL clients
  • TCP&UDP 4567 — replication traffic
  • TCP 4568 — ICT (Incremental State Transfer)
  • TCP 4444 — SST (State Snapshot Transfer)

On AWS side verify the security group associated with the EC2 and VPC settings. Also check any internal firewalls that may be setup between the on-premises server and the VPC. Test connectivity between the nodes to confirm that they are able to communicate.

Node 1 configuration

1. Stop MariaDB

2. Edit /etc/mysql/my.cnf configuration file and add the following:

3. Start cluster

Here, we are not simply starting the MariaDB server, but are initializing the cluster. This is only done for the first node in the cluster. All of the subsequent nodes are started normally.

4. Check status

Output should look something like this:

Initial value is 1 since this is the first node in the cluster. When other nodes are started, this value will be incremented accordingly.

Node 2 configuration

1. Stop MariaDB

2. Edit /etc/mysql/my.cnf configuration file and add the following:

3. Start MariaDB

Here, we are just starting MariaDB normally. The server will join the cluster based on configuration settings.

4. Check status

You should see the following output:

The value should change to 2. Every time a new node is added to the cluster, the value should reflect the total number of nodes.

Test cluster replication

Login to the on-premises server, add a database, table, and some data.

Verify that the changes propagate to the AWS server.

In my case, the latency is not noticeable. With large amounts of data, the delay may cause issues. MariaDB cluster documentation recommends running an odd number of nodes to break transaction deadlocks. The idea is that there is always a “majority” to resolve the deadlock. In practice, this may cause a rollback error which manifests itself as OperationalError: (1213, ‘Deadlock found when trying to get lock; try restarting transaction’) in the database transaction log. According to Galera documentation the general approach to solving these problems is by implementing application layer retry logic: WORKING AROUND MULTI-MASTER CONFLICTS.

References

What is MariaDB Galera Cluster

Getting Started with MariaDB Galera Cluster

Unsupported or Missing Features in NDB Cluster

David Gurevich

Written by

I’m a software developer in various roles involved with web apps, cloud services, networking, as well as embedded systems and wireless protocols.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade