Comparison of MySQL across AWS, Azure and GCP

An exhaustive comparison of MySQL cloud Offerings

Hero Image by Suresh

RDBMSs(Relational Database Management System) have been a common choice of database used for financial records, logistical information, personnel data and other applications since the 1980s. Most databases in widespread use are based on the relational database model.

Why RDBMS?

On advent of NoSQL databases, developers have more choice and there starts the confusion. To put it in simple words, If the data to be stored is structured, has relations and needs to be consistent, then RDBMS is the answer. MySQL is one of the widely used relational database. There are tons of resources out there in the internet educating on differences between RDBMS and NoSQL. This post is exclusively on comparing MySQL across various cloud providers.

Why RDBMS in cloud?

Databases are critical to any application and managing them is a nightmare. A separate team of people has to patch the servers, update the MySQL versions, maintain the Master-Slave and read replicas, provide role-based access control(RBAC) to developers and create backups regularly. RDBMS cloud offerings are managed versions of your RDBMS databases that solves all the above problems. Here is the list of features provided by managed cloud RDBMS service:

  1. Flexible choice of CPU & RAM
  2. Storage
  3. Master-Slave Replication
  4. Read Replica
  5. Backup and Point-in-time recovery
  6. Data Import/Export setup
The rest of the post is all about comparing the above features across AWS, Azure and GCP’s RDBMS offerings.

Since each cloud provider supports various relational databases, we are going to compare the cloud offerings based on MySQL. Talking about the cloud ecosystem, each cloud provider offers a variety of services starting from compute, database to Machine Learning and IoT. But Compute and Database services are literally the deciding factor for adopting the specific cloud provider. I had a tough time understanding the differences of MySQL cloud offerings, so here is an exhaustive documentation of the above features across various clouds.

1. Available RDBMS, choice of CPU and RAM

I am going to talk about the relational databases offered by various clouds and their jargon of CPU and RAM. Generally, databases use more memory (RAM) than CPU. One of the deciding factor in choosing the right database instance is working set.

Working Set is basically the amount of data and indexes that will be active/in use by your system. Eg: Say, you have 1 year’s worth of data. For simplicity, each month relates to 1GB of data giving 12GB in total, and to cover each month’s worth of data you have 1GB worth of indexes again totalling 12GB for the year. If you are always accessing the last 12 month’s worth of data, then your working set is: 12GB (data) + 12GB (indexes) = 24GB. However, if you actually only access the last 3 month’s worth of data, then your working set is: 3GB (data) + 3GB (indexes) = 6GB.

Best Practice is to allocate enough RAM so that your working set resides almost completely in memory.
AWS vs Azure vs GCP based on CPU & RAM
P.S: Apologies for the poor table UI. Believe me, there is no proper way for drawing tables in medium. The above one was the only way to render multi-line table. Let me know in comments, if there is any other way for drawing tables in medium.

2. Storage

Storage and memory are often ambiguous terms in the cloud. Storage denotes the hard disk (HDD/SSD) and the memory denotes the RAM. Disk reads-writes are denoted as IOPS (I/O Operations per second). Higher the IOPS, lower the disk latency and vice-versa. Let us see all the available storage options:

AWS vs Azure vs GCP based on storage

Numerous components on a network, such as DNS servers, load balancers, databases and others can generate errors anywhere in the life of a given request. These erroneous requests should be re-tried by Exponential Back-Off method.

Exponential Back-Off is a standard error handling strategy for network applications in which a client periodically retries a failed request with increasing delays between requests.

Clients should use exponential back-off for all requests that return HTTP 5xx and 429response codes.

3. Master-Slave Replication

Master-Slave replication or failover replication is necessary to ensure high availability in any database. High availability feature is not a scaling solution. You cannot use a stand-by replica to serve read traffic. To serve read-only traffic, you should use a Read Replica. Let us see the failover options for MySQL instances across various cloud providers:

AWS vs Azure vs GCP based on High Availability

Failover process will introduce downtime, which should be handled by implementing exponential back-off requests in the client.

4. Read Replica

By Default, relational databases do not scale out (scale horizontally). But you can deploy read replicas to scale out beyond the capacity constraints of a single DB instance for read-heavy database workloads. Read replicas are quite useful in case of small to medium analytics job. Let us see the read replica options for MySQL across various cloud providers:

AWS vs Azure vs GCP based on Read Replica

5. Backup and Point-in-time recovery

Backups are critical to any database. This helps to recover/restore your lost data, in case of any issues to your DB. Generally, cloud providers offer two types of backups namely Automated backup and On-Demand backup. Automated backups are taken every day/hour depending upon the cloud provider and can be restored to any desired point-in-time within the backup retention period. On-Demand backups are user-initiated snapshots of the given database. Let us see the backup and restore options for MySQL across various cloud providers:

Primer: A backup window is the time slot when it is most suitable to back up the database. backup retention period refers to how long these backup are kept for restore.
AWS vs Azure vs GCP based on backup

6. Data Import/Export

All cloud providers offer the traditional ways of importing/exporting data to/from the cloud database such as dump and load, replication technologies. It is up to the user to decide his/her convenient method to import/export the database data.

Best practice is to disable the automated backup and failover replication, which will improve/fasten the load time.

Summary

I have not compared the MySQL offerings based on price, as there is no common ground(same cores, RAM, features) across the cloud providers. Here are the advantages of each cloud provider with respect to MySQL:

AWS:

  1. More choice on the DB instance types such as standard, memory-optimized and burstable.
  2. Storage has PIOPS (Provisioned - IOPS) feature which allocates fixed IOPS irrespective of storage allocation.
  3. Cross-region read replicas which helps in global scale-out, disaster recovery and migration.

Azure:

  1. MySQL in azure is more like PAAS (platform-as-a-service), which has zero to few administration tasks.
  2. No need to deploy stand-by/failover replica, but ensures High Availability. Thus significantly reducing the cost of database.
  3. Automated backups are taken every 5 minutes, while in AWS and GCP, backups are taken once in a day.

GCP:

  1. Auto-scaling of database storage. No need to manually increase the storage when certain threshold is reached.
  2. Failover replicas can be used as read replicas, while in AWS and Azure failover replicas are only for high availability.
  3. Backup retention period is life time of the database compared to 35 days in AWS and Azure.

Thank you for reading. If you find something wrong or better ways to do it, let me know in the comments below.

If you like the post, hit the 👏 button below so that others may find it useful. You can follow me on Twitter.