RDS or EC2 - what’s the best option for your DB?
The big cloud providers encourage you to use their database-as-a-service (DBaaS) offering when it comes to provisioning a database in the cloud. But is it always the best choice? And if not, what are the criteria for choosing a DBaaS over a self-managed database?
In this post I will outline my personal experience, supplemented by some internet research on the topic (see reference section) as well as the results of several discussions with Amazon RDS users. While most of my arguments should be vendor and tool agnostic, all details and calculations are based on Amazon RDS for MySQL.
Database services such as Amazon RDS shift responsibilities to the platform vendor, thus freeing you from all system administration duties and certain database administration duties. As usual, vendors such as Amazon request a surcharge for this service. So let’s have a look at the RDS for MySQL cost structure and compare it to Amazon’s pricing for infrastructure-level services like EC2, EBS, etc.
Depending on the instance type, Amazon charges between 31% and 46% on top of the regular EC2 instance prices.
Typically one would select Multi-AZ for high availability. Unfortunately, the hot standby instance cannot be used as a read replica.
When you add read replicas to scale out, the RDS cost difference increases further, as “Read Replicas are billed at the same rates as standard DB Instances”.
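To make the instance surcharge more tangible, here is a minimal Python sketch. The function names, the 730 hours per month and the example hourly price are my own assumptions for illustration, not Amazon figures. It also assumes that a Multi-AZ deployment is billed like two instances, which you should verify against the current price list.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def ec2_monthly_cost(ec2_hourly_usd, instances):
    """Monthly cost of self-managed EC2 instances (compute only)."""
    return ec2_hourly_usd * instances * HOURS_PER_MONTH


def rds_monthly_cost(ec2_hourly_usd, surcharge, multi_az=True, read_replicas=0):
    """Monthly RDS compute cost derived from the EC2 price plus a surcharge.

    surcharge: fraction on top of the EC2 price, e.g. 0.31 to 0.46.
    multi_az:  assumption: the standby is billed like a second instance.
    read_replicas: billed at the same rate as a standard DB instance.
    """
    rds_hourly = ec2_hourly_usd * (1 + surcharge)
    billed_instances = (2 if multi_az else 1) + read_replicas
    return rds_hourly * billed_instances * HOURS_PER_MONTH


if __name__ == "__main__":
    hourly = 0.50  # hypothetical EC2 on-demand price in USD per hour
    for surcharge in (0.31, 0.46):
        rds = rds_monthly_cost(hourly, surcharge, multi_az=True, read_replicas=2)
        ec2 = ec2_monthly_cost(hourly, instances=4)
        print(f"surcharge {surcharge:.0%}: RDS ${rds:,.2f} vs EC2 ${ec2:,.2f}")
```

The comparison uses four EC2 instances (master, standby and two replicas) so that both sides cover the same number of machines.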
Storage & Backup
The table below compares the prices for EBS (Elastic Block Store) with the RDS storage costs.
For General Purpose SSD storage the difference is around 2 x $21 per TB-month in a Multi-AZ environment. That’s rather moderate, even for a database as large as Zalando’s (5 TB).
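As a quick sanity check of the storage figure, the arithmetic can be sketched in a few lines of Python. The function name and parameters are hypothetical; the $21 per TB-month difference and the factor of two for Multi-AZ are the numbers quoted above.

```python
def rds_storage_premium(size_tb, premium_per_tb_month=21.0, copies=2):
    """Extra monthly storage cost of RDS compared to plain EBS, in USD.

    copies=2 reflects the Multi-AZ setup, where the data is stored twice.
    """
    return size_tb * premium_per_tb_month * copies


if __name__ == "__main__":
    # A 5 TB database in a Multi-AZ environment
    print(rds_storage_premium(5))  # prints 210.0
```

So even at 5 TB the storage premium stays in the range of a few hundred dollars per month.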
IMO the traffic costs are of minor relevance, assuming that
- your database will be accessed by your application servers only using private IP addresses
- database and application servers will be in the same availability zone (then traffic will be free) or at least in the same region (then traffic will be $0.01/GB)
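Under these two assumptions the traffic bill is easy to estimate. The following sketch uses a hypothetical function name and a made-up monthly volume; only the $0.01/GB rate and the free same-zone traffic come from the list above.

```python
CROSS_AZ_RATE_USD_PER_GB = 0.01  # same region, different availability zone


def monthly_traffic_cost(gb_per_month, same_az):
    """Traffic cost between application and database servers, in USD."""
    rate = 0.0 if same_az else CROSS_AZ_RATE_USD_PER_GB
    return gb_per_month * rate


if __name__ == "__main__":
    volume_gb = 2000  # hypothetical monthly traffic between app and DB tier
    print(f"same AZ:     ${monthly_traffic_cost(volume_gb, same_az=True):.2f}")
    print(f"same region: ${monthly_traffic_cost(volume_gb, same_az=False):.2f}")
```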
The compute instance type seems to be the main cost driver. With your RDS billing sheet at hand, calculating the break-even point between an in-house MySQL specialist and the RDS service should be straightforward.
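A rough way to estimate that break-even point is to divide the monthly cost of a specialist by the RDS surcharge per instance. The sketch below is only an illustration: the function name, the salary figure, the hourly price and the 730 hours per month are assumptions of mine, and it ignores storage, traffic and any work that remains with your team even on RDS.

```python
import math


def break_even_instances(specialist_monthly_usd, ec2_hourly_usd, surcharge,
                         hours_per_month=730):
    """Number of billed RDS instances at which the surcharge reaches
    the monthly cost of an in-house MySQL specialist."""
    surcharge_per_instance = ec2_hourly_usd * surcharge * hours_per_month
    return math.ceil(specialist_monthly_usd / surcharge_per_instance)


if __name__ == "__main__":
    # Hypothetical inputs: $8,000 per month for the specialist,
    # $1.00/hour EC2 price, 40% RDS surcharge
    print(break_even_instances(8000, 1.00, 0.40))
```

Take the real numbers from your own billing sheet; the smaller your instances, the more of them you need before a dedicated specialist pays off.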
2. Missing Features
At the time of this writing Amazon RDS does not support all MySQL features. Below is an incomplete list of the most relevant missing parts:
Federated Storage Engine is currently not supported by Amazon RDS for MySQL.
The FEDERATED storage engine lets you access data from a remote MySQL database without using replication or cluster technology. Querying a local FEDERATED table automatically pulls the data from the remote (federated) tables. No data is stored on the local tables.
User-Defined Functions (UDF)
User-defined functions are compiled as object files and then added to and removed from the server dynamically using the CREATE FUNCTION and DROP FUNCTION statements.
To use UDFs you must be able to install object files in addition to the server itself. Unfortunately, Amazon RDS does not grant shell level access to the underlying compute platform.
3. Support for MySQL Versions
At the time of this writing Amazon supports several minor versions of MySQL 5.5 and MySQL 5.6 for new instances.
(…) as a general guidance, we aim to support new engine versions within 3–5 months of their general availability.
Here is a general statement of Amazon RDS’s deprecation policy:
* We intend to support major version releases (e.g., MySQL 5.6) for at least 3 years after they are initially supported by Amazon RDS.
* We intend to support minor versions (e.g., MySQL 5.6.21) for at least 1 year after they are initially supported by Amazon RDS.
Source: Amazon RDS FAQs
This update policy means that it may take a few months before you can take advantage of the fixes and features a new version provides.
At first glance the support period seems pretty fair. However, if you want to keep full control over the MySQL version in use by turning off “Auto Minor Version Upgrade”, you are forced to check for and upgrade to new MySQL versions roughly once a year - an extra effort that may add value to your business!
4. Optimize for Latency
Amazon recommends using placement groups to optimize latency.
A placement group is a logical grouping of instances within a single Availability Zone. Using placement groups enables applications to participate in a low-latency, 10 Gbps network. Placement groups are recommended for applications that benefit from low network latency, high network throughput, or both.
This option is not available for RDS instances. However, you can set up EC2-backed read replicas and add them to a placement group with your application tier servers.
For startups and low-traffic sites, Amazon RDS and similar services provide a cost-effective alternative. Out of the box you get a managed, highly available, easy-to-scale DB system with a point-in-time backup mechanism, which allows you to focus on other aspects of your business and product.
As database load increases you may reach a break-even point where EC2-backed MySQL instances managed by your own staff become more reasonable - both from a cost and a performance tuning perspective.
Migrating from RDS to EC2 has become easier since Amazon supports replication to export data from a MySQL 5.6 RDS instance to a MySQL instance running external to Amazon RDS.