Introduction

Today it is hard to imagine a high-load production environment without a database. It has become the standard service for storing data. We install, manage, and work with databases every day, yet some areas remain murky, and performance is one of them. Plenty of articles have been written about database tuning and optimization. They are handy when you already run a database and know that its performance falls short of your needs. Those articles won’t help you, however, if you need to launch a new database on a cloud service such as AWS, Azure, or Rackspace.

I am sure the following questions will be familiar to some of you, because I’ve been asked them a lot:

  • Should I use AWS or bare metal for my Database?
  • What server type should I choose on Amazon?
  • Should I use Amazon RDS or EC2 service for my Database?
  • Should I use dedicated or shared tenancy on Amazon?
  • How many transactions can a given server type handle?

The goal of this article is to answer the questions above. Of course, there is no straight answer to any of them, and most of the time it will be “depends on…”, but I hope my analysis helps you make the right decision.

Testing environment

In the left corner of the ring I have a bare metal server, an HP DL380 G9, with the following specification:

CPU: 16 cores (Dual Socket Octo Core Intel Xeon E5–2630v3 2.4GHz, #Processors: 2, #Cores per Proc: 8)
RAM: 128 GB
DISKS: 500 GB RAID 5 SSD

And in the right corner of the ring I have two different Amazon services: EC2 and RDS. To approximate the bare metal server I will use two database instance configurations: DB1 (memory optimized) and DB2 (CPU optimized). Their specifications are as follows:

DB1 server:

r3.4xlarge (memory optimized)
16 cores
122 GB RAM
320 GB SSD Instance Storage

DB2 server:

c3.8xlarge (compute optimized)
32 cores
60 GB RAM
750 GB io1 EBS 7500 IOPS

I will also test Dedicated and Shared tenancy, as well as EBS optimization on instances where it is not enabled by default (e.g. r3.4xlarge).
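For readers who want to reproduce the tenancy and EBS-optimization variants, here is a minimal sketch, assuming the instances are launched with the AWS CLI (this article does not say how they were launched). It only builds and prints an `aws ec2 run-instances` command line; it does not execute it. The function name `build_launch_cmd` and the `AMI_ID` variable are hypothetical placeholders, and the `--ebs-optimized` and `--placement Tenancy=...` options should be checked against the AWS CLI documentation for your version.

```shell
#!/usr/bin/env bash
# Sketch only: prints an `aws ec2 run-instances` command instead of running it.
# AMI_ID is a hypothetical placeholder; no AMI is named in this article.
AMI_ID="${AMI_ID:-ami-PLACEHOLDER}"

# build_launch_cmd INSTANCE_TYPE TENANCY
#   TENANCY: "default" (shared hardware) or "dedicated"
build_launch_cmd() {
  instance_type="$1"
  tenancy="$2"
  echo aws ec2 run-instances \
    --image-id "$AMI_ID" \
    --instance-type "$instance_type" \
    --ebs-optimized \
    --placement "Tenancy=$tenancy"
}

# Print the shared and dedicated r3.4xlarge variants.
build_launch_cmd r3.4xlarge default
build_launch_cmd r3.4xlarge dedicated
```

Printing the command first makes it easy to review the options before paying for a dedicated instance; dropping the `echo` would turn the sketch into a real launch.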

Disclaimer:

  • I didn’t tune the RDS or EC2 services; I used only the default config files.
  • Results may vary depending on the availability zone (AWS).
  • Amazon HVM virtualization has some overhead compared to bare metal.

Testing Conditions

Here are the conditions I used for all the tests.

  • Operating system for EC2 and bare metal: Ubuntu 14.04 LTS x64
  • EC2 instance types for testing: r3.4xlarge and c3.8xlarge
  • RDS instance types for testing: db.r3.4xlarge and db.m4.10xlarge (db.c3.8xlarge is not available on RDS, so I replaced it with a slightly better tier)
  • Volumes for testing: 320 GB SSD for DB1, and 750 GB io1 7500 IOPS for DB2
  • MariaDB engine: 10.0.17
  • Sysbench version: 0.4.2
  • Sysbench test: OLTP complex test
  • Sysbench run command:
sysbench --db-driver=mysql --num-threads=$THREADS --max-requests=0 --test=oltp --mysql-table-engine=innodb --oltp-table-size=2000000 --max-time=60 --mysql-engine-trx=yes --mysql-user=$USER --mysql-password=$PASSWORD --mysql-host=$HOST run
  • Allowed incoming connections (my.cnf config file): 300
  • Threads: from 1 to 256
  • Sysbench host: localhost for EC2, same AZ (availability zone) for RDS
  • Results: transactions per second values from sysbench run
  • Each sysbench test was run in isolation from the other tests
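The conditions above can be turned into a small sweep script. This is a sketch, not the exact harness used for these tests. The function names `extract_tps` and `run_sweep` are made up for illustration. The power-of-two thread list is an assumption, since only the range 1 to 256 is given. The parser assumes sysbench 0.4-style output whose summary contains a line shaped like `transactions: 12345 (205.73 per sec.)`. The test table also has to be created beforehand by running the same command with `prepare` instead of `run`.

```shell
#!/usr/bin/env bash
# Sketch: run the sysbench command at several thread counts
# and print "threads,tps" pairs as CSV.

# Assumption: powers of two; the article only says "from 1 to 256".
THREAD_COUNTS="1 2 4 8 16 32 64 128 256"

# Read sysbench output on stdin and print the per-second transaction rate.
# Assumes a summary line shaped like:
#     transactions:                        12345  (205.73 per sec.)
extract_tps() {
  awk '/^ *transactions:/ { gsub(/[()]/, "", $3); print $3 }'
}

# Run the sweep. USER, PASSWORD and HOST must be set by the caller,
# as in the run command listed above.
run_sweep() {
  echo "threads,tps"
  for THREADS in $THREAD_COUNTS; do
    tps=$(sysbench --db-driver=mysql --num-threads="$THREADS" \
      --max-requests=0 --test=oltp --mysql-table-engine=innodb \
      --oltp-table-size=2000000 --max-time=60 --mysql-engine-trx=yes \
      --mysql-user="$USER" --mysql-password="$PASSWORD" \
      --mysql-host="$HOST" run | extract_tps)
    echo "$THREADS,$tps"
  done
}
```

Calling `run_sweep > results.csv` would then produce one row per thread count, ready for plotting.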

AWS “Best Practices”

Before jumping into the tests themselves, I want to show you some Amazon best-practice guidance. Let’s treat it as a set of recommendations and compare it with the results I got.

  • Use RDS for databases. For such services AWS uses highly optimized virtual machines and servers (these services may be located in separate data centers from EC2).
  • Using a dedicated EC2 instance should result in better performance. More information here.
  • Dedicated Instances are Amazon EC2 instances that run in a VPC on hardware that’s dedicated to a single customer. Your Dedicated Instances are physically isolated at the host hardware level from your instances that aren’t Dedicated Instances and from instances that belong to other AWS accounts.
  • With shared EC2 tenancy, machines belonging to other customers can degrade the performance of yours (this may be the case for your customer).
  • Use the “EBS-optimized” option for EC2 instances if it is available (not all instance types have it enabled by default). More information here.

Results

As I said earlier, we have two fighters in the ring, so the results are bare metal vs AWS. Since we are using a couple of Amazon Web Services, there are multiple tests:

  • EC2 and RDS based on DB1 performance compared to bare metal
  • EC2 and RDS based on DB2 performance compared to bare metal

Graph: AWS DB1 vs Bare Metal

Graph: AWS DB2 vs Bare Metal

You have just seen the results, and they are quite interesting. Let’s analyze them:

  • For both EC2 servers the values look very similar: they grow linearly up to 16 threads, then plateau and stay at almost the same level as the thread count increases.
  • EBS optimization slightly increases transactions per second at higher thread counts.
  • Dedicated tenancy doesn’t change anything. Results are the same for dedicated and shared tenancy (really unexpected).
  • “Compute Optimized” instances perform slightly worse than “Memory Optimized” ones in both the EC2 and RDS tests.
  • At lower thread counts RDS performs slightly worse than EC2 or bare metal and lags behind until 16 threads. From 16 threads on, it skyrockets and beats EC2 by a big margin. At 256 threads the RDS result is about three times the EC2 one.
  • Bare metal shows consistent results at lower thread counts and loses to RDS starting from 128 threads.
  • Bare metal shows its best result at 32 threads and degrades at higher values.

Summary

As you can see from the graphs above, EC2 instances do not perform well in high-load environments with a huge number of connections. So to the question “Should I use Amazon RDS or EC2 service for my Database?” I will definitely say: “Depends on…”. If your production runs a high-load database with a huge number of connections, you should definitely choose RDS. RDS shows really decent results compared to bare metal, despite the gap at lower thread counts. But if your production uses a cluster with a couple of slaves and the number of threads stays below 16, then EC2 is the better choice, as it performs slightly better at lower thread counts.

As for the question “Should I use AWS or bare metal for my Database?”, the answer is also “depends on…”; just keep these notes in mind:

The really strange thing during these tests was that dedicated instances performed the same as shared ones. I ran a couple of tests in different Availability Zones but got the same results. So for now, dedicated tenancy is not a performance boost for your instance; it is a security enhancement.

And yes, I still can’t answer the question “What server type should I choose on Amazon?”, but I can definitely advise you to launch DB servers on RDS for high-load environments. If you want to use EC2, then Memory Optimized instances with EBS optimization are the best you can get from it.