Yet another AWS Graviton instances performance review

Alex Venediktov
Spin.AI Engineering Blog
7 min read · Dec 14, 2021
1. Introduction

2. Review of the tested configurations

3. Test methods and scenarios

4. Testing progress and results

5. Charts and analysis

6. Conclusions

7. References

1. Introduction

We want to decrease the cost of our growing infrastructure without reducing the overall quality and performance of the service.

The goal was to ensure that the new instances with the AWS Graviton architecture (hereafter Graviton) and gp3 EBS volumes can handle our databases at least at the level of the current x86/gp2 instances. We also tried to find out the peculiarities of the new architecture as applied to our database management system (DBMS).

After the research we scheduled the migration of our entire infrastructure (starting with the database clusters) to the new architecture in the AWS data center by the end of 2021. At the moment of publication we have already migrated the first database cluster to Graviton instances and are very pleased with the results.

2. Review of the tested configurations

Our typical primary database instance configuration in the AWS cloud is m5.xlarge (4 Intel Cascade Lake cores, 16 GB of RAM) with a 1500 GB gp2 volume. The data partition uses XFS and is mounted with the “defaults,noatime” options. This system provides sufficient CPU and disk performance. An additional reason for using such a powerful instance in the AWS cloud is the disk bandwidth limit that Amazon applies depending on the instance type. Due to the structure of our databases and the peculiarities of our system, the amount of RAM is also crucial for us. So we prepared the instance configurations described in Sheet 1 for testing. In addition, we tested our current mainline database instance, m5.xlarge, with the new type of disk in two modes, and some similar instances with the same amount of RAM and the current “regular” gp2 drive for reference.
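
For reference, a data volume prepared this way could look like the following; the device name and mount point are examples, not the exact ones from our setup:

# Create the XFS file system on the data volume (example device name).
sudo mkfs.xfs /dev/nvme1n1

# Mount it for the PostgreSQL data directory with access-time updates disabled.
sudo mkdir -p /var/lib/postgresql
sudo mount -o defaults,noatime /dev/nvme1n1 /var/lib/postgresql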

Configurations for research. Sheet 1.

All instances have 16 GB of RAM and a 1500 GB EBS volume.

3. Test methods and scenarios

Since we currently use PostgreSQL 12 as our main production DBMS, we used the built-in PostgreSQL benchmarking tool, pgbench [3].

All testing scenarios [4] and data sets were based on the configuration of the baseline (current) instance (m5.xlarge). See Sheet 2.

General testing scenarios. Sheet 2.
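
For illustration, each scenario boils down to a pgbench initialization followed by a timed run; the scale factor, client count, and thread count below are placeholders rather than the actual values from Sheet 2:

# Initialize the benchmark database at a scenario-specific scale factor.
pgbench -i -s 1000 benchmark

# Run the default TPC-B-like read/write workload for 600 seconds.
pgbench -c 32 -j 4 -T 600 benchmark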

PostgreSQL 12 was configured according to best tuning practices [5, 6, 7], and the pg_stat statistics directory was moved to a RAM drive.
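
One common way to set this up is to point PostgreSQL’s stats_temp_directory at a tmpfs mount; a minimal sketch, with a placeholder path and size:

# Mount a small RAM-backed tmpfs for the statistics temp files.
sudo mkdir -p /var/lib/postgresql/pg_stat_ramdisk
sudo mount -t tmpfs -o size=256M tmpfs /var/lib/postgresql/pg_stat_ramdisk
sudo chown postgres:postgres /var/lib/postgresql/pg_stat_ramdisk

# Then, in postgresql.conf:
#   stats_temp_directory = '/var/lib/postgresql/pg_stat_ramdisk'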

The most notable performance tuning settings in postgresql.conf, together with some additions specific to our setup, were:

max_connections = 512
shared_buffers = 4GB
work_mem = 200MB
maintenance_work_mem = 1GB
effective_cache_size = 12GB
max_worker_processes = 2
max_parallel_workers_per_gather = 1
max_parallel_workers = 2
max_parallel_maintenance_workers = 1
random_page_cost = 1.1
huge_pages = try

We used Ubuntu Server 20.04 LTS (HVM) with the following sysctl.conf tuning:

fs.file-max=65536
kernel.exec-shield=1
kernel.randomize_va_space=1
vm.nr_hugepages=5120
vm.dirty_background_ratio=5
vm.dirty_background_bytes=67108864  # 64 MB
vm.dirty_bytes=536870912            # 512 MB
vm.swappiness=1
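
Since huge_pages = try lets PostgreSQL silently fall back to regular pages when the reserved pool cannot be used, it is worth checking after startup that the pool is actually consumed, for example:

# Show the kernel huge page counters; after PostgreSQL starts, a drop in
# HugePages_Free (or a non-zero HugePages_Rsvd) means shared_buffers was
# allocated from the reserved huge page pool.
grep -i '^hugepages' /proc/meminfo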

After running the test scenarios suggested in the pgbench usage recommendations [4], we designed short additional tests to compare instance performance on very tiny datasets under a normal load. We used a tiny dataset (pgbench -i -s 60 benchmark) with a normal load in the first three scenarios (Read/Write, Read-only, Simple write) and selectively re-tested the configurations described in Sheet 3. All configurations were supplied with gp3, 4500 IOPS EBS volumes.
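
A sketch of what these short runs can look like, assuming the usual mapping of the wiki scenarios to pgbench’s built-in scripts (-S for the select-only “Read-only” case and -N for the “Simple write” case); the client/thread counts and duration are placeholders:

# Tiny dataset, as described above.
pgbench -i -s 60 benchmark

# Read/Write: default TPC-B-like script.
pgbench -c 16 -j 4 -T 600 benchmark

# Read-only: built-in select-only script.
pgbench -S -c 16 -j 4 -T 600 benchmark

# Simple write: built-in simple-update script (skips the branch/teller updates).
pgbench -N -c 16 -j 4 -T 600 benchmark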

Configurations for additional short tests. Sheet 3.

4. Testing progress and results

We used the AWS cloud in the N. Virginia region for the tests, but we did not find any significant differences between the results across regions (as we expected). We compared the number of transactions for each configuration and test case. The averaged results, expressed as the number of operations per 600 seconds, are shown in Sheet 4: part 1 covers the “Read/Write,” “Read-only,” “Simple write,” and “Mostly cache read/write” scenarios, and part 2 covers the “Buffered read/write,” “On-disk read/write,” “Heavy contention read/write,” and “Heavy reconnection read/write” scenarios.

Results of the additional benchmark scenarios with very tiny datasets are presented in Sheet 5.

Average results of benchmarks. Part 1. Sheet 4.
Average results of benchmarks. Part 2. Sheet 4.
Average results of additional tests on tiny datasets. Sheet 5.

5. Charts and analysis

We suppose that in the Read/Write test presented in Fig 1 below we hit the bottleneck of our test system, namely the disk subsystem. However, it is a good test for comparing gp2 and gp3 disks without provisioned IOPS. Moreover, it is a good example that in some cases you don’t need that many cores for databases.

Additionally, we found that gp3 disks provisioned with 4.5k IOPS show slightly lower performance in the Read-only test than instances with gp2 volumes (Fig 2). In the Simple write test, the m6g.xlarge instance with the boosted-IOPS disk has a very slight advantage over all comparable instances (Fig 3).

Read/write test, normal load. — Figure 1.
Read-only test, normal load. — Figure 2.
Simple write test, normal load. — Figure 3.

Reducing the dataset size in the Mostly-cache (Fig 4) and Buffered (Fig 5) benchmarks widens the gaps between the instances. Once the data fits into the PostgreSQL and Linux file-system caches, the number of CPU cores and their per-core performance become the significant factors.
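
To put a rough number on that, using the commonly cited approximation of about 15 MB of pgbench data per scale unit (our assumption, not a measured figure): a scale factor of 60, as in our additional tiny-dataset tests, yields roughly 60 × 15 MB ≈ 0.9 GB of data, which fits entirely into the 4 GB of shared_buffers, let alone the 16 GB of RAM, so reads almost never have to touch the EBS volume.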

Read/write test, low load, small dataset. — Figure 4.
Read/write test, low load, tiny dataset. — Figure 5.
On-disk Read/write test, low load, huge dataset. — Figure 6.

The heavy contention case (Fig 7) shows almost the same relative performance gaps as the first three tests. However, here we can see that the 2-core instances lack CPU performance compared to the 4-core instances, while even a small IOPS boost has a more significant impact on overall database performance. We reduced the test dataset for this case and found a performance gap between the m5a and the m5/m6g instances, which confirmed our hypothesis.

Heavy contention, small dataset, high load. — Figure 7.

The Heavy re-connection test brought some surprises (Fig 8). We got consistently lower results for the new Graviton instances than for the x86 instances. Frankly, we do not know whether this is a peculiarity of PostgreSQL on the ARM platform, a lack of network performance, or something in the Ubuntu system tuning for ARM. In any case, we consider the results unexpected and intriguing.

Heavy reconnection, small dataset, high load. — Figure 8.

As we expected, the additional testing with a tiny dataset shows that the “pure performance” of the Graviton instances really does exceed that of the x86 instances, or is at least at about the same level (Fig 9).

Very tiny dataset, normal load. — Figure 9.

6. Conclusions

We did not find any significant performance differences between large gp2 volumes and IOPS/throughput-boosted gp3 volumes. gp3 volumes provide the same (or almost the same) level of performance at a lower price and with more flexibility than gp2 volumes, and this is a big advantage.
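
That flexibility also makes the switch itself straightforward: an existing gp2 volume can be converted to gp3 in place, with IOPS and throughput adjusted independently afterwards. A minimal sketch with the AWS CLI, where the volume ID and the IOPS/throughput values are placeholders:

# Convert an existing gp2 volume to gp3 with provisioned IOPS and throughput.
aws ec2 modify-volume \
    --volume-id vol-0123456789abcdef0 \
    --volume-type gp3 \
    --iops 4500 \
    --throughput 250

# Follow the progress of the modification.
aws ec2 describe-volumes-modifications --volume-ids vol-0123456789abcdef0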

Summing up the tests described above, our conclusions on the new instance performance are as follows:

  • A better performance/price ratio than for x86 instances: you get the same or almost the same performance at a lower price per hour.
  • Slightly better performance than x86 in some cases.
  • The network performance of x86 instances is still better than that of Graviton. (Possibly this is due to differences between the x86 and ARM builds of PostgreSQL, but in any case we got a significantly lower result in the last test with heavy reconnection.)
  • Migration to the ARM architecture might be tricky in some rare cases, due to missing ARM support in needed packages or modules.

And yes, we have already scheduled the migration of our database clusters to the new Graviton instances and new gp3 disks for the coming quarter.

7. References

1. AWS Graviton2 instance description. https://aws.amazon.com/en/ec2/graviton/

2. AWS gp3 EBS volumes announcement. https://aws.amazon.com/en/about-aws/whats-new/2020/12/introducing-new-amazon-ebs-general-purpose-volumes-gp3/

3. pgbench official documentation. https://www.postgresql.org/docs/12/pgbench.html

4. Pgbench usage scenarios recommendation. https://wiki.postgresql.org/wiki/Pgbenchtesting

5. Tuning PostgreSQL database parameter to optimize performance. https://www.percona.com/blog/2018/08/31/tuning-postgresql-database-parameters-to-optimize-performance/

6. A comprehensive guide on how to tune database parameters and configuration for PostgreSQL. https://www.enterprisedb.com/postgres-tutorials/comprehensive-guide-how-tune-database-parameters-and-configuration-postgresql

7. Optimize PostgreSQL server performance. https://blog.crunchydata.com/blog/optimize-postgresql-server-performance

Alex Venediktov
Spin.AI Engineering Blog

Ops/DevOps at Spin Technology. Ph.D. in computer science.