Managing persistent storage for PostgreSQL in Amazon EKS

Jeffrey Wang
Published in CloudX at Fidelity
Apr 9, 2021

According to a recent Cloud Native Computing Foundation (CNCF) survey, 83% of respondents are using Kubernetes in production, 55% are running stateful applications in containers in production, and 33% plan to autoscale their stateful applications. Understanding the scalability options, and the availability impact of scaling, is therefore critical for development teams that manage applications in a Kubernetes cluster. In Kubernetes, users can rely on autoscaling features to manage stateless applications, but how do we scale the persistent storage for stateful applications?

Kubernetes lends itself naturally to stateless applications; running them is a well-known, well-documented pattern, and it is relatively straightforward to design a highly available stateless application on Kubernetes. However, many applications have to support stateful functions. StatefulSet is the standard Kubernetes mechanism for managing stateful applications, and users can use it in Amazon EKS as well.

Many database solutions exist outside of EKS, but when the EKS cluster is managed by a separate team and/or runs in a separate cloud, relying on an external database can add complexity for development teams. In addition, some users may not need all of the features those database solutions provide.

From a cost perspective, development teams may be able to save money by managing a database themselves in a managed Kubernetes service (e.g. AKS, EKS). The table below compares the cost of running Amazon Aurora PostgreSQL with the cost of running a PostgreSQL database in Amazon EKS.

For the Aurora PostgreSQL database, the cost was calculated for a single t3 large instance with 100 GB of storage, 100 reads per second and 100 writes per second:

Calculated February 2021

Managing persistent storage

Developers can manage the persistent storage of PostgreSQL in Amazon EKS through a Kubernetes manifest, and a separate Amazon EBS volume is created to back that storage. There are two approaches to managing the persistent storage: request a small size and resize the storage when needed, or request a large size up front and not worry about reaching capacity. To compare these two approaches, two aspects need to be considered: cost and impact.
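As a rough illustration of such a manifest, the Python sketch below builds a minimal StatefulSet with one volume claim template and prints it as JSON, which kubectl accepts alongside YAML. The resource name "postgres", the claim name "data", the image tag, the mount path and the "gp2" storage class name are assumptions made for this example and are not taken from the tests in this article. Only the 6Gi request mirrors the initial size used here. Credentials, probes and resource limits are left out, so this is not a complete deployment.

```python
import json


def postgres_statefulset(name: str, storage: str, storage_class: str) -> dict:
    """Build a minimal StatefulSet manifest with one volume claim template."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": "postgres:13",  # assumed image tag
                        "volumeMounts": [{
                            "name": "data",
                            # assumed data directory for the container
                            "mountPath": "/var/lib/postgresql/data",
                        }],
                    }],
                },
            },
            # Each replica gets its own PersistentVolumeClaim from this
            # template; on EKS the claim is backed by an EBS volume.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "storageClassName": storage_class,
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


# Hypothetical names; 6Gi is the initial size used in this article.
manifest = postgres_statefulset("postgres", "6Gi", "gp2")
print(json.dumps(manifest, indent=2))
```

The storage request under volumeClaimTemplates is the value that the two approaches differ on: a small number that is raised later, or a large number chosen once.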

To understand the cost difference between the two approaches, a basic use case is needed for comparison. The performance tests in this article provisioned 6 gibibytes (GiB) of persistent storage initially and resized it to 12 GiB. The cost comparison therefore uses 6 GiB as the initial size and assumes that 6 GiB of storage needs to be added every month. The tests used General Purpose SSD (gp2) volumes, which cost $0.10 per GB-month of provisioned storage.

When provisioning a small initial storage and increasing the size when needed, users pay only a few dollars in the first few months, and then pay for additional storage only when they need it. Alternatively, when provisioning a large initial storage, users pay the same monthly fee from the start, even though storage utilization is very low at that point.

Cost calculated February 2021

Based on the diagram, users pay more when they provision a large persistent storage up front: although only a small part of the space is used initially, they pay for the full provisioned size every month. When users initially provision a small persistent storage, they need to take extra steps to increase it when needed, but they pay (approximately) only for the space they use.
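The arithmetic behind this comparison can be sketched in a few lines of Python. The price, the 6 GiB initial size and the 6 GiB monthly growth come from the use case above. The 12-month horizon and the 72 GiB up-front size (enough to cover those 12 months) are assumptions chosen for illustration, and the sketch treats GiB and GB as interchangeable for billing.

```python
PRICE_PER_GB_MONTH = 0.10  # gp2 price quoted in this article


def grow_as_needed_cost(initial_gb: int, growth_gb: int, months: int) -> float:
    """Cumulative cost when the volume grows by growth_gb every month."""
    total = 0.0
    for month in range(months):
        provisioned = initial_gb + growth_gb * month
        total += provisioned * PRICE_PER_GB_MONTH
    return total


def provision_upfront_cost(size_gb: int, months: int) -> float:
    """Cumulative cost when the full size is provisioned from month one."""
    return size_gb * PRICE_PER_GB_MONTH * months


months = 12          # assumed comparison horizon
upfront_size = 72    # assumed: 6 GiB x 12 months, provisioned at once

small_first = grow_as_needed_cost(initial_gb=6, growth_gb=6, months=months)
large_first = provision_upfront_cost(upfront_size, months)

print(f"grow as needed:   ${small_first:.2f}")  # $46.80
print(f"provision up front: ${large_first:.2f}")  # $86.40
```

Under these assumptions both approaches reach the same monthly bill in the final month; the difference comes entirely from the earlier months, when the large volume is mostly empty.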

Impacts

Users can manage a PostgreSQL database with a StatefulSet. Multiple performance tests were executed to evaluate the performance of the PostgreSQL database in Amazon EKS and the availability impact of resizing the persistent storage.

Longevity testing was performed in the following scenarios:

1) When there was enough capacity (40% to 50%)

2) When the persistent storage reached capacity (about 90%)

3) During resizing of persistent storage

4) After resizing

For all of these scenarios, multiple tests were performed, and the performance tests were conducted at different times and dates.

Testing conducted September 2020

According to the results, when a StatefulSet was used to create persistent storage for a PostgreSQL database, the database remained stable while the persistent storage reached capacity and while it was being resized, and no transactions were lost in any of the performance tests. The average latency of two tests was more than twice that of the others, but the network connection was very slow during those two tests, so the degradation was not caused by resizing. Overall, there was no impact on performance or availability either when the persistent storage reached capacity or while it was being resized.

Conclusion

The StatefulSet relies on persistent storage drivers to modify the size of the persistent volume. With the default storage class, the Amazon EBS volume of a PostgreSQL database in Amazon EKS could be resized. The PostgreSQL database was stable during the resizing process, and all transactions in the performance tests completed successfully. The performance test results varied; sometimes performance was better when the storage was reaching capacity or being resized. Because the performance and availability of the PostgreSQL database were not impacted during resizing, users can use most of the provisioned storage space in Amazon EKS and resize the persistent storage when needed.

Overall, users can provision a large persistent storage for the PostgreSQL database at the beginning, but they will need to pay more. Users can also provision a smaller persistent storage for the PostgreSQL database and resize it when needed. However, autoscaling of persistent storage is not supported by Kubernetes and Amazon EKS, so users may need to write a custom script to resize the persistent storage.
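A custom resize script of this kind could be organized along the following lines. This is a minimal Python sketch rather than the script used for the tests: the 90% threshold and the 6 GiB step echo the numbers in this article, while the claim name "data-postgres-0" and the way usage is measured are hypothetical. It only decides on a new size and prints a kubectl patch command; it assumes the storage class permits volume expansion (allowVolumeExpansion: true) and that the claim is only ever grown, since persistent volume claims cannot be shrunk.

```python
import json


def next_size_gib(capacity_gib: int, used_gib: float,
                  threshold: float = 0.9, step_gib: int = 6) -> int:
    """Return the size to request: unchanged below the threshold,
    otherwise the current capacity plus one step."""
    if used_gib / capacity_gib >= threshold:
        return capacity_gib + step_gib
    return capacity_gib


def pvc_resize_patch(size_gib: int) -> str:
    """Build a patch that raises a PersistentVolumeClaim's storage request."""
    return json.dumps(
        {"spec": {"resources": {"requests": {"storage": f"{size_gib}Gi"}}}}
    )


current_gib = 6
# used_gib would come from monitoring in practice; 5.5 is a made-up reading.
target_gib = next_size_gib(current_gib, used_gib=5.5)

if target_gib != current_gib:
    # "data-postgres-0" is a hypothetical claim name.
    patch = pvc_resize_patch(target_gib)
    print(f"kubectl patch pvc data-postgres-0 -p '{patch}'")
```

Running a check like this on a schedule gives an approximation of storage autoscaling while keeping the decision logic separate from the command that applies it.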

Source: CNCF survey report

#fidelityassociate

Jeffrey Wang
CloudX at Fidelity

Principal engineer within Enterprise Cloud Computing at Fidelity Investments. I am passionate about governing the cloud at scale.