Boost serverless app performance with Amazon RDS Proxy and Amazon Aurora
When the RDS Proxy service launched, Felipe Mejia and I started testing it to see how it could be used in serverless architectures. We began with several Lambda functions running simultaneous write tasks against an Aurora MySQL database, to see how RDS Proxy would help in that kind of scenario. After several rounds of tests and metrics, I want to share what I learned.
RDS Proxy is a fully managed AWS service that acts as a proxy layer between the application layer and the database layer. It:
- Pools and shares database connections.
- Increases app availability.
- Improves data security.
When we started working with serverless architectures and a relational database (Aurora MySQL), we ran into some interesting challenges:
- DB Performance and Connection Management: With many simultaneous connections, the database has to spend its own compute resources managing them, and it has to absorb the growth in connections when traffic peaks.
- Failover Time: For critical applications on serverless architectures, availability is always a priority, so we need a sensible balance between the cost of the services and the availability they provide.
- Security: How do we handle the connection string, username, and password for the database across multiple Lambda functions?
Typical Serverless Architecture with RDS Proxy
Connection Management
- Multiplexing: The proxy can reuse a connection after each transaction in your session. This transaction-level reuse is called multiplexing.
- Borrowing: RDS Proxy temporarily takes a connection out of the pool to reuse it and returns it to the pool once it is done.
- Pinning: In some cases, RDS Proxy cannot be sure that a connection is safe to reuse outside the current session. In those cases, it keeps the session on the same connection until the session ends.
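To make the three terms concrete, here is a toy model in plain Python. It is only a sketch of the idea, not how RDS Proxy is implemented: the class name `ToyProxyPool`, its methods, and the `changes_session_state` flag are all made up for illustration.

```python
from collections import deque


class ToyProxyPool:
    """Toy model of transaction-level connection reuse (not the RDS Proxy API)."""

    def __init__(self, size):
        # Hypothetical connection names standing in for real database connections.
        self.idle = deque(f"db-conn-{i}" for i in range(size))
        self.pinned = {}  # client session -> connection held until the session ends

    def run_transaction(self, session, changes_session_state=False):
        # A pinned session keeps using the same connection.
        if session in self.pinned:
            return self.pinned[session]
        # Borrowing: take an idle connection out of the pool.
        # (This toy raises IndexError if no idle connection is left.)
        conn = self.idle.popleft()
        if changes_session_state:
            # Pinning: the connection is no longer safe to share.
            self.pinned[session] = conn
        else:
            # Multiplexing: return the connection as soon as the transaction ends.
            self.idle.append(conn)
        return conn

    def close_session(self, session):
        # Ending a pinned session releases its connection back to the pool.
        conn = self.pinned.pop(session, None)
        if conn is not None:
            self.idle.append(conn)


pool = ToyProxyPool(size=2)
# Three client sessions are served by only two database connections.
used = [pool.run_transaction(s) for s in ("a", "b", "c")]
print(len(set(used)))
```

The point of the sketch is the trade-off: multiplexed sessions share a small pool, while every pinned session removes one connection from it until that session closes.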
Failover Time
A failover can happen when the master instance has a problem, during an update, or because of a connectivity problem. During a failover, RDS Proxy keeps accepting connections at the same address and automatically routes them to the new instance that takes over as master.
During these failovers, clients are not exposed to:
- Domain Name System (DNS) propagation delays on failover.
- Local DNS caching.
- Connection timeouts.
- Uncertainty about which DB instance is the current writer.
- Waiting for a query response from a former writer that became unavailable without closing connections.
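Even with the proxy in front, a write can still fail for a brief moment while the new writer takes over, so a small retry on the client side is a reasonable complement. The following is a minimal sketch under stated assumptions: `write_with_retry` is a hypothetical helper, and the built-in `ConnectionError` stands in for whatever exception your database driver actually raises.

```python
import time


def write_with_retry(write, attempts=3, delay_seconds=0.5, sleep=time.sleep):
    """Call write() and retry on ConnectionError, pausing between attempts.

    `write` is any zero-argument callable that performs the database write.
    ConnectionError is an assumption; swap in your driver's exception type.
    """
    for attempt in range(1, attempts + 1):
        try:
            return write()
        except ConnectionError:
            if attempt == attempts:
                # Out of attempts: let the caller see the failure.
                raise
            sleep(delay_seconds)
```

The `sleep` parameter exists only so the pause can be replaced in tests. In a Lambda function, keep the total wait well below the function timeout.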
Security
RDS Proxy supports TLS protocol versions 1.0, 1.1, and 1.2. You can connect to the proxy using a higher version of TLS than you use in the underlying database.
For the Lambda function to connect to the database, the credentials go through the Secrets Manager service: a secret holds them, and that secret is configured on the proxy. On the Lambda side, we point the function at the RDS Proxy endpoint.
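As a rough illustration of pointing the function at the proxy, the sketch below turns a secret's JSON string into connection settings whose host is the proxy endpoint. It assumes the secret uses the usual `username`, `password`, `port`, and `dbname` keys. The helper name `proxy_connection_settings` and the endpoint value are hypothetical.

```python
import json


def proxy_connection_settings(secret_string, proxy_endpoint):
    """Build connection settings that target the proxy, not the DB instance.

    In a Lambda function, secret_string would typically come from
    boto3.client("secretsmanager").get_secret_value(SecretId=...)["SecretString"].
    The key names read below are an assumption about the secret's layout.
    """
    secret = json.loads(secret_string)
    return {
        # Always connect to the proxy endpoint, whatever host the secret stores.
        "host": proxy_endpoint,
        "port": int(secret.get("port", 3306)),
        "user": secret["username"],
        "password": secret["password"],
        "database": secret.get("dbname"),
    }


# Hypothetical values, for illustration only.
example_secret = json.dumps(
    {"username": "app_user", "password": "example", "host": "db.internal",
     "port": 3306, "dbname": "orders"}
)
settings = proxy_connection_settings(example_secret, "my-proxy.example.internal")
print(settings["host"])
```

The resulting dictionary matches the keyword arguments that common MySQL drivers accept, so the only change on the Lambda side is the host it connects to.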
Tests Executed
Infrastructure Deployment: The entire infrastructure was deployed with CDK and TypeScript. Deploying everything that way was an interesting challenge, although the documentation for TypeScript is not as complete as it is for CDK with Python.
Database Configuration
Aurora Master Database Instance
Aurora Standby Database Instance
Lambda Function
The function was developed in Python 3.7. It is really just a script that uses two loops to run repeated write tasks against the database and reports each record as it is written.
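A function of that shape could look roughly like the sketch below. This is not the code used in the tests. It is a minimal illustration that assumes a DB-API style connection (for example one opened with a MySQL driver against the proxy endpoint), and the table `load_test` and its columns are hypothetical.

```python
def write_records(connection, batches=2, rows_per_batch=3):
    """Insert rows using two nested loops and report each write.

    `connection` is assumed to be a DB-API connection whose driver uses
    %s placeholders (as MySQL drivers commonly do).
    """
    written = 0
    cursor = connection.cursor()
    for batch in range(batches):
        for row in range(rows_per_batch):
            cursor.execute(
                # Hypothetical table and columns, for illustration only.
                "INSERT INTO load_test (batch, row_num) VALUES (%s, %s)",
                (batch, row),
            )
            # Commit per record so each write is its own transaction.
            connection.commit()
            written += 1
            print(f"wrote record batch={batch} row={row}")
    return written
```

Committing after every record keeps each transaction short, which is what lets a proxy hand the underlying connection to another client between writes.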
With all the Lambda functions writing to the database, we triggered a failover in two scenarios and ran a total of 15 tests in each:
- Failover without RDS Proxy: In this scenario, the database was unavailable for 10–12 seconds during the failover.
- Failover with RDS Proxy: In this scenario, only 30% of the tests showed any unavailability, of 1 second. The remaining 70% showed none.
After the tests in our scenario, we saw a 90% improvement in failover time.
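One way to read the 90% figure, as a rough sketch: compare the worst case with the proxy (1 second) against the shortest outage without it (10 seconds). Which values to compare is an assumption made here for illustration. Averaging over the 30%/70% split gives an even larger reduction.

```python
def reduction(before_seconds, after_seconds):
    """Fractional reduction in downtime when going from before to after."""
    return 1 - after_seconds / before_seconds


# Worst case with the proxy (1 s) against the shortest outage without it (10 s).
worst_case = reduction(10, 1)

# Mean downtime with the proxy: 30% of runs at 1 s, 70% of runs at 0 s.
mean_with_proxy = 0.3 * 1 + 0.7 * 0
mean_case = reduction(10, mean_with_proxy)

print(f"{worst_case:.0%} {mean_case:.0%}")  # prints: 90% 97%
```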
Lessons Learned
- Whenever AWS releases a new service, I suggest waiting 3–6 months before using it in production environments: the documentation is usually incomplete, AWS support does not yet know the service well, and you can have a hard time trying to do complex things.
- If you are going to implement RDS Proxy, take into account the costs and how they can impact the project.
- For production environments, I definitely recommend using a proxy to improve security, connection management, and failover.
- In our tests, failover time was reduced by 90%, and in several cases there was no unavailability at all.