RDS Proxy and Connection Pinning

Andre De Camargo
6 min readJun 14, 2022

--

What is RDS Proxy and why use it

RDS Proxy is AWS managed service to efficiently managed a pool of database connections in a scalable way.
In a serverless based architecture independent components may have to interact with a database each having their own pool of connections and quite often battling to acquire connections to the database.
With RDS proxy each serverless component no longer communicates with the DB but instead send the connection requests to the proxy. RDS proxy is able to efficiently reuse connections from a pool and multiplex them between components as required.

How RDS Proxy works

RDS proxy keeps a pool of connections to the database in a “ready to use” state, it is constantly checking if the connection is ready, refreshing or creating new ones as needed.
When there is a request for a transaction the proxy pulls a connection from the pool and use it to run the transaction. By default a connection is reusable after a transaction (this is called multiplexing), this essentially means that the same connection might be used by completely independent components. Being able to reuse connections to run different transactions is what makes the RDS proxy efficient and allows it to handle more connections than what is support by the database. In practice RDS proxy will scale as needed and the bottleneck will be the database limits.

Transaction polling and Connection pinning

RDS proxy executes transaction polling by default, this is the most efficient way to utilize the database connections. A connection is borrowed from the pool, it’s used to run a transaction and released immediately after.
On the other hand when using session polling a connection is assigned to a client and then only release when the client explicitly closes it or it times out after being idle. Attaching a session to a client until it’s finished blocking it from being reused its what is called connection pinning.

The list of conditions that cause pinning is considerable, for example when using PostgreSQL we have this.
Not having connections pinned plays a key role for RDS proxy performance and managing the database connections efficiently.

It’s quite disappointing that AWS doesn’t provide a way to whitelist commands and avoid connection pinning. For example “set standard_conforming_strings” should be a global config and not connection specific, it shouldn’t cause connection pinning.

Additionally RDS proxy is not able to identify “SET” commands that have parameters already set in the database in such cases it pins the connections either way. For example if “standard_conforming_strings” is set to 1 in the database and a connection sends “set standard_conforming_strings” RDS proxy will pin the connection:

RDS Proxy logs
RDS PostgreSQL parameter configuration

How to track down connection pinning

RDS proxy have a “enhanced logging” configuration which allows to get more information over the proxy connections and operations. This option can be enabled under “Advanced configuration”:

On Cloudwatch there will be a log group with the proxy name and connection pinning messages are logged with WARN level will look like this:

2022-05-19T17:20:40.601Z [WARN] [proxyEndpoint=default] [clientConnection=2125203372] The client session was pinned to the database connection [dbConnection=1953961465] for the remainder of the session. The proxy can't reuse this connection until the session ends. Reason: SQL changed session settings that the proxy doesn't track. Consider moving session configuration to the proxy's initialization query. Digest: "set search_path to $1,$2,$3".

Node.js database libraries (Knex/Sequelize) and their impact on connection pinning

Knex is a well know multi-dialect query builder for node.js.
As of now when using knex all the connections to the database will be pinned due to the fact that knex use prepared statements and RDS pins connections that use prepared statements, see this comment from Knex maintainer.
Another reason for connections to be pinned when using knex is the usage of the searchPath parameter.

Sequelize is a ORM library for TS/node.js. Sequelize use by default the “set standard_conforming_strings” (see here) command for all connections forcing them to be pinned by RDS proxy. Worth mentioning that this is actually expected because having “standard_conforming_strings” set to off would allow SQL injections attacks.

How to avoid or mitigate connection pinning

  • For session configuration commands (e.g. SET …)
    As disappointing as it seems AWS doesn’t provide any alternative for PostgreSQL users, there is no way to whitelist SET commands and prevent connection pinning from happening. Therefore the only viable alternative its to remove those from your code which can be hard if you are using libraries that automatically use some SET commands (for example sequelize). Worth mentioning that RDS Proxy UI is confusing to users, it allows to use both the “initialization query” and the “session pinning filters” when using PostgreSQL engine but it doesn’t work for this engine. In their documentation they state that this feature is not available for PostgreSQL database.
    For non PostgreSQL users:
    It can be removed from library configuration and moved to RDS Proxy initialization query.
    Need to keep in mind that the initialization query is set per “target group” which maps 1 to 1 with the proxy instance therefore if we use a command like “SET search_path” the proxy will be tied to specific schemas
    Another option is to use proxy “Session pinning filters” which will force the proxy to not pin connections that use such statement. This can be used if you are sure that SET statements are global in your applications and do not impact specific transactions. A good example is the standard_conforming_stringswhich should be set to on.
  • Prepared statements
    For Knex there is no way to turn prepared statements off. Other libraries also use prepared statements everywhere (e.g. Prisma). Therefore nothing can can be done except avoiding using prepared statements when is not needed.
  • Usage of extended query protocol (PostgreSQL)
    One of the conditions that cause connection pinning in PostgreSQL is the extended query protocol. The main goal is to make the execution of queries more efficient, more details here. This feature is widely use on all main DB clients (for example node-postgres) and normally for obvious reasons there is no way to turn it off hence no workaround is available to avoid connection pinning when extended query protocol is used.
  • Utilize timeouts wisely
    Idle timeout can be set on both DB client side and proxy side. RDS Proxy has a “Idle client connection timeout” which is used to close client connections that are not actively executing transactions for a given period of time. This configuration can optimize the release of connections and ensure pinned connections return to the pool quickly for reuse.
    This is specially important when running the DB client in a lambda function. A Lambda container go to an idle state and the connections might be kept alive until the lambda container is destroyed.

Conclusion

RDS proxy main purpose is to enable database connections to be highly scalable leveraging the database capacity.
However in order to be fully performant RDS proxy requires the ability to use transaction polling and reuse connections among transactions.

RDS proxy has a considerable list of conditions for pinning, some of which include “SET” commands that (in main use cases) are never meant to be connection specific (such as standard_conforming_strings).

After working with RDS proxy for a while the main conclusion is that there is thin line between transaction polling and conditions for pinning, this line can be easily crossed when using well know DB client libraries and in such cases RDS proxy performance is jeopardized and the database connections capacity is under pressure. It is so easy to cause connection pinning that the regular behavior is to have pinned connections.

AWS should review RDS proxy pinning conditions and give more control over it, specifically to PostgreSQL engine users they do not provide workarounds to prevent connection pinning. Ideally they should allow to:

  • Whitelist SET commands that shouldn’t cause pinning
  • Not pin if a connection have a SET command for a parameter that is already set in the database
  • Don’t pin connections (users know what they are doing and want to have full control)

--

--

Andre De Camargo

@Vendr — Software Architect with focus on serverless, microservices and AWS