ElastiCache client connection handling

Client connection handling recommendations: connection pooling, pipelining, timeout settings, and backoff retries. This post covers recommendations for how front-end applications should handle connections, along with common customer questions.

Jerry’s Notes
What’s next?
7 min read · Mar 20, 2022

--

Connection pooling

Redis Clients Handling: https://redis.io/topics/clients

Connection pooling means that connections are reused rather than created each time a connection is requested. To facilitate connection reuse, a memory cache of server connections, called a connection pool, is maintained by a connection pooling module in the supported client application. A connection pool holds a set of connections that you can use as needed; when you are done with one, it is returned to the pool for further reuse.

Connection pooling works by creating a number of connections in advance. When a Redis operation is performed, an already-established connection is taken directly from the pool, and when the operation completes the connection is not released but kept for subsequent Redis operations. This avoids the overhead of repeatedly establishing and tearing down Redis connections, improving performance.
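The mechanism can be sketched in plain Python. This is an illustration only: `FakeConnection` is a stand-in for a real Redis connection, and real clients such as redis-py provide an equivalent `ConnectionPool` for you.

```python
import queue

class FakeConnection:
    """Stand-in for a real Redis connection (illustration only)."""
    def __init__(self, conn_id):
        self.conn_id = conn_id

class SimplePool:
    """Minimal connection pool: connections are created once up front,
    then borrowed and returned instead of being opened per request."""
    def __init__(self, size):
        # LIFO so the most recently returned (warm) connection is reused first
        self._idle = queue.LifoQueue()
        for i in range(size):
            self._idle.put(FakeConnection(i))

    def get_connection(self):
        return self._idle.get()   # blocks if the pool is exhausted

    def release(self, conn):
        self._idle.put(conn)      # returned for reuse, not closed

pool = SimplePool(size=2)
c1 = pool.get_connection()
pool.release(c1)                  # handed back, connection stays open
c2 = pool.get_connection()        # the same connection object comes back
```

Because releasing does not close the connection, the second request reuses the first request's connection instead of paying the TCP (and TLS/AUTH) setup cost again.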

Pipelining

Using pipelining to speed up Redis queries: https://redis.io/topics/pipelining

Pipelining is a technique that makes it possible for a client application to send multiple commands to the server without waiting for the replies at all, and then read all the replies in a single step. It reduces the latency cost of round-trip time and greatly improves the total number of operations per second you can perform on a given Redis server. When pipelining is used, the server usually reads many commands with a single read() system call and delivers multiple replies with a single write() system call. Because of this, the total number of queries performed per second can increase by a factor of nearly ten compared with not using pipelining.

By default, each Redis request involves its own round trip (and, without pooling, its own connection setup and teardown), so executing many commands wastes a great deal of time on this overhead. Redis pipelines let us send multiple commands at once and receive multiple results back, saving the time spent sending commands and creating connections, and improving efficiency.

With Redis pipelining, the client can keep sending requests while the server has not yet responded, and finally read all of the server's responses in one pass. Pipelining is very useful in certain scenarios, for example when multiple commands need to be submitted promptly, have no dependencies on each other's results, and the results are not needed immediately; the pipeline then acts as a batching tool. To a certain extent it can also improve performance considerably, mainly because it reduces the number of round trips on the TCP connection.

Key points for using pipelining:

1) Before pipelining commands, confirm whether any command depends on the result of an earlier one; if such dependencies exist, pipelining is not recommended.

2) The main benefit is reducing TCP round-trip time, so it can improve cross-AZ latency, or mitigate brief latency spikes between AZs.

3) Heavy use of pipelining does not yield proportional performance gains. Remember that Redis is a single-threaded service and still executes commands one at a time, so sending large batches through a pipeline of 100 commands is not necessarily faster than 50, nor necessarily faster than 10.
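A rough latency model (the numbers are illustrative, not measurements) shows both the benefit and the diminishing returns: the number of network round trips shrinks with batch size, but the server-side execution time does not, because the single-threaded server still executes commands one at a time.

```python
import math

def total_time_ms(n_cmds, rtt_ms, svc_ms, batch):
    """Rough model: each network round trip costs rtt_ms, and the
    single-threaded server still spends svc_ms per command."""
    trips = math.ceil(n_cmds / batch)
    return trips * rtt_ms + n_cmds * svc_ms

# 1000 commands, 1 ms round trip, 0.02 ms per-command execution
no_pipe = total_time_ms(1000, 1.0, 0.02, batch=1)    # 1000 round trips
pipe16  = total_time_ms(1000, 1.0, 0.02, batch=16)   # 63 round trips
pipe256 = total_time_ms(1000, 1.0, 0.02, batch=256)  # 4 round trips
```

The `n_cmds * svc_ms` term puts a floor on total time no matter how large the batch, which is why very large pipelines stop improving throughput while their latency per batch keeps growing.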

Q: Why should a customer use pipelining? What are the advantages?

A: The server is able to process new requests even if the client has not yet read the old responses. This makes it possible to send multiple commands to the server without waiting for the replies at all, and then read all the replies in a single step, saving round trips.

Timeout setting

A timeout below 2 seconds is not recommended, because the instance and the network can experience jitter or packet loss, so an overly low timeout setting tends to cause problems of its own. A reasonable timeout value, combined with a sensible retry setting, is the recommended approach.

Backoff Re-try setting

Error retries and exponential backoff in AWS — https://docs.aws.amazon.com/general/latest/gr/api-retries.html

The backoff algorithm: the idea behind exponential backoff is to use progressively longer waits between retries for consecutive error responses. You should implement a maximum delay interval as well as a maximum number of retries. The maximum delay interval and maximum number of retries are not necessarily fixed values, and should be set based on the operation being performed, as well as other local factors such as network latency.

With exponential backoff, the wait grows with each attempt: for example, 1 second before the first retry, 2 seconds before the second, and 4 seconds before the third. Spreading retries out this way prevents all of the front ends from issuing connection requests to the backend at the same moment once the network recovers, which would otherwise cause a connection storm on the Redis server.
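A minimal sketch of the algorithm (the base and cap values are illustrative). The jittered variant randomizes the wait, which AWS recommends so that clients that failed together do not all retry together:

```python
import random

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def backoff_delay_jittered(attempt, base=1.0, cap=30.0):
    """'Full jitter' variant: pick a random wait up to the backoff value,
    so recovered clients do not reconnect in lockstep (connection storm)."""
    return random.uniform(0, backoff_delay(attempt, base, cap))
```

In a retry loop you would sleep for `backoff_delay_jittered(attempt)` after each failed attempt, and give up after a maximum number of retries.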

Hands-on test — Connection pooling

Using connection pooling can increase requests per second and also reduce latency.
With connection pooling (-k 1) you can confirm that more commands are handled per second while latency actually drops.

Target ElastiCache Redis: cache.r6g.large
Testing EC2 Client: r6g.large

Test records

Test1: No connection pooling | No pipelining

Test2: With connection pooling | No pipelining

Hands-on test — Pipelining

■ Using pipelining together with connection pooling increases requests per second, but latency also increases. With pipelining (-P 16) you can clearly see that more commands are handled per second, but latency is higher. (Connection pooling -k 1 was already in use.)
■ Using pipelining with too many requests at once (-P 256 or -P 4096) does not increase total requests per second, and latency is higher than with -P 16.

Test records

Target ElastiCache Redis: cache.r6g.large
Testing EC2 Client: r6g.large

Test3: With connection pooling | Pipelining=16

Using pipelining together with connection pooling increases requests per second, but latency also increases.
With -P 16 you can clearly see that more commands are handled per second, but latency is higher. (Connection pooling -k 1 was already in use.)

Test4: With connection pooling | Pipelining=256

Using pipelining with too many requests at once does not increase total requests per second.
With -P 256 the commands handled per second are close to -P 16, but latency is even higher. (Connection pooling -k 1 was already in use.)

Test5: With connection pooling | Pipelining=4096

Again, pipelining with too many requests at once does not increase total requests per second.
With -P 4096 the commands handled per second are still close to -P 16, but latency remains higher than with -P 16. (Connection pooling -k 1 was already in use.)

Hands-on test — Connection pooling (Python code)

Common questions

Q: Why do our applications in different AZs see different latencies when accessing the same ElastiCache Redis in one AZ?

A: Because Availability Zones (AZs) are physically separate locations, there are latency differences between them. The following are reference values.

Client (AZ-A) -> Redis (AZ-A) Latency : 1ms

Client (AZ-B) -> Redis (AZ-A) Latency : 3ms+

Q: Our application clients are seeing Redis connection latency. What do you recommend?

A: We recommend 1) connection pooling, 2) pipelining, and 3) placing the client in the same AZ as Redis.

Q: If the number of new connections (NewConnections) is very high, how can we reduce new connections?

A: We recommend using connection pooling.

Q: Is there an upper limit on a node's maximum number of connections (maxclients)?

A: Yes. ElastiCache nodes have a maximum-connections (maxclients) limit of 65,000, and excessive connections can easily cause service latency, so we recommend using connection pooling and avoiding too many connections.

[+] Redis-specific parameters — Redis 2.6.13 parameters — https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/ParameterGroups.Redis.html#ParameterGroups.Redis.2-6-13

maxclients — This value applies to all instance types except those explicitly specified.
Default: 65000

Q: Will the server proactively close connections (tcp-keepalive)?

A: No. By default, ElastiCache does not proactively close client connections.

[+] Redis-specific parameters — Redis 2.6.13 parameters — https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/ParameterGroups.Redis.html#ParameterGroups.Redis.2-6-13

tcp-keepalive
Default: 0

If this is set to a nonzero value (N), node clients are polled every N seconds to ensure that they are still connected. With the default setting of 0, no such polling occurs.

Q: How can we kill idle connections to Redis?
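The notes leave this question unanswered. One common approach (an addition, not from the original) is Redis's CLIENT LIST and CLIENT KILL commands; redis-py exposes them as `client_list()` and `client_kill(addr)`. The selection logic can be sketched with plain data shaped like redis-py's `client_list()` output:

```python
def idle_client_addrs(clients, max_idle_sec):
    """Given CLIENT LIST entries (dicts of strings, as redis-py returns),
    pick the addresses of clients idle for longer than max_idle_sec."""
    return [c["addr"] for c in clients if int(c["idle"]) > max_idle_sec]

# Sample entries; in practice you would use: clients = r.client_list()
sample = [
    {"addr": "10.0.0.5:52100", "idle": "3600"},  # idle one hour
    {"addr": "10.0.0.6:52101", "idle": "2"},     # active
]
to_kill = idle_client_addrs(sample, max_idle_sec=300)
# For each addr you would then call: r.client_kill(addr)
```

Alternatively, the Redis `timeout` parameter (default 0, meaning never) makes the server close idle client connections automatically after N seconds; it can be set in the ElastiCache parameter group.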

Q: How can we connect across VPCs?

Prerequisite:
The two VPCs must be in the same Region.

Option 1:
Use either VPC Peering or Transit Gateway to connect the two VPCs, so that your Redis connection packets can travel from VPC-A, where the instance resides, to VPC-B [+].

[+] Access patterns for accessing an ElastiCache cluster in an Amazon VPC — accessing an ElastiCache cluster when it and the Amazon EC2 instance are in the same Amazon VPC:
https://docs.aws.amazon.com/zh_tw/AmazonElastiCache/latest/red-ug/elasticache-vpc-accessing.html#elasticache-vpc-accessing-same-vpc

Option 2:
You can access the cluster through AWS PrivateLink combined with an NLB [+], but this approach requires you to additionally configure the NLB and a Lambda function so that when the backend service (the referenced blog uses RDS as its example) scales in, scales out, or replaces a node, the corresponding node IPs are updated automatically.

[+] AWS Blog — Access Amazon RDS across VPCs using AWS PrivateLink and Network Load Balancer:
https://aws.amazon.com/tw/blogs/database/access-amazon-rds-across-vpcs-using-aws-privatelink-and-network-load-balancer/

!!! Note !!!
The second option only works with a single-node architecture, uses a VPC endpoint, and must be combined with AWS PrivateLink, so it is currently not recommended.



A cloud support engineer focused on troubleshooting customer-reported issues and on cloud solution architecture.