How does Redis backup (BGSAVE) work?

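Redis executes commands on a single thread, so when the backup command BGSAVE starts, the Redis parent process forks a child process to do the backup. The original Redis process (the parent) keeps serving client requests, while the child writes the dataset to disk as an RDB file. At fork time the parent's page table (the virtual-to-physical memory mapping) is copied for the child. As a rule of thumb, a 10 GB dataset has roughly 20 MB of page-table entries, so this copy briefly blocks the parent. While the snapshot is running, any page the parent modifies is handled by copy-on-write: extra memory is allocated so that the pre-change version of the page remains available to the child.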
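To see where the 20 MB figure comes from, here is a rough estimate. It assumes 4 KiB pages and 8-byte page-table entries (typical for x86-64 Linux, but both values are assumptions of this sketch) and counts only the leaf entries.

```python
GIB = 2**30
MIB = 2**20


def page_table_bytes(mem_bytes: int, page_size: int = 4096, pte_size: int = 8) -> int:
    """Approximate size of the leaf page-table entries needed to map mem_bytes."""
    pages = -(-mem_bytes // page_size)  # ceiling division
    return pages * pte_size


print(page_table_bytes(10 * GIB) / MIB)  # 20.0
```

The copy is proportional to the dataset size, which is why the pause at fork time grows as the instance gets larger.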

Jerry’s Notes · Published in What’s next? · 16 min read · Mar 18, 2022

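!!! In the worst case, if clients modify all of the data in memory while the backup is running, Redis needs up to twice the dataset's memory: one copy for the point-in-time data being backed up and one for the modified data. This can exhaust memory, so avoid running snapshot backups during busy hours.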
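The warning above can be turned into a quick headroom check. This is a simplified upper-bound model, not a measurement: the function names are made up for illustration, and it assumes every modified page costs one extra copy for the duration of the backup.

```python
GIB = 2**30


def peak_memory_during_bgsave(dataset_bytes: int, fraction_modified: float) -> int:
    """Upper-bound estimate: the original pages plus one copy per modified page."""
    if not 0.0 <= fraction_modified <= 1.0:
        raise ValueError("fraction_modified must be between 0 and 1")
    return dataset_bytes + int(dataset_bytes * fraction_modified)


def backup_fits(dataset_bytes: int, fraction_modified: float, available_bytes: int) -> bool:
    """True if the estimated peak stays within the memory available to Redis."""
    return peak_memory_during_bgsave(dataset_bytes, fraction_modified) <= available_bytes


print(peak_memory_during_bgsave(10 * GIB, 1.0) // GIB)  # 20
print(backup_fits(10 * GIB, 0.25, 12 * GIB))            # False
```

With a 10 GB dataset and a quarter of it rewritten during the backup, the estimate is 12.5 GB, which is why a busy period is the worst time to snapshot.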

Redis Persistence - Redis

This page provides a technical description of Redis persistence, it is a suggested read for all Redis users. For a…

redis.io

Redis provides a different range of persistence options:

!!! Redis persistence uses two mechanisms, RDB snapshots (BGSAVE) and AOF. This article focuses on BGSAVE.

RDB (Redis Database)

■ The RDB persistence performs point-in-time snapshots of your dataset at specified intervals, storing the in-memory data on disk as a snapshot.
■ Dump all data in Redis (every key in memory) to a file called the RDB file.
■ ElastiCache sends the file to S3. A Redis cluster or replication group can later be restored from the snapshot.
■ Snapshot does not include expired keys.
■ Snapshot (RDB file) is compressed by default (rdbcompression yes). The compressed binary file is suitable for backups, full replication, and disaster recovery, but RDB files can be incompatible across Redis versions. Different value types are encoded in different ways.
■ A snapshot is taken at the cluster level, not per shard, although a separate backup file is created for each shard.
■ The parent process performs no disk I/O; it fork()s a child process that writes the backup in the background.
■ Advantages: the file is small and well suited as a disaster-recovery backup, although backups are less frequent than with AOF.
■ Disadvantages: if the service stops unexpectedly, data written since the last snapshot can be lost, so use replica nodes to reduce this risk.

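$ cat redis.conf
# RDB persistence policy. Three rules by default: a snapshot is triggered after
# [1 change within 900 seconds], [10 changes within 300 seconds], or
# [10000 changes within 60 seconds]. Edit these values or add rules as needed.
save 900 1
save 300 10
save 60 10000

# RDB file name
dbfilename "dump.rdb"

# Directory where the RDB file is stored
dir "/opt/app/redis6/data"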

### Use AOF and RDB together (hybrid persistence) ###
aof-use-rdb-preamble yes
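The save rules above can be read as "at least N changes, and more than M seconds since the last save". The sketch below models that check in plain Python; `should_snapshot` is an illustrative name, and the comparison is an approximation of what the server does periodically, not its actual code.

```python
# (seconds, changes) pairs taken from the "save" lines in the configuration above.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save, rules=SAVE_RULES):
    """Return True if any save rule is satisfied."""
    return any(
        changes_since_last_save >= min_changes and seconds_since_last_save > window
        for window, min_changes in rules
    )


print(should_snapshot(61, 10000))  # True: 10000 changes and more than 60 seconds
print(should_snapshot(61, 9999))   # False: no rule is satisfied yet
print(should_snapshot(901, 1))     # True: 1 change and more than 900 seconds
```

A quiet instance therefore snapshots rarely, while a write-heavy one can snapshot about once a minute.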

Q: What is the difference between the SAVE and BGSAVE commands?

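SAVE: the main Redis process performs the save itself and blocks all client commands until the backup finishes.
127.0.0.1:6379> SAVE
OK

BGSAVE: forks a child process to perform the backup in the background.
127.0.0.1:6379> BGSAVE
Background saving started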

AOF (Append Only File)

The AOF persistence logs every write operation received by the server, that will be played again at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself, in an append-only fashion. Redis is able to rewrite the log in the background when it gets too big.
■ AOF advantages
You can have different fsync policies: no fsync at all, fsync every second, fsync at every query; AOF contains a log of all the operations one after the other in an easy to understand and parse format. You can even easily export an AOF file.
■ AOF Disadvantages
AOF files are usually bigger than the equivalent RDB files for the same dataset; AOF can be slower than RDB depending on the exact fsync policy.
It may take time to restore the cache from a big AOF file.
■ AOF limitation
1. Append-only files (AOF) aren’t supported for cache.t1.micro and cache.t2.* nodes. For nodes of these types, the appendonly parameter value is ignored.
2. For Multi-AZ replication groups, AOF isn’t enabled.
3. AOF isn’t supported on Redis versions 2.8.22 and later.
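■ Advantages: every write operation is logged, so persistence is more frequent and more complete than with RDB.
■ Disadvantages: the file is large and restores take longer, so it is not well suited for backups. For the same dataset, an AOF file is larger than the RDB file because it is not compressed.
When both persistence options are enabled, Redis loads the AOF file on restart, because every write is saved to the AOF and it is therefore more complete. RDB is used for recovery only when AOF is disabled.
Parameter: appendonly yes (not supported on ElastiCache Redis 2.8.22 and later)
appendfsync: always (largest performance impact) | everysec | no, each with different durability behavior.
■ When a key expires: it is deleted from the database, a DEL command is appended to the AOF file, and the client receives a null reply.
■ Command: BGREWRITEAOF regenerates the AOF so that the file contains only the commands needed to restore the current database state.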
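An AOF rewrite keeps only the commands needed to rebuild the current state. The toy model below illustrates the idea for a log of SET and DEL commands. It is a sketch of the concept only: the tuple format and the `rewrite_aof` name are made up, and real Redis generates the new file from the in-memory dataset rather than by re-reading the old log.

```python
def rewrite_aof(log):
    """Replay a command log and emit the minimal commands for the final state."""
    state = {}
    for command, key, *args in log:
        if command == "SET":
            state[key] = args[0]
        elif command == "DEL":
            state.pop(key, None)
        else:
            raise ValueError(f"unsupported command in this sketch: {command}")
    return [("SET", key, value) for key, value in state.items()]


log = [
    ("SET", "a", "1"),
    ("SET", "a", "2"),
    ("SET", "b", "x"),
    ("DEL", "a"),
]
print(rewrite_aof(log))  # [('SET', 'b', 'x')]
```

Four logged commands collapse into one, which is why a rewritten AOF is smaller and faster to replay.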

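$ cat redis.conf
# Enable AOF persistence
appendonly yes

# AOF file name
appendfilename "appendonly.aof"

# Directory for the AOF file (the same parameter as for RDB)
dir "/opt/app/redis6/data"

# AOF fsync policy: [always: fsync after every write command], [everysec: fsync once per second], [no: leave flushing to the operating system]
appendfsync always
# appendfsync everysec
# appendfsync no

# Trigger a rewrite when the AOF has grown by 100% since the last rewrite (values above 100 are allowed).
# [For example, if the AOF was 10 MB after the last rewrite, the growth condition is met again at 20 MB, and so on.]
auto-aof-rewrite-percentage 100

# Minimum AOF size for an automatic rewrite: no rewrite is triggered below 64 MB, even if the growth percentage is reached
auto-aof-rewrite-min-size 64mb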
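The two auto-rewrite parameters work together: the file must exceed the minimum size and must have grown by the configured percentage. Here is a small model of that decision. `should_rewrite_aof` is an illustrative helper, written as an approximation of the server's check rather than a copy of it.

```python
MB = 1024 * 1024


def should_rewrite_aof(current_size, base_size, percentage=100, min_size=64 * MB):
    """Return True if an automatic AOF rewrite would be triggered.

    base_size is the AOF size right after the previous rewrite.
    """
    if percentage == 0:  # a percentage of 0 disables automatic rewrites
        return False
    if current_size <= min_size:
        return False
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100
    return growth >= percentage


print(should_rewrite_aof(20 * MB, 10 * MB))    # False: doubled, but below 64 MB
print(should_rewrite_aof(200 * MB, 100 * MB))  # True: doubled and above 64 MB
print(should_rewrite_aof(150 * MB, 100 * MB))  # False: only 50% growth
```

The minimum size keeps small files from being rewritten over and over when they double from a few megabytes.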

Possible impact of snapshot backups!!

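ElastiCache Redis backups use Redis's native BGSAVE (forked backup), but when available memory is low they fall back to a forkless backup method developed by Amazon ElastiCache. Either way, the backup consumes extra CPU and memory to serve client requests and keep replicas in sync while it runs. If memory runs short and the node starts using swap, CPU load rises sharply, which can cause the backup to fail or the Redis service on that node to crash. In addition, because Redis is single-threaded, sustained high CPU load can also cause client commands to fail.
The official documentation on how Redis performs backups is listed below for reference.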
[+] Redis Persistence :
https://redis.io/topics/persistence
[+] Backup and restore for ElastiCache for Redis :
https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/backups.html#backups-performance

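Q: When I back up Redis manually, the cluster status changes to "snapshotting". Can data be lost during this process?

In principle, no.
However, if your application is still issuing a large volume of reads and writes while the backup is running, the application may be affected; in the worst case the backup fails or the Redis service crashes.
We therefore recommend taking the backup from a read replica node of the cluster, and scheduling it during relatively off-peak hours, to reduce the risk.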

Q: What’s “COPY on Write” of BGSAVE?

During BGSAVE (after the fork), the parent and the child initially share the same memory pages. Only when a page is added to or updated does the OS allocate extra memory and copy that page, so the child still sees the data exactly as it was at fork time.
■ SAVE: blocks Redis until the backup finishes. The SAVE command performs a synchronous save, writing a snapshot of all data in the current Redis instance to disk as an RDB file.
■ BGREWRITEAOF: asynchronously rewrites the AOF (Append Only File). The rewrite creates a size-optimized version of the current AOF file. Even if BGREWRITEAOF fails, no data is lost, because the old AOF file is not modified until the rewrite succeeds. If a BGSAVE is in progress, BGREWRITEAOF is deferred until the BGSAVE completes.
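The point-in-time behavior of a forked backup can be reproduced in a few lines on a POSIX system. This is only an analogy, assuming a Python dict in place of the Redis dataset and a JSON file in place of the RDB file; `fork_snapshot` and `demo` are made-up names.

```python
import json
import os
import tempfile


def fork_snapshot(data, path):
    """Fork a child that dumps `data` to `path`; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: sees the dataset exactly as it was at fork time.
        try:
            with open(path, "w") as f:
                json.dump(data, f)
        finally:
            os._exit(0)  # exit immediately without running the parent's cleanup
    return pid


def demo():
    data = {"a": 1, "b": 2}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dump.json")
        pid = fork_snapshot(data, path)
        # Parent: keeps accepting "writes" while the child is saving.
        data["a"] = 100
        data["c"] = 3
        os.waitpid(pid, 0)
        with open(path) as f:
            snapshot = json.load(f)
    return snapshot, data


snapshot, live = demo()
print(snapshot)  # {'a': 1, 'b': 2}
print(live)      # {'a': 100, 'b': 2, 'c': 3}
```

The parent's later writes never reach the snapshot, because the operating system gives the parent its own copy of each page it touches after the fork.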

Q: Why doesn’t ElastiCache support AOF from engine version 2.8.22 onward?

AOF is disabled by default. To enable AOF for a cluster running Redis, you must create a parameter group with the appendonly parameter set to yes. You then assign that parameter group to your cluster. You can also modify the appendfsync parameter to control how often Redis writes to the AOF file.
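■ The main reason is that when a node fails, ElastiCache replaces the node outright, so AOF cannot prevent data loss; replicating data to multiple replica nodes is the better approach. In addition, restoring from an AOF takes longer than restoring from an RDB file, and with appendfsync always, slow disk writes can stall the Redis engine.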

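Q: Why does creating a new ElastiCache Redis cluster from a backup fail?

■ A backup taken from a cluster mode enabled cluster cannot be used to create a cluster mode disabled ElastiCache Redis cluster.
■ A backup taken from a cluster mode disabled cluster that uses multiple databases cannot be used to create a cluster mode enabled cluster; it works if only one database is used (native Redis does not support multiple databases in cluster mode).
■ ElastiCache has insufficient capacity for the requested node type in the requested AZ.
■ The customer's custom subnet does not have enough free private IP addresses to create the new network interfaces (ENIs).
■ A backup created from an ElastiCache Redis cluster that uses data tiering cannot be restored onto non-r6gd node types.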
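The configuration-related causes in this list can be checked before attempting a restore. The sketch below encodes them as a pre-flight check. The `Snapshot` and `Target` classes and their fields are hypothetical and do not correspond to any AWS API; capacity and subnet IP shortages cannot be detected offline, so they are left out.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    cluster_mode_enabled: bool
    databases_in_use: int = 1
    data_tiering: bool = False


@dataclass
class Target:
    cluster_mode_enabled: bool
    node_type: str


def restore_blockers(snapshot, target):
    """Return the reasons a restore is expected to fail (an empty list means none found)."""
    reasons = []
    if snapshot.cluster_mode_enabled and not target.cluster_mode_enabled:
        reasons.append("cluster mode enabled backup cannot create a cluster mode disabled cluster")
    if (not snapshot.cluster_mode_enabled and target.cluster_mode_enabled
            and snapshot.databases_in_use > 1):
        reasons.append("cluster mode supports only one database")
    if snapshot.data_tiering and not target.node_type.startswith("cache.r6gd."):
        reasons.append("data tiering backup requires an r6gd node type")
    return reasons


print(restore_blockers(Snapshot(False, databases_in_use=3), Target(True, "cache.r6g.large")))
# ['cluster mode supports only one database']
```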

Q: How do I convert Redis (cluster mode disabled) to Redis (cluster mode enabled)?

Normally we ask customers to do this with backup/restore. However, if the customer's Redis contains multiple databases, the restore fails, because cluster mode supports only one database.
!!! Native Redis does not support multiple databases in cluster mode.
Solutions:
1. Edit the RDB file to merge the multiple databases into one.
2. Use the Redis [move](https://redis.io/commands/move) command to move keys between databases. (Limitation: when the key already exists in the destination database, or does not exist in the source database, it does nothing.)
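The MOVE limitation matters when databases are merged: a key that already exists in the destination is skipped without any error. The in-memory model below mirrors that behavior with plain dicts standing in for two Redis databases; it is not Redis client code, and the function names are illustrative.

```python
def move_key(source, destination, key):
    """Mimic MOVE: return 1 if the key was moved, 0 if nothing happened."""
    if key not in source or key in destination:
        return 0
    destination[key] = source.pop(key)
    return 1


def merge_database(source, destination):
    """Try to move every key; report which keys moved and which conflicted."""
    moved, conflicts = [], []
    for key in list(source):  # copy the key list because source is modified
        if move_key(source, destination, key):
            moved.append(key)
        else:
            conflicts.append(key)
    return moved, conflicts


db1 = {"x": 1, "y": 2}
db0 = {"y": 9}
print(merge_database(db1, db0))  # (['x'], ['y'])
print(db0)                       # {'y': 9, 'x': 1}
print(db1)                       # {'y': 2}
```

Conflicting keys stay behind in the source database, so they have to be renamed or resolved by hand before the backup is taken.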

The example below uses a Lua script to automate moving keys: it scans the source database (sdb) with SCAN and moves each key to the destination database (ddb) with MOVE.

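# cat tt_sdb_ddb.lua
-- Move every key from the source database (sdb) to the destination database (ddb).
local cursor = '0'
local conflict = {}  -- keys MOVE skipped because they already exist in ddb
local moved = {}     -- keys moved successfully
local sdb = '1'
local ddb = '0'

-- SCAN is non-deterministic, so switch to effects replication before writing.
redis.replicate_commands()
redis.call('SELECT', sdb)
repeat
  local result = redis.call('SCAN', cursor)
  cursor = result[1]
  for _, key in ipairs(result[2]) do
    if redis.call('MOVE', key, ddb) == 1 then
      moved[#moved + 1] = key
    else
      conflict[#conflict + 1] = key
    end
  end
until cursor == '0'
-- Reply with the number of moved keys and the list of conflicting keys.
return {#moved, conflict}

Run the Lua script with --eval:
# redis-cli -c -h xxx.apne1.cache.amazonaws.com --eval tt_sdb_ddb.lua 0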

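Q: Why does ElastiCache scaling fail?

■ The target node type does not have enough memory to hold all the data.
■ ElastiCache has insufficient capacity for the requested node type in the requested AZ.
■ The customer's custom subnet does not have enough free private IP addresses to create the new network interfaces (ENIs).
■ Scale-in failure: in a cluster mode enabled Redis cluster, ElastiCache does not migrate keys larger than 250 MB (big keys) while rebalancing slots. The slot holding such a key stays in its shard, so that shard cannot be removed.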
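The two data-related causes, not enough memory on the target and keys too big to migrate, can be screened for ahead of time. This sketch takes the 250 MB limit from the list above as a parameter and assumes the key sizes have already been collected, for example with MEMORY USAGE. The helper names are made up.

```python
MB = 1024 * 1024


def scale_in_blockers(key_sizes, big_key_limit=250 * MB):
    """Return the keys too large to migrate, largest first."""
    big = [(size, key) for key, size in key_sizes.items() if size > big_key_limit]
    return [key for size, key in sorted(big, reverse=True)]


def fits_on_target(used_bytes, target_max_bytes):
    """True if the dataset fits in the target node's memory."""
    return used_bytes <= target_max_bytes


sizes = {"user:1": 10 * MB, "feed:all": 300 * MB, "cache:blob": 251 * MB}
print(scale_in_blockers(sizes))             # ['feed:all', 'cache:blob']
print(fits_on_target(561 * MB, 512 * MB))   # False
```

Any key returned by the first function has to be split or deleted before its shard can be removed.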

Q: Why is the RDB file smaller than my Redis cluster's memory usage?

The RDB file does not include expired keys, and it is stored compressed by default.
https://redis.io/topics/faq
■ The RDB file will not include keys already expired in the master, but that are still in memory.
■ However these keys are still in the memory of the Redis master, even if logically expired. They’ll not be considered as existing, but the memory will be reclaimed later, both incrementally and explicitly on access. However while these keys are not logical part of the dataset, they are advertised in INFO output and by the DBSIZE command.
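The gap between what DBSIZE reports and what ends up in the RDB file can be illustrated with a toy store that keeps an optional expiry time per key. This is a simplified model, assuming no expired key has been reclaimed yet; `live_keys` and the store layout are illustrative only.

```python
def live_keys(store, now):
    """Keys a snapshot would contain: those without an expiry or not yet expired."""
    return {
        key: value
        for key, (value, expire_at) in store.items()
        if expire_at is None or now < expire_at
    }


# key -> (value, expiry timestamp or None)
store = {
    "session:1": ("alice", 100.0),  # logically expired at now=150, still in memory
    "session:2": ("bob", 200.0),
    "config": ("v1", None),
}
now = 150.0

print(len(store))                     # 3, what DBSIZE would count in this model
print(sorted(live_keys(store, now)))  # ['config', 'session:2']
```

The expired session still occupies memory and is still counted, but it never reaches the snapshot, which is one reason the file is smaller than the memory footprint.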

Further reading (References)