Memcached key distribution

由於 Memcached 並不支援數據同步(Replication),也沒有像 Redis 有類似 Hash_slot 管理鍵值的做法,所以 Memcached Client 要自行管理及使用適合的 Key distribution algorithms 來決定您的鍵值,要寫到那一個 Memcached 節點。而不同的 Key distribution algorithms 會因為增減節點的過程中,造成巨大的差異,以下會介紹兩個常用的 Key distribution algorithms : Modulo Hashing Consistent Hashing 他們的優缺點。

Jerry’s Notes
What’s next?
2 min readMar 23, 2022

--

Modulo Hashing — 節點取余數為區

根據節點數量的餘數進行分散。先以鍵值整理的哈希值 HashCODE, 再除以節點數量,再根據其餘數來選擇服務器。

優缺點: 簡單,但節點增加的過程中緩存會失敗,所有的緩存都會失效。數據需要重新再次緩存。(n -1)/n 的數據會受到影響。

■ A simple way to load balance a cluster with n nodes is to calculate the hash of the object’s key and mod the result by n — hash(key) mod n. The resulting value (0 through n–1) is the number of the node where you place the object.

CRC16(KEY) mod nodes = ServerID
CRC16(KEY) % 3 ===> Read/Write key/value into
ServerID

Consistent Hashing — 一致性哈希分區

1)首先求出memcached服務器(節點)的哈希值, 並將其配置到0~232的圓(continuum)上。

2)然後用同樣的方法求出存儲數據的鍵的哈希值,並映射到圓上。

3) 然後從數據映射到的位置開始順時針查找,將數據保存到找到的第一個服務器上,如果超過232仍然找不到服務器,就會保存到第一臺memcached服務器上。

4) 果添加了一台MemCached服務器,只在圓上增加服務器的逆時針方向的第一台服務器上的鍵會受到影響。

優點: 在多節點環境下,增減節點時,只會影響到部份節點上的數據。受影的數據就是1/n,當節點數n越多時,影響越小。

■ Consistent hashing uses an algorithm such that whenever a node is added or removed from a cluster, the number of keys that must be moved is roughly 1 / n (where n is the new number of nodes). Scaling from 1 to 2 nodes results in 1/2 (50 percent) of the keys being moved, the worst case. Scaling from 9 to 10 nodes results in 1/10 (10 percent) of the keys being moved.

Nodes:
hash-code = hash(cache-A)
hash-code = hash(cache-B
hash-code = hash(cache-C

Key:
hash-code = hash(key1)
hash-code = hash(key2)

Q: How are the keys distributed and retrieved across different nodes of the same cluster?

Since Memcached doesn’t have any synchronization/replication feature, every node will have a unique set of keys. The following are common sharding algorithm.

Q. Is it possible to use custom hash algorithm?

■ Yes. Customer can use their own algorithm for storing the data set.

For example, customer can store keys in different nodes according to the size of the data.

■ It can reduce reassigning slabs or slave automove for making a room when there’s no page for the keys newly added.

■ The best hash algorithm depends on customer’s own environment.

Q: Key hashing algorithm: 不同的架構在增減節點時會有什麼影響?

Modulo Hashing — Formula: (n -1)/n。增減節點時,(n -1)/n的數據會受到影響。

Consistent Hashing -。Formula: 1 / n。增減節點時,只有 1/n的數據會受到影響。

!!! 綜合以上說明,會建議客戶使用 Consistent Hashing(一致性哈希分區)。

--

--

Jerry’s Notes
What’s next?

An cloud support engineer focus on troubleshooting with customer reported issue ,and cloud solution architecture.