What Should be the Hash Structure of Redis for Millions of Keys?

Erman Terciyanlı
inavitas
Published in
2 min readJan 16, 2022

A simple hash structure or table is capable of storing key-value pairs and return them accordingly. However, when storing millions of values, optimizing the structure of that table becomes a challenge.

Using Redis to accomplish this will bring forth several advantages. First, Redis provides powerful aggregate types including sorted sets and lists. Second, Redis has a configurable persistence model which can save values in the background at a specified interval.

Here is how to create a hash structure that can store millions of keys.

Redis Hash Table Requirements

There are 4 rules that the hash structure needs to follow.

1. It should look up keys and return the corresponding values quickly.

2. It should fit the data in the existing memory; ideally in one of the EC2 high-memory types.

3. It should fit into the existing infrastructure without the need for upgrades.

4. It should be persistent so that no re-population is required.

A simple two-pronged approach would be sufficient to insert and return key-value pairs. The key could be called, ‘ID’, and the value could be simply called ‘value’.

Performance monitoring for a 500 MB database

How to Construct a Redis Hash Structure

Redis requires about 70MB to store 1 million keys. If hundreds of millions of key-value pairs need to be stored, gigabytes worth of storage would be required, at least.

Pieter Noordhuis, one of Redis’ core developers, says that Redis hashes are one of the best ways to store key-value pairs. There’s a specific setting, ‘hash-zipmap-max-entries’ which can configure a maximum number of entries whilst encoding them efficiently.

To take advantage of this data type, all the IDs should be bucketed into buckets of 1000. Hence, the ID needs to be divided by 1000 and the remainder should be discarded (ID%1000).

The code to return a value would look something like this:

HSET “value:1155” “1155315” “939”
HGET “value:1155” “1155315”
> “939”

Using a hash structure to store 1 million keys reduces the storage needed to 16 MB from 70 MB.

Alternative Method

Using a similar approach, one can also use buckets of 10000 and rely on Redis Bitmap. Here, we can manipulate bits at a given position. By doing this, one can simply check for the presence of a value instead of returning that value instead. This is perfect for metadata repositories.

--

--