Redis Data Compression — Byte manipulation for Hex strings in Lua

Kartik Mittal
Published in Analytics Vidhya
Apr 4, 2020
Photo from: https://promotion.aliyun.com/ntms/act/redisluaedu.html

There comes a time when you have to find a new way of data compression and make it work with the existing constructs. The task becomes much more challenging when you are dealing with queries at a high scale on live running-counters stored in a Redis store.

What this article offers —

  1. An idea to transform Redis keys from list data type to hash
  2. Efficient string manipulation to represent Integer values in Hexadecimal using Lua.
  3. Simple steps to use Lua debugger with redis-cli.
  4. Using redis-benchmark for performance evaluation.

NOTE: This article assumes readers are familiar with Lua scripts and with running them in Redis through the EVAL command.

Use Cases to Support —

  1. Reduce the key size and the overall key space.
  2. Each key should store a day-wise counter (an integer value).
  3. The supported operations are increasing or decreasing the counter for a specific day (index).

Approach —

  1. Data Type Transformation for compression

To reduce the key space and compress the data in Redis, say we decide to use hashes. Simply concatenating the day-wise integer values into one string would not work, because the boundaries between values would be lost. We considered two ways around this:

  • Use a delimiter (e.g. semicolon, ampersand): string manipulation in Lua is awkward, and the risk of inconsistency makes this a fragile solution.
  • Encode each integer as a fixed-width hexadecimal value, with two bytes represented by 4 characters: instead of storing the counter for each day as a list element, we can store all of them in a single hash field. The challenge is that list operations such as LINDEX are no longer available to jump to a specific day for lookup, so the script has to compute the offset itself.
127.0.0.1:6379> TYPE oldkey
list
127.0.0.1:6379> LRANGE oldkey 0 -1
1) "280"
2) "150"
3) "250"
4) "560"
127.0.0.1:6379> type newkey
hash
127.0.0.1:6379> HGETALL newkey
1) "counts"
2) "0118009600fa0230"

"0118009600fa0230" decodes as follows:

  1. 0x0118 = 280
  2. 0x0096 = 150
  3. 0x00fa = 250
  4. 0x0230 = 560
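
As a rough illustration of this encoding, the sketch below packs a list of counts into a hex string and unpacks it again. The helper names `encode` and `decode` are hypothetical and do not appear in the original scripts; the sketch also assumes every count is an integer between 0 and 65535, the largest value that fits in 4 hex characters.

```lua
-- Hypothetical helpers (not from the original script): pack a table of
-- day-wise counts into one hex string, 4 characters per day, and unpack it.
local function encode(counts)
  local parts = {}
  for i, n in ipairs(counts) do
    n = tonumber(n)
    -- Assumption: 4 hex digits can only hold integers in 0..65535.
    assert(n and n >= 0 and n <= 0xFFFF and n % 1 == 0, "count out of range")
    parts[i] = string.format("%04x", n)
  end
  return table.concat(parts)
end

local function decode(hex)
  local counts = {}
  -- Walk the string in groups of exactly 4 hex characters.
  for chunk in hex:gmatch("%x%x%x%x") do
    counts[#counts + 1] = tonumber(chunk, 16)
  end
  return counts
end

-- Strings are accepted because LRANGE returns list elements as strings.
local packed = encode({ "280", "150", "250", "560" })
print(packed)             --> 0118009600fa0230
print(decode(packed)[2])  --> 150
```

Inside a Redis script, the input table could come from `redis.call('lrange', KEYS[1], 0, -1)`, which is one possible way to migrate an existing list key to the hash layout.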

2. Reading and manipulating counter for a specific index in Lua

-- KEYS[1] = Redis key of the hash
-- ARGV[1] = day index (1-based)
-- ARGV[2] = delta to subtract from that day's counter
redis.replicate_commands()

-- Return the counter stored at the given 1-based day index.
-- Each counter occupies 4 hex characters.
local getData = function(str, index)
  local count = str:sub(index*4 - 3, index*4)
  return tonumber(count, 16)
end

-- Return a copy of str with the counter at the given day index replaced by data.
local setData = function(str, index, data)
  -- str:sub returns "" for an empty range, so the first and last
  -- counters need no special casing.
  local str1 = str:sub(1, index*4 - 4)
  local str2 = str:sub(index*4 + 1)
  -- Lowercase hex keeps the result consistent with the stored value.
  local newcount = string.format('%04x', tonumber(data))
  return str1 .. newcount .. str2
end

local dayindex = ARGV[1]
local delta = ARGV[2]
local strval = redis.call('hget', KEYS[1], "counts")
local count = getData(strval, dayindex)
strval = setData(strval, dayindex, count - delta)
-- HSET needs the field name as well as the new value.
redis.call('hset', KEYS[1], "counts", strval)
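
The slicing arithmetic behind getData and setData can be checked with a plain Lua interpreter, without Redis. The sketch below uses lightly adapted standalone copies of the two helpers (explicit `tonumber` on the index, lowercase `%04x`); the sample string and arguments are the ones used elsewhere in this article.

```lua
-- Standalone copies of the two helpers, adapted so they run outside Redis.
local function getData(str, index)
  local last = tonumber(index) * 4          -- position of the counter's last character
  return tonumber(str:sub(last - 3, last), 16)
end

local function setData(str, index, data)
  local last = tonumber(index) * 4
  -- prefix .. new 4-character counter .. suffix
  return str:sub(1, last - 4) .. string.format("%04x", data) .. str:sub(last + 1)
end

local before = "0118009600fa0230"
local count = getData(before, "2")            -- ARGV values arrive as strings
local after = setData(before, "2", count - 5)
print(count)  --> 150
print(after)  --> 0118009100fa0230
```

A counter that drops below 0 or grows past 65535 no longer fits in 4 hex characters and would shift every later day out of alignment, so a production script would likely want a range check before writing.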

3. Using Lua Debugger —

  • $redis-cli -p 6379 --ldb --eval basichex.lua newkey , 2 5
Lua debugging session started, please use:
quit -- End the session.
restart -- Restart the script in debug mode again.
help -- Show Lua script debugging commands.
* Stopped at 7, stop reason = step over
-> 7 redis.replicate_commands()
lua debugger> b 35 36 38
34
#35 local count = getData(strval, dayindex)
36 strval = setData(strval, dayindex, count - delta)
#35 local count = getData(strval, dayindex)
#36 strval = setData(strval, dayindex, count - delta)
37
37
#38 redis.call('hset', KEYS[1], strval)
lua debugger> c
* Stopped at 35, stop reason = break point
->#35 local count = getData(strval, dayindex)
lua debugger> p
<value> getData = "function@0x7fbd51869350"
<value> setData = "function@0x7fbd51869380"
<value> dayindex = "2"
<value> delta = "5"
<value> strval = "0118009600fa0230"
lua debugger> c
* Stopped at 36, stop reason = break point
->#36 strval = setData(strval, dayindex, count - delta)
lua debugger> p
<value> getData = "function@0x7fbd51869350"
<value> setData = "function@0x7fbd51869380"
<value> dayindex = "2"
<value> delta = "5"
<value> strval = "0118009600fa0230"
<value> count = 150
lua debugger> c
* Stopped at 38, stop reason = break point
->#38 redis.call('hset', KEYS[1], strval)
lua debugger> p
<value> getData = "function@0x7fbd51869350"
<value> setData = "function@0x7fbd51869380"
<value> dayindex = "2"
<value> delta = "5"
<value> strval = "0118009100fa0230"
<value> count = 150
lua debugger>

4. Redis Benchmark —

  • First, load both scripts with SCRIPT LOAD to obtain the SHA1 digests that EVALSHA expects
SHA1=`redis-cli SCRIPT LOAD "$(cat oldscript.lua)"`
SHA2=`redis-cli SCRIPT LOAD "$(cat basichex.lua)"`
  • Old Script —

$redis-benchmark -n 10000 -e EVALSHA $SHA1 1 oldkey 2 5

====== EVALSHA 6ec917c7503059c00a94e6187eef67176f701458 1 oldkey 2 5 ======
10000 requests completed in 7.89 seconds
50 parallel clients
3 bytes payload
keep alive: 1
0.01% <= 36 milliseconds
7.09% <= 37 milliseconds
41.06% <= 38 milliseconds
64.60% <= 39 milliseconds
77.40% <= 40 milliseconds
85.24% <= 41 milliseconds
91.63% <= 42 milliseconds
95.33% <= 43 milliseconds
97.52% <= 44 milliseconds
98.67% <= 45 milliseconds
99.36% <= 46 milliseconds
99.51% <= 47 milliseconds
99.61% <= 49 milliseconds
99.63% <= 51 milliseconds
99.70% <= 52 milliseconds
99.77% <= 53 milliseconds
99.85% <= 54 milliseconds
99.96% <= 278 milliseconds
99.98% <= 279 milliseconds
99.99% <= 280 milliseconds
100.00% <= 280 milliseconds
1267.43 requests per second
  • New Script —

$redis-benchmark -n 10000 -e EVALSHA $SHA2 1 newkey 2 5

====== EVALSHA 1dbad2cd256a4266ab7d680ae8059906ae40e8e5 1 newkey 2 5 ======
10000 requests completed in 4.35 seconds
50 parallel clients
3 bytes payload
keep alive: 1
0.01% <= 18 milliseconds
1.69% <= 19 milliseconds
20.06% <= 20 milliseconds
48.56% <= 21 milliseconds
68.25% <= 22 milliseconds
81.39% <= 23 milliseconds
89.75% <= 24 milliseconds
95.01% <= 25 milliseconds
96.75% <= 26 milliseconds
97.90% <= 27 milliseconds
98.63% <= 28 milliseconds
99.25% <= 29 milliseconds
99.50% <= 30 milliseconds
99.63% <= 31 milliseconds
99.68% <= 32 milliseconds
99.84% <= 33 milliseconds
99.86% <= 34 milliseconds
99.88% <= 35 milliseconds
100.00% <= 35 milliseconds
2297.79 requests per second

Throughput rose from 1267.43 to 2297.79 requests per second, roughly a 1.8x improvement :)

