Key-value storage on top of MySQL

Denys Golotiuk


Many storage tasks don’t require SQL or relational tables at all. A simple way to save something and read it back later is more than enough. Think of user settings or all sorts of flags (e.g. whether a user viewed, liked, or commented on a certain piece of content). In most cases you won’t need to filter, sort, or group by that kind of data.

Long story short: sometimes you work with data you never need to scan tables for. These are the so-called “key-value” cases. So why use tables at all? Well, you could use a NoSQL solution, but are you brave enough to manage multiple databases after that? Nah, that’s not always a good idea. So let’s learn how to build efficient key-value storage on top of our favorite MySQL server.

InnoDB and clustered indexing

As we’ve discussed in tuning MySQL, InnoDB uses clustered indexes: row data is stored together with the primary key index, so a lookup by primary key reaches the row directly, without an extra step.

That’s why queries like:

SELECT * FROM `table` WHERE id = 1;

are highly efficient if id is a primary key. And that means we can build our own efficient key-value storage on top of InnoDB tables.

Key-value table structure

Let’s create a simple structure with 2 columns, key and val:

CREATE TABLE `kv` (
  `key` varchar(32) NOT NULL PRIMARY KEY,
  `val` JSON
) ENGINE=InnoDB;

We will store JSON-encoded values in the val column.

Storing data

We’ll use an INSERT...ON DUPLICATE KEY UPDATE query to either insert or update a value by key:

INSERT INTO kv
SET val = '{"name": "val"}', `key` = 'test'
ON DUPLICATE KEY UPDATE val = '{"name": "val"}';

This will automatically create a new row or update the existing row for a given key.
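MySQL can also change a single field inside a stored JSON document instead of rewriting the whole value. A minimal sketch, assuming the kv table with a JSON val column and MySQL 5.7 or later; the `theme` field is a made-up example:

```sql
-- Set (or add) one field inside the stored JSON document for key 'test'.
-- Other fields of the document are left untouched.
UPDATE kv
SET val = JSON_SET(val, '$.theme', 'dark')
WHERE `key` = 'test';
```

Note that this only touches rows that already exist, so it complements the upsert query rather than replacing it.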

Reading data

To fetch a value by key, we simply select:

SELECT val FROM kv WHERE `key` = 'test' LIMIT 1;

This returns the JSON-encoded value for the specified key.
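If only one field of the stored document is needed, it can be extracted on the server side, and the lookup itself can be checked with EXPLAIN. The queries below are a sketch, assuming the kv table from this article and MySQL 5.7.13 or later for the `->>` operator:

```sql
-- Extract a single field from the JSON value as an unquoted string.
SELECT val->>'$.name' AS name FROM kv WHERE `key` = 'test';

-- Inspect the execution plan of the key lookup.
EXPLAIN SELECT val FROM kv WHERE `key` = 'test';
```

For an equality match on the primary key, the plan should report the PRIMARY index with access type const, meaning at most one row is read.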

Optimizing for performance

First of all, building key-value storage on top of MySQL gives us all the benefits of MySQL: performance tuning, replication for availability, scaling, and so on. Still, there are several things worth improving in our solution.

1. Limit key size

Try to keep keys as small as possible: this decreases index size and fits more data into memory. If you plan to use long keys (more than 40 characters), a good idea is to hash them with sha1(), which turns any key into a 40-character hex string (the key column then needs to be at least varchar(40)):

SELECT sha1('this_is_a_vary_long_key_to_store_in_our_key_value_storage');

This gives the following string:

| sha1('this_is_a_vary_long_key_to_store_in_our_key_value_storage') |
| 8693e6100a85a05a4cc2f7f089ee3858933d61ab |

Then insert and select by the hashed key:

SELECT val FROM kv WHERE `key` = sha1('long_key_name...') LIMIT 1;
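The key can be shrunk further by storing the raw 20 bytes of the hash instead of its 40-character hex form. This is a sketch under assumptions, not part of the original setup: the `kv_hashed` table name is hypothetical, and UNHEX() is used to convert the hex digest to binary:

```sql
-- Hypothetical table: the key holds the raw SHA-1 digest (20 bytes).
CREATE TABLE `kv_hashed` (
  `key` BINARY(20) NOT NULL PRIMARY KEY,
  `val` JSON
) ENGINE=InnoDB;

-- Insert or update by the binary hash of the long key.
INSERT INTO `kv_hashed` (`key`, `val`)
VALUES (UNHEX(SHA1('long_key_name')), '{"name": "val"}')
ON DUPLICATE KEY UPDATE `val` = '{"name": "val"}';

-- Read by the same binary hash.
SELECT `val` FROM `kv_hashed` WHERE `key` = UNHEX(SHA1('long_key_name'));
```

The trade-off is that keys are no longer human-readable in query results; HEX(`key`) can be used when inspecting rows manually.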

2. Use scalar values if possible

If you plan to store simple types only (e.g. boolean flags, integers, or short strings), you can skip JSON encoding and declare val with a native column type:

`key` varchar(40) NOT NULL PRIMARY KEY,
`val` INT UNSIGNED,

This not only fits more rows into memory, but also enables type-specific operations. For example, numeric types allow an atomic increment:

UPDATE kv SET val = val + 1 WHERE `key` = 'test';
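The UPDATE above does nothing when the key does not exist yet. A possible variant, assuming val is a numeric column, combines the increment with the upsert query so that a counter is created on first use:

```sql
-- Create the counter with value 1, or increment it if the key already exists.
-- Inside ON DUPLICATE KEY UPDATE, `val` refers to the value of the existing row.
INSERT INTO kv (`key`, `val`)
VALUES ('test', 1)
ON DUPLICATE KEY UPDATE `val` = `val` + 1;
```

The whole operation is a single statement, so no separate existence check is needed on the application side.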

3. Use multiple tables

To implement namespaces, you can use multiple tables instead of putting all keys into one table:

CREATE TABLE `kv_settings` (`key` varchar(40) NOT NULL PRIMARY KEY, `val` TEXT);
CREATE TABLE `kv_flags` (`key` varchar(16) NOT NULL PRIMARY KEY, `val` INT UNSIGNED);


Key-value storage on top of the MySQL InnoDB engine can handle billions of keys with the efficiency and reliability of MySQL. This approach is used in the Mysqly key-value storage component. The solution can also be easily extended on top of the powerful MySQL infrastructure. For example, adding expiration functionality would turn it into a caching system.
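One way the expiration idea could look is sketched below. The `kv_cache` table, the `expires_at` column, and the one-hour lifetime are all assumptions made for illustration, not something described by the article or the Mysqly component:

```sql
-- Hypothetical cache table: each key carries its own expiration time.
CREATE TABLE `kv_cache` (
  `key` varchar(40) NOT NULL PRIMARY KEY,
  `val` JSON,
  `expires_at` DATETIME NOT NULL,
  KEY `idx_expires_at` (`expires_at`)
) ENGINE=InnoDB;

-- Store a value for one hour, refreshing the lifetime on overwrite.
INSERT INTO `kv_cache` (`key`, `val`, `expires_at`)
VALUES ('test', '{"name": "val"}', NOW() + INTERVAL 1 HOUR)
ON DUPLICATE KEY UPDATE
  `val` = '{"name": "val"}',
  `expires_at` = NOW() + INTERVAL 1 HOUR;

-- Read only values that have not expired yet.
SELECT `val` FROM `kv_cache` WHERE `key` = 'test' AND `expires_at` > NOW();

-- Purge expired rows periodically (e.g. from a scheduled job).
DELETE FROM `kv_cache` WHERE `expires_at` <= NOW();
```

The secondary index on `expires_at` is there only to keep the periodic purge from scanning the whole table; reads still go through the primary key.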


