Understanding The memcached Source Code — Slab I

The slab allocator (this article) is the core module of the cache system; it largely determines how efficiently the bottleneck resource, memory, is used. The other three parts, namely

the LRU algorithm (not complete) for entry expiration,

an event-driven model (not complete) based on libevent, and

consistent hashing (not complete) for data distribution,

are built around it.

Variants of the slab allocator are implemented in other systems, such as nginx and the Linux kernel, to fight a common problem called memory fragmentation. This article will, of course, focus on memcached’s implementation of the algorithm.

memcached version: 1.4.28

Firstly, let’s answer some questions.

Introduction

What is a slab

Slabs are pre-allocated memory chunks (1 MB each by default) that are subdivided to hold many objects. They are grouped into slab classes, each of which serves allocation requests of a particular size range.

What is memory fragmentation, how it occurs

In particular, the slab allocator curbs internal memory fragmentation. This kind of fragmentation exists within an allocated memory chunk. In the context of an OS kernel, for instance, the fundamental unit allocated by the memory management subsystem is called a page.

External memory fragmentation, on the other hand, exists across chunks, and its solution (keyword: buddy) is another story.

The most common situation in which internal fragmentation causes problems is as follows:

1) malloc is called many times for small objects; and, in the meantime,

2) free is called many times on those objects.

This process generates (a lot of) nominally “free” memory that cannot be used: the discrete holes of various sizes, or fragments, cannot be reused by a subsequent malloc for any object larger than the hole.
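The effect can be reproduced with a toy first-fit allocator over a tiny fixed arena. This is only a sketch for illustration: the `toy_*` names and the 128-byte arena are made up, and a real malloc is far more sophisticated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ARENA_SIZE 128          /* toy arena size in bytes (hypothetical) */

static bool used[ARENA_SIZE];   /* used[i] is true when byte i is allocated */

/* First fit: return the offset of the first free run of `len` bytes, or -1. */
static int toy_alloc(size_t len) {
    assert(len > 0 && len <= ARENA_SIZE);
    size_t run = 0;             /* length of the current run of free bytes */
    for (size_t i = 0; i < ARENA_SIZE; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == len) {
            size_t start = i + 1 - len;
            for (size_t j = start; j <= i; j++) used[j] = true;
            return (int)start;
        }
    }
    return -1;
}

/* Mark `len` bytes starting at `offset` as free again. */
static void toy_free(int offset, size_t len) {
    for (size_t j = 0; j < len; j++) used[(size_t)offset + j] = false;
}

/* Total number of free bytes, contiguous or not. */
static size_t toy_free_bytes(void) {
    size_t n = 0;
    for (size_t i = 0; i < ARENA_SIZE; i++) if (!used[i]) n++;
    return n;
}

/* Fill the arena with 16-byte objects, then free every other one. */
static void toy_fragment(void) {
    int offs[ARENA_SIZE / 16];
    for (int k = 0; k < ARENA_SIZE / 16; k++) offs[k] = toy_alloc(16);
    for (int k = 0; k < ARENA_SIZE / 16; k += 2) toy_free(offs[k], 16);
}
```

After `toy_fragment()`, half of the arena (64 bytes) is free, yet `toy_alloc(32)` fails because no hole is larger than 16 bytes.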

Why memory fragmentation is bad

The impact of memory fragmentation is similar to that of a memory leak: once the fragments accumulate to a certain level, a periodic restart becomes inevitable, which increases the complexity of system operation or, even worse, leads to a bad user experience.

How the problem is fixed

The slab allocator does not eliminate internal fragmentation. Instead, it gathers the fragments and confines them to fixed memory locations. This is done by 1) grouping objects of similar sizes into classes; and 2) allocating objects of the same class only on the same group of “slabs”, or, a slab class.

The devil is in the details, so let’s start reading the code.
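As a rough sketch of the categorizing step, an object can be assigned to the smallest class whose item size fits it. The four class sizes and the names `pick_class` and `slot_waste` below are hypothetical and chosen only for illustration; they are not memcached's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item sizes for four slab classes, in ascending order. */
static const size_t class_sizes[] = { 96, 120, 152, 192 };
#define NUM_CLASSES (sizeof(class_sizes) / sizeof(class_sizes[0]))

/* Return the index of the smallest class whose item size fits `size`,
 * or -1 when the object is larger than the largest class. */
static int pick_class(size_t size) {
    assert(size > 0);
    for (size_t i = 0; i < NUM_CLASSES; i++) {
        if (size <= class_sizes[i]) return (int)i;
    }
    return -1;
}

/* Bytes left unused inside the slot chosen for an object of `size` bytes. */
static size_t slot_waste(size_t size) {
    int c = pick_class(size);
    assert(c >= 0);
    return class_sizes[c] - size;
}
```

A 100-byte object lands in the 120-byte class and wastes 20 bytes. The waste is still there, but it sits in a fixed-size slot that the next object of that class can reuse as a whole.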

reminder: memcached version is 1.4.28

The core data structure in use

Module initialization

In this section we examine slabs_init, which initializes the slabclass[MAX_NUMBER_OF_SLAB_CLASSES] array. In particular, it sets the values of two fields: slabclass_t.size, the item (object) size of each slab class, and slabclass_t.perslab, the number of items one slab holds. This function is called during startup, as one of the init steps before the logic enters the main event loop.
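The relationship between the two fields can be sketched as below. `toy_slabclass_t` is a simplified stand-in that keeps only the two fields discussed here, and `toy_class_init` is a hypothetical helper, not a memcached function.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for slabclass_t: only the two fields discussed here. */
typedef struct {
    unsigned int size;     /* item (object) size of this class, in bytes */
    unsigned int perslab;  /* how many items fit in one slab */
} toy_slabclass_t;

/* Fill in one class for a slab of `slab_bytes` bytes. */
static toy_slabclass_t toy_class_init(unsigned int item_size,
                                      unsigned int slab_bytes) {
    assert(item_size > 0 && item_size <= slab_bytes);
    toy_slabclass_t c;
    c.size = item_size;
    c.perslab = slab_bytes / item_size;  /* integer division drops the remainder */
    return c;
}
```

With a 1 MB slab and 96-byte items, perslab comes out as 10922, and the 64 bytes left at the tail of the slab are simply not used.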

In this step, slab_sizes and settings.factor jointly determine which of two routes is used to decide the size of each slab class:

a) if slab_sizes is not NULL, the values in the array are used directly; and

b) otherwise, the sizes grow geometrically: each class's size is the previous class's size multiplied by settings.factor, so the n-th class after the first is roughly base size × settings.factor^n (before alignment).

Besides the default values, the two arguments can be set at run time as well.

The other two arguments of this method, settings.maxbytes and preallocate, will be discussed soon. For now we assume preallocate is false and ignore the relevant logic flow.

Next we look at slabs_init itself.

Route a

1) use the values in slab_sizes;

2) align the size to CHUNK_ALIGN_BYTES, and assign the result to slabclass[i].size;

3) calculate the slabclass[i].perslab;

5) use the settings.item_size_max to initialize the last slab class.
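The steps of route a can be sketched as follows. This is a simplified illustration rather than the verbatim memcached source: the `toy_*` names are hypothetical, the alignment of 8 bytes and the table capacity of 64 are assumed values, and the size list is assumed to be zero-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ALIGN 8          /* assumed value of CHUNK_ALIGN_BYTES */
#define TOY_MAX_CLASSES 64   /* assumed capacity of the class table */

typedef struct {
    unsigned int size;       /* item size of this class */
    unsigned int perslab;    /* items per slab */
} toy_class_t;

static toy_class_t toy_classes[TOY_MAX_CLASSES];

/* Round `size` up to the next multiple of TOY_ALIGN. */
static unsigned int toy_align(unsigned int size) {
    unsigned int rem = size % TOY_ALIGN;
    return rem ? size + (TOY_ALIGN - rem) : size;
}

/* Route a: take item sizes from a zero-terminated list.
 * Classes are stored from index 1; returns the index of the last class. */
static int toy_init_from_list(const uint32_t *sizes,
                              unsigned int item_size_max) {
    assert(sizes != NULL && item_size_max > 0);
    int i = 1;
    for (; i < TOY_MAX_CLASSES - 1 && sizes[i - 1] != 0; i++) {
        toy_classes[i].size = toy_align(sizes[i - 1]);          /* steps 1, 2 */
        toy_classes[i].perslab = item_size_max / toy_classes[i].size; /* step 3 */
    }
    toy_classes[i].size = item_size_max;                        /* step 5 */
    toy_classes[i].perslab = 1;
    return i;
}
```

For the list {100, 200, 0} and a 1 MB slab, this yields classes of 104 and 200 bytes, followed by a final class whose single item fills the whole slab.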

Note that settings.item_size_max is the size of each slab; hence it is also the maximum size of an item that can be allocated on a slab. Likewise, the value of settings.item_size_max can be set at run time.

Route b

1) calculate the base size as settings.chunk_size plus the extra bytes for metadata (item will be discussed in later articles);

2) align the size to CHUNK_ALIGN_BYTES, and assign the result to slabclass[i].size; (same as route a)

3) calculate the slabclass[i].perslab; (same as route a)

4) calculate the size for the next slab class using factor (settings.factor);

5) use the settings.item_size_max to initialize the last slab class. (same as route a)
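Route b can be sketched in the same spirit. Again this is an illustration under stated assumptions, not the verbatim source: the `toy_*` names are hypothetical, the alignment of 8 and the table capacity of 64 are assumed, and the loop is assumed to stop once the next size would reach item_size_max / factor.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ALIGN 8          /* assumed value of CHUNK_ALIGN_BYTES */
#define TOY_MAX_CLASSES 64   /* assumed capacity of the class table */

typedef struct {
    unsigned int size;       /* item size of this class */
    unsigned int perslab;    /* items per slab */
} toy_class_t;

static toy_class_t toy_classes[TOY_MAX_CLASSES];

/* Route b: grow the item size by `factor` from `base` (step 1 is done by
 * the caller, who passes chunk size plus metadata bytes as `base`).
 * Classes are stored from index 1; returns the index of the last class. */
static int toy_init_by_factor(unsigned int base, double factor,
                              unsigned int item_size_max) {
    assert(base > 0 && factor > 1.0 && item_size_max >= base);
    unsigned int size = base;
    int i = 1;
    for (; i < TOY_MAX_CLASSES - 1 && size < item_size_max / factor; i++) {
        if (size % TOY_ALIGN)                                   /* step 2 */
            size += TOY_ALIGN - (size % TOY_ALIGN);
        toy_classes[i].size = size;
        toy_classes[i].perslab = item_size_max / size;          /* step 3 */
        size = (unsigned int)(size * factor);                   /* step 4 */
    }
    toy_classes[i].size = item_size_max;                        /* step 5 */
    toy_classes[i].perslab = 1;
    return i;
}
```

With an assumed base of 96 bytes, a factor of 1.25, and a 1 MB slab, the first classes come out as 96, 120, 152, and 192 bytes. Because each size is aligned before it is multiplied, the progression only approximates a pure geometric series.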


Originally published at holmeshe.me.