TLDR Using cgroups to limit I/O by André Carvalho

Cgroups v1 (and hence Docker too) can't limit non-O_DIRECT I/O, while cgroups v2 can!

pavel trukhanov
some-tech-tldrs
2 min read · Aug 2, 2018


Docker relies on a Linux kernel feature called cgroups to limit a process's resource usage (CPU, memory, disk IOPS and bytes per second).

Managing cgroups is done by interacting with the cgroup filesystem, by creating directories and writing to certain files.
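As a rough illustration of "creating directories and writing to certain files", here is a minimal Python sketch. The helper names `create_cgroup` and `add_process` are made up for this example. `cgroup.procs` is the real kernel file that lists the PIDs in a group. On a real machine the root would be the cgroup mount (commonly `/sys/fs/cgroup`, which is an assumption about your system) and you would need root privileges; the demo uses a temporary directory so it runs anywhere without touching the kernel.

```python
import os
import tempfile
from pathlib import Path


def create_cgroup(root: Path, name: str) -> Path:
    """Create a control group by making a directory under the cgroup mount."""
    group = Path(root) / name
    group.mkdir(parents=True, exist_ok=True)
    return group


def add_process(group: Path, pid: int) -> None:
    """Move a process into the group by writing its PID to cgroup.procs."""
    (group / "cgroup.procs").write_text(f"{pid}\n")


if __name__ == "__main__":
    # A temporary directory stands in for the real cgroup filesystem here,
    # so nothing is actually limited; it only shows the file operations.
    with tempfile.TemporaryDirectory() as tmp:
        group = create_cgroup(Path(tmp), "demo")
        add_process(group, os.getpid())
        print((group / "cgroup.procs").read_text().strip())
```

On the real cgroup filesystem the kernel creates `cgroup.procs` and the other control files itself as soon as the directory appears.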

Cgroups v1 has a per-resource (memory, blkio etc) hierarchy of control groups.

Each process resides in exactly one cgroup per resource. If a process is not assigned to a specific cgroup for a resource, it is in the root cgroup for that particular resource.
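You can see this per-resource membership in `/proc/<pid>/cgroup`, where each line has the form `hierarchy-ID:controller-list:cgroup-path`. The sketch below parses that format. The function name, the `"unified"` label for the cgroups v2 entry (which has an empty controller list), and the sample paths are assumptions made for illustration.

```python
def parse_proc_cgroup(text):
    """Map each controller name to the cgroup path the process is in."""
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice: the cgroup path itself may contain colons.
        _hierarchy_id, controllers, path = line.split(":", 2)
        # An empty controller list marks the cgroups v2 (unified) entry.
        names = controllers.split(",") if controllers else ["unified"]
        for name in names:
            membership[name] = path
    return membership


# Hypothetical contents of /proc/<pid>/cgroup on a cgroups v1 host: the
# process sits in a named memory cgroup but in the root blkio cgroup.
SAMPLE = """\
4:memory:/docker/abc
3:blkio:/
2:cpu,cpuacct:/docker/abc
"""

if __name__ == "__main__":
    for controller, path in sorted(parse_proc_cgroup(SAMPLE).items()):
        print(controller, path)
```

To inspect a live process, pass the contents of `/proc/self/cgroup` instead of `SAMPLE`.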

Important: even if two cgroups, one for resourceA (say, block I/O) and one for resourceB (say, memory), have the same name (that's how Docker does it), they are independent.

Why this matters

If we try to limit the disk write throughput of some process, cgroups v1 can't do that, because written data (usually) goes into the kernel page cache first and is only flushed to disk later.

In cgroups v1, different resources/controllers (memory, blkio) live in different hierarchies on the filesystem, so even when those cgroups have the same name, they are completely independent. When a memory page is finally flushed to disk, there is no way to know which blkio cgroup wrote that page.

For I/O limits to work when the I/O goes through the page cache, both the memory and io controllers have to be managed by cgroups v2! Because cgroups v2 has only a single hierarchy of control groups (instead of one hierarchy per resource), page cache writeback can be charged to the process that did the I/O, even if the kernel performs it asynchronously.
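To make the difference concrete, here is a hedged sketch of the strings you would write in each version. The file names (`blkio.throttle.write_bps_device`, `io.max`, `cgroup.subtree_control`) come from the kernel's cgroup interface; the Python helper names and the 1 MiB/s figure are made up for illustration, and none of this writes to the real cgroup filesystem.

```python
import os


def device_id(dev_path):
    """Return the "MAJOR:MINOR" string for a block device such as /dev/sda."""
    rdev = os.stat(dev_path).st_rdev
    return f"{os.major(rdev)}:{os.minor(rdev)}"


def v1_write_bps_line(dev, bytes_per_sec):
    """Line to write to blkio.throttle.write_bps_device (cgroups v1)."""
    return f"{dev} {bytes_per_sec}"


def v2_io_max_line(dev, wbps):
    """Line to write to io.max (cgroups v2); other limits keep their defaults."""
    return f"{dev} wbps={wbps}"


# In v2, both controllers must be enabled for the child groups by writing
# this to the parent's cgroup.subtree_control before setting io.max.
V2_SUBTREE_CONTROL = "+io +memory"

if __name__ == "__main__":
    dev = "8:0"  # assumed MAJOR:MINOR; use device_id("/dev/sda") on a real host
    limit = 1024 * 1024  # 1 MiB/s, an arbitrary example value
    print(v1_write_bps_line(dev, limit))
    print(V2_SUBTREE_CONTROL)
    print(v2_io_max_line(dev, limit))
```

The v1 line only throttles writes that bypass the page cache, while the v2 line also covers buffered writes once both controllers are enabled in the same hierarchy.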

Try it yourself (check whether you need to change /dev/sda to a different device)

then

VS

TLDR

Cgroups v1 suck, cgroups v2 are better =)

Docker I/O limits, like --device-read-iops, --device-write-iops, --device-read-bps and --device-write-bps, will work only for O_DIRECT I/O, or once Docker starts to use the unified cgroups v2 hierarchy.
