CPU Throttling: Unbundled

raman gupta
Dec 19, 2022

Overview

Resource usage metrics like CPU and memory used to be the two most relevant ones, but with support for Cgroups and the CPU bandwidth control mechanism in the Linux kernel, CPU throttling has also become one of the critical metrics of interest in containerized deployments when debugging or diagnosing latency-related issues. Containerized apps can experience CPU throttling even when overall host machine CPU usage is low. Cgroups (control groups) are a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, network, etc.) of a collection of processes.

When hard CPU limits are set in a container orchestrator, the kernel’s scheduler (CFS, the Completely Fair Scheduler) uses bandwidth control mechanisms to enforce those limits.

CPU time allocation

CPU time is divided into periods (100 ms by default), and every Cgroup is allocated a quota within that period. The quota defines the maximum time, in microseconds, for which the processes in that Cgroup are allowed to run. For instance, if it is set to half of the CPU period, that Cgroup will only be able to run for 50% of the time, i.e. 50 ms. The quota is an aggregate over all CPUs in the system; therefore, to allow the total usage of two CPUs, for instance, one should set this value to twice the value of the CPU period.
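
As a back-of-the-envelope illustration (a minimal sketch, not tied to any particular runtime), the quota for a desired number of CPUs is simply that CPU count multiplied by the period:

```python
# Minimal sketch: deriving a CFS quota from a desired CPU count.
# Assumes the default 100 ms (100000 us) period; adjust if yours differs.

PERIOD_US = 100_000  # length of one scheduling period in microseconds

def quota_for_cpus(cpus: float, period_us: int = PERIOD_US) -> int:
    """Quota in microseconds per period that allows `cpus` worth of CPU time."""
    return int(cpus * period_us)

print(quota_for_cpus(0.5))  # 50000  -> half a CPU, i.e. 50 ms per 100 ms period
print(quota_for_cpus(2))    # 200000 -> two CPUs' worth of time per period
```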

The following are the main configuration parameters that manage CPU time allocation to Cgroups:

CPU shares

CPU shares (cpu_shares) are a feature of Linux control groups (Cgroups). CPU shares control how much CPU time a process in a container can use. A container in this context means a set of processes running in the same Cgroup. This definition is applicable to:

  • Docker containers
  • Pods in Kubernetes
  • Any systems that use Cgroups

A CPU share value is always relative and has no meaning in isolation. Setting one container’s cpu_shares to 512 and another container’s to 1024 means that the second container will get twice as much CPU time as the first.

For example, if we have three containers on machines with one and two cores respectively, the following would be the CPU allocation for different cpu_shares configurations.

Figure: CPU shares allocation

The above figures for how much CPU time a container has access to are only valid if every container wants to execute at the same time. If only one container is active, it can use all of the CPU. If more than one container executes at once, the relative cpu_shares values dictate how much CPU time each container has access to. The cpu_shares of inactive containers are irrelevant. So one way to think about shares is that they provide lower-bound provisioning.
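
To make this concrete, here is a minimal sketch of that arithmetic (the container names and share values are hypothetical; only containers that are actively competing for CPU count):

```python
# Minimal sketch: how cpu_shares split CPU time between *active* containers.
# Container names and share values are hypothetical examples.

def cpu_allocation(shares: dict[str, int], active: set[str], num_cores: int) -> dict[str, float]:
    """Cores each active container can use when all of them are CPU-bound."""
    total = sum(shares[name] for name in active)
    return {name: num_cores * shares[name] / total for name in active}

shares = {"a": 1024, "b": 512, "c": 512}

# All three busy on a two-core machine: a gets 1 core, b and c get 0.5 each.
print(cpu_allocation(shares, active={"a", "b", "c"}, num_cores=2))

# Only b is busy: its share value is irrelevant, it can use both cores.
print(cpu_allocation(shares, active={"b"}, num_cores=2))
```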

Docker uses a base value of 1024 for cpu_shares. To give a container relatively more CPU, set --cpu-shares for docker run to a value greater than 1024. To give a container relatively less CPU time, set --cpu-shares to a value lower than 1024.

cpu_period

The duration in microseconds of each scheduler period. It defaults to 100000 us, or 100 ms. Larger periods improve throughput at the expense of latency, since the scheduler can sustain a CPU-bound workload for longer before throttling it.

cpu_quota

The maximum time in microseconds during each cpu_period for which the processes in the current Cgroup are allowed to run. For instance, if it is set to half of cpu_period, the Cgroup will only be able to run for 50% of each period. It has a default value of -1, which means that no upper bound is set on usage.

If --cpus or --cpu-quota is set, the container will be throttled once it exhausts its quota, even if there is no contention for CPU time. This can be good for predictable resource usage. So these configs provide upper-bound provisioning.
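
For illustration, here is a minimal sketch that reads these values for the current Cgroup (paths assume cgroup v1 mounted under /sys/fs/cgroup; on cgroup v2 the same information lives in a single cpu.max file):

```python
# Minimal sketch: read the effective CPU limit of the current cgroup (cgroup v1).
# Assumes cgroup v1 mounted at /sys/fs/cgroup; cgroup v2 exposes the same data
# in a single "cpu.max" file as "<quota> <period>".

CGROUP_CPU = "/sys/fs/cgroup/cpu"

def effective_cpu_limit() -> float | None:
    with open(f"{CGROUP_CPU}/cpu.cfs_quota_us") as f:
        quota_us = int(f.read())
    with open(f"{CGROUP_CPU}/cpu.cfs_period_us") as f:
        period_us = int(f.read())
    if quota_us == -1:           # -1 means no upper bound is set
        return None
    return quota_us / period_us  # e.g. 200000 / 100000 -> a limit of 2.0 CPUs

print(effective_cpu_limit())
```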

Bandwidth control system

The bandwidth allowed for a Cgroup is specified using a quota and a period (as described above). The kernel controls the resource consumption of Cgroups through CPU throttling: when processes in a container use more CPU time than the quota specifies, they are throttled. The CPU time they can use is limited, and key latency indicators of those processes deteriorate. Throttled processes will not run again until the next period, when the quota is replenished.

The quota does not limit how many cores the container can use at any given moment. If a container wants to use more cores than its quota corresponds to, it can do so for a short part of the period, but it will then be throttled for the remainder of that period.

The quota is expressed in terms of time, not cores, so if a thread pool busies all the CPUs, the quota is consumed quickly and the container is throttled.

CPU Throttling

In Figure 1.1, an application with a quota of two cores and two runnable threads on a four-core machine is not throttled, since it does not exceed its allocated quota of 200 ms per period.

In Figure 1.2, an application with a quota of two cores and four runnable threads on a four-core machine will be throttled: by utilizing four cores it consumes its entire quota in 50 ms and is throttled for the rest of the period. It is scheduled again in the next period.
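
The arithmetic behind these two scenarios can be sketched as a toy model (assuming fully CPU-bound threads and the quota and period values from the text):

```python
# Toy model of the scenarios above: how long runnable CPU-bound threads can run
# within one period before the quota is exhausted, and how long they are throttled.

PERIOD_MS = 100.0

def run_and_throttle(quota_cores: int, runnable_threads: int) -> tuple[float, float]:
    quota_ms = quota_cores * PERIOD_MS                    # CPU-time budget per period
    run_ms = min(PERIOD_MS, quota_ms / runnable_threads)  # wall-clock time before throttling
    throttled_ms = PERIOD_MS - run_ms                     # remainder of the period spent throttled
    return run_ms, throttled_ms

print(run_and_throttle(quota_cores=2, runnable_threads=2))  # (100.0, 0.0) -> not throttled
print(run_and_throttle(quota_cores=2, runnable_threads=4))  # (50.0, 50.0) -> throttled for 50 ms
```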

Kubernetes Behaviour

A Kubernetes CPU resource can be set as a request or a limit. When set as a request, cpu_shares are used to specify how much of the available CPU cycles a container gets. A Kubernetes limit results in cpu_quota and cpu_period being used.

The Kubernetes scheduler uses the request value for scheduling. In Kubernetes, the total CPU requested by the containers scheduled onto a node cannot exceed the actual number of CPUs on that node. This means that cpu_shares, when used in Kubernetes, carry more information than when used in isolation: the total of the cpu_shares of all containers on a Kubernetes node cannot exceed (number of cores) x 1024. As a result, the request value acts as a guaranteed minimum amount of CPU time for a container.
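
To make the mapping concrete, here is a rough sketch of how CPU requests and limits translate into the underlying Cgroup values (simplified; it ignores the kubelet's rounding and minimum-share handling):

```python
# Rough sketch of how Kubernetes CPU requests/limits map onto cgroup settings.
# Simplified: ignores rounding details and minimum-share handling in the kubelet.

PERIOD_US = 100_000  # default CFS period

def request_to_shares(request_millicores: int) -> int:
    # A request of 1000m (one core) corresponds to the base value of 1024 shares.
    return request_millicores * 1024 // 1000

def limit_to_quota_us(limit_millicores: int, period_us: int = PERIOD_US) -> int:
    # A limit of 1000m (one core) corresponds to a quota of one full period.
    return limit_millicores * period_us // 1000

print(request_to_shares(500))   # 512    -> "requests: cpu: 500m"
print(limit_to_quota_us(2000))  # 200000 -> "limits: cpu: 2", i.e. 200 ms per 100 ms period
```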

Key Metrics

The following are the key metrics to look for to diagnose CPU throttling-related issues.

  • container_cpu_cfs_throttled_periods_total: Number of period intervals in which the container was throttled.
  • container_cpu_cfs_periods_total: Number of elapsed period intervals.
  • container_cpu_cfs_throttled_seconds_total: Total time for which the container has been throttled.
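
One simple way to combine these counters is the fraction of elapsed periods in which the container was throttled. Here is a sketch of that calculation from raw counter samples (in practice you would apply a rate/increase function over a time window in your monitoring system; the sample values are hypothetical):

```python
# Sketch: fraction of elapsed CFS periods in which the container was throttled,
# computed from two samples of the counters listed above.

def throttled_ratio(throttled_periods: tuple[int, int], total_periods: tuple[int, int]) -> float:
    """Each argument is (earlier_sample, later_sample) of the raw counter."""
    throttled = throttled_periods[1] - throttled_periods[0]
    total = total_periods[1] - total_periods[0]
    return throttled / total if total else 0.0

# Hypothetical samples taken five minutes apart:
print(throttled_ratio((1200, 1500), (30000, 33000)))  # 0.1 -> throttled in 10% of periods
```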

The following are common reasons for containers hitting CPU throttling in Kubernetes:

  • Frequent garbage collection, often caused by an incorrect GC configuration, consumes CPU cycles and can push the application into throttling.
  • A low CPU limit on a container causes the application to hit that limit quickly, causing it to be throttled.

Appendix

  1. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/resource_management_guide/ch01
  2. https://kernel.googlesource.com/pub/scm/linux/kernel/git/glommer/memcg/+/cpu_stat/Documentation/cgroups/cpu.txt
  3. https://docs.kernel.org/scheduler/sched-bwc.html
  4. https://github.com/google/cadvisor/blob/master/docs/storage/prometheus.md
