Solving a CPU throttling issue in Golang applications that throttle before hitting their CPU limit in Kubernetes

Sharyash
5 min read · May 4, 2024


We faced an issue in our Kubernetes cluster where certain multi-threaded Golang applications with a CPU limit set were being throttled before reaching their designated limits. For instance, pods with resources.requests.cpu set to 2 and resources.limits.cpu set to 3 were throttled despite their actual CPU usage remaining around 1 core:

CPU usage, requests, and limits for a deployment containing three pods
Throttling percentage of that deployment

How does the CPU limit work?

To investigate the cause of this issue, we began by examining how CPU requests and CPU limits function:

As outlined in the above link, CPU limits, unlike CPU requests (which use the CPU shares mechanism), rely on the bandwidth control mechanism. Under bandwidth control, a process may occupy at most a fixed amount of CPU time within a specific interval called cpu.cfs_period_us (typically 1/10 of a second, i.e. 100,000 microseconds). The maximum amount of CPU time the process can consume within this interval is defined by the cpu.cfs_quota_us parameter.
For example, a CPU limit of 100m (i.e. 100/1000 of a core) means that within each cpu.cfs_period_us, the process can utilize up to 1/10 of the period, which is equivalent to 10,000 microseconds.
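The arithmetic above can be sketched in a few lines of Go. This is a minimal illustration, assuming the default 100,000 µs period; the function name is ours, not part of any Kubernetes API:

```go
package main

import "fmt"

// cfsQuotaMicros converts a Kubernetes CPU limit expressed in millicores
// into the cpu.cfs_quota_us value for a given cpu.cfs_period_us.
func cfsQuotaMicros(limitMillicores, periodMicros int64) int64 {
	return limitMillicores * periodMicros / 1000
}

func main() {
	// A 100m limit with the default 100,000 µs period yields 10,000 µs
	// of CPU time per period, as described above.
	fmt.Println(cfsQuotaMicros(100, 100000)) // 10000
	// A limit of 3 full cores yields 300,000 µs per 100,000 µs period.
	fmt.Println(cfsQuotaMicros(3000, 100000)) // 300000
}
```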

When a process attempts to consume more CPU time than this quota within a cpu.cfs_period_us, the kernel restricts it for the remainder of the period; this is commonly referred to as throttling. For a clear understanding of this concept, the “How container CPU constraints work” section in the provided link is a helpful reference.

The history of this issue and how the Linux kernel bug was fixed:

As mentioned in the above series, there was a bug in the Linux kernel related to the issue we encountered. However, it was fixed in kernel version 5.4 and beyond. So why do we encounter this throttling problem in later versions? Is this something new that hasn’t previously been taken into consideration?

Investigating how CPU limits function on multi-threaded processes:

Considering what’s been discussed, we need to delve deeper into examining CPU limits in multi-threaded processes.

According to the cgroups documentation, a CPU limit on a multi-threaded process means that the aggregate CPU time of all its threads within the cpu.cfs_period_us time frame must not surpass the limit defined by cpu.cfs_quota_us. Now let’s get a bit more specific with an example:

Suppose we set the CPU limit for a pod to 3, meaning it can utilize a maximum of 3 * cpu.cfs_period_us (3/10 of a second) across all available CPU cores within each cpu.cfs_period_us. A process with many threads may reach this maximum allowed value more quickly: with 50 threads, those threads could collectively consume 3 * cpu.cfs_period_us well before the period ends. But with only 2 threads, the process can consume at most 2 * cpu.cfs_period_us per period, so it can never reach the 3 * cpu.cfs_period_us threshold, thereby avoiding throttling.
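The effect of thread count on how quickly the quota is exhausted can be sketched numerically. This is an illustrative calculation under the simplifying assumption that every thread runs at 100% on its own core; the function name is ours:

```go
package main

import "fmt"

// microsUntilThrottled estimates how much wall-clock time elapses before a
// cgroup with quotaMicros of CPU time per period exhausts its quota, when
// nThreads threads each run flat out on their own core.
func microsUntilThrottled(quotaMicros, nThreads int64) int64 {
	return quotaMicros / nThreads
}

func main() {
	quota := int64(300000) // a limit of 3 cores with a 100,000 µs period

	// 50 busy threads burn the quota in 6,000 µs of wall-clock time; the
	// cgroup is then throttled for the remaining 94,000 µs of the period.
	fmt.Println(microsUntilThrottled(quota, 50)) // 6000

	// 2 busy threads would need 150,000 µs to burn the same quota, which
	// is longer than the 100,000 µs period, so throttling never triggers.
	fmt.Println(microsUntilThrottled(quota, 2)) // 150000
}
```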

The hypothesis is that increasing the number of concurrent threads of a single process accelerates reaching its CPU limit. Consequently, if it reaches its limit within the cpu.cfs_period_us interval, it leads to throttling.

Hypothesis: increasing the number of concurrent threads of a single process accelerates reaching the CPU limit of that process, so the left process is more likely to be throttled with the same CPU limit as the right one.

Assessing the accuracy of the given hypothesis:

To investigate whether increasing the number of concurrently running threads leads to reaching the limit more quickly and consequently causes throttling, we utilized go-cpu-load with different combinations of core counts and CPU usage percentages. In this test, a go-cpu-load pod with varying configurations was deployed on a node with 96 cores available. The key aspects of the configurations we tested were:

  • The pod was assigned resources.requests.cpu: 2 and resources.limits.cpu: 3.
  • The number of cores the pod could utilize was adjusted using the --coresCount or -c parameter.
  • The percentage of CPU usage for each of those cores was set using the --percentage or -p parameter.
Each row of the above table shows a specific test configuration with its resulting throttling percentage.

The throttling percentage is determined by the following formula: throttling percentage = nr_throttled / nr_periods

  • nr_throttled: the number of periods in which the application used its entire quota and was throttled.
  • nr_periods: the number of periods in which any thread in the cgroup was runnable.

(The nr_throttled and nr_periods values for a container can be found in the /sys/fs/cgroup/[kubernetes_pod]/cpu.stat file.)
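Computing the throttling percentage from cpu.stat contents is straightforward; a minimal sketch in Go follows. The function name is ours, and the sample counter values are assumptions chosen to match the ~13% over 700 periods reported below, not values taken from the actual experiment:

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// throttlingPercentage parses the text of a cgroup cpu.stat file and
// returns nr_throttled / nr_periods as a percentage.
func throttlingPercentage(cpuStat string) float64 {
	var nrPeriods, nrThrottled float64
	sc := bufio.NewScanner(strings.NewReader(cpuStat))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			continue
		}
		switch fields[0] {
		case "nr_periods":
			nrPeriods = v
		case "nr_throttled":
			nrThrottled = v
		}
	}
	if nrPeriods == 0 {
		return 0
	}
	return 100 * nrThrottled / nrPeriods
}

func main() {
	// Hypothetical cpu.stat contents: 91 throttled periods out of 700
	// gives exactly 13% throttling.
	stat := "nr_periods 700\nnr_throttled 91\nthrottled_time 123456\n"
	fmt.Printf("%.0f%%\n", throttlingPercentage(stat)) // 13%
}
```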

We started with 96 concurrent threads running across all cores, each core utilized at 3%, for a total usage of 2.88 cores. This led to a throttling percentage of 13% over 700 periods. With a fixed CPU limit and consistent overall CPU usage (the load per thread increases as the number of threads decreases), the fewer the threads, the lower the throttling. In the last experiment, despite each thread imposing a 96% load on its core, the throttling percentage remained remarkably low at 0.05%, with only 2 occurrences out of 4,000 periods.

Conclusion:

Theoretical conclusion: In multi-threaded Golang applications (with GOMAXPROCS equal to or greater than the CPU limit), the closer GOMAXPROCS is to the CPU limit, the less throttling occurs. This is because reducing the number of threads decreases the likelihood that the total CPU consumption reaches the maximum allowed within a given period.

Practical conclusion: Use automaxprocs, which automatically sets GOMAXPROCS to match the configured CPU limit.
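In practice, enabling automaxprocs is a one-line blank import (import _ "go.uber.org/automaxprocs") in the main package. The sketch below illustrates the underlying idea with the standard library only, assuming the quota and period are already known; it is a simplified stand-in for what the library does by reading cpu.cfs_quota_us and cpu.cfs_period_us (or cpu.max on cgroup v2), and the function name is ours:

```go
package main

import (
	"fmt"
	"runtime"
)

// maxProcsFromQuota derives a GOMAXPROCS value from the cgroup CPU quota so
// the Go runtime never runs more busy threads than the limit allows.
func maxProcsFromQuota(quotaMicros, periodMicros int64) int {
	if quotaMicros <= 0 || periodMicros <= 0 {
		return runtime.NumCPU() // no limit set: fall back to all cores
	}
	procs := int(quotaMicros / periodMicros) // round the quota down to whole cores
	if procs < 1 {
		procs = 1 // never go below one P
	}
	return procs
}

func main() {
	// A CPU limit of 3 (quota 300,000 µs per 100,000 µs period) yields
	// GOMAXPROCS=3 instead of the 96 cores visible on the node.
	procs := maxProcsFromQuota(300000, 100000)
	runtime.GOMAXPROCS(procs)
	fmt.Println(procs) // 3
}
```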

No throttling occurred after applying automaxprocs
The latency of our service’s methods decreased once the pod’s process was no longer subject to throttling
