Why am I so scaled in EKS?

Chris Farmer
Just Eat Takeaway-tech
4 min read · Jun 3, 2024

📖 Background

We’ve had one of our services deployed in AWS as an EKS application for nearly a year now. EKS is one of Amazon’s managed Kubernetes offerings which you can read more about here. In spite of a few bumps in the road, it’s been a huge improvement over the virtual server (Amazon EC2) pipeline we had previously, with simpler and faster deployments.

However, we were recently alerted to the number of pods in our deployment being pinned at our max of 10.

📈 Why were we so scaled?

From looking at the custom dashboard we’d created for our application, there was no obvious reason why we were scaled out. All metrics looked well within their configured request values, except for the Istio Proxy sidecar containers, which were in some cases at over 100% utilisation of their requested memory.

😱 How did it happen?

We’re not 100% sure, but our assumption is that increased traffic pushed Istio Proxy’s memory usage above its default request of 128 MiB. The reason Istio was able to keep functioning so far above its requested memory is that the default memory limit for the Istio container is 1 GiB. That is 800% of the requested memory and certainly counter to our internal best practice.
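For reference, an injected istio-proxy container running with the stock defaults carries resources along these lines (illustrative values based on Istio’s default injection settings; your mesh configuration may differ):

containers:
  - name: istio-proxy
    resources:
      requests:
        cpu: 100m        # default CPU request
        memory: 128Mi    # default memory request
      limits:
        cpu: 2000m       # default CPU limit
        memory: 1Gi      # default memory limit, 8x the request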

😌 How was it resolved?

Istio has a list of annotations it supports for overriding configuration, which allows us to modify the memory requests and limits. On our first iteration, we used those annotations to raise that lower bound and align ourselves with our internal recommendation that the limit and request values should match. For more info, have a look at this super informative article by Natan Yellin — What Everyone Should Know About Kubernetes Memory Limits, OOMKilled Pods, and Pizza Parties | Robusta.

deployment:
  annotations:
    sidecar.istio.io/proxyMemory: "256Mi"
    sidecar.istio.io/proxyMemoryLimit: "256Mi"
    ...

After deploying this change, the memory utilisation looked better straight away, but we were still seeing our pods pinned at the max configured replicas.

It turns out that by adding these annotations, we had inadvertently dropped all of the globally configured defaults. These included the CPU request for Istio Proxy, and losing that request is what caused the scaling behaviour we were seeing.

By adding another annotation to restore the value to the global default and redeploying, we finally saw our deployment scale in towards our configured minimum.

deployment:
  annotations:
    ...
    sidecar.istio.io/proxyCPU: "100m"
    ...
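Putting the two changes together, the full set of sidecar overrides ended up looking roughly like this (values shown are the ones we settled on; tune them to your own workload):

deployment:
  annotations:
    # Keep memory request and limit equal, per our internal guideline
    sidecar.istio.io/proxyMemory: "256Mi"
    sidecar.istio.io/proxyMemoryLimit: "256Mi"
    # Restore the CPU request that the overrides had dropped
    sidecar.istio.io/proxyCPU: "100m"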

🩺 What does healthy look like?

The results of the changes slowly became apparent.

The algorithm that governs the Horizontal Pod Autoscaler (HPA) is described as the following:

From the most basic perspective, the HorizontalPodAutoscaler controller operates on the ratio between desired metric value and current metric value:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

For example, if the current metric value is 200m, and the desired value is 100m, the number of replicas will be doubled, since 200.0 / 100.0 == 2.0. If the current value is instead 50m, you'll halve the number of replicas, since 50.0 / 100.0 == 0.5. The control plane skips any scaling action if the ratio is sufficiently close to 1.0 (within a globally-configurable tolerance, 0.1 by default).

As mentioned above, there is a 10% tolerance on the ratio between the current and desired metric values, meaning that, in our case, with a target average utilisation of 60%, scale-in would only happen below 54% utilisation. There is also a cooldown period of 5 minutes when scaling in.
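As an illustration (a sketch rather than our actual manifest, with the resource names assumed), an HPA with that target and cooldown looks something like this:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-service                       # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60         # scale-in only once usage drops below ~54% (10% tolerance)
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300    # the 5-minute scale-in cooldown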

Note that the Horizontal Pod Autoscaler takes into account the requests of all containers within the pod. This is important to understand since Istio Proxy is injected into some pods and brings its own requests (and limits), which affect the above calculation. From Kubernetes 1.27 it is possible to scale based on the resource utilisation of individual containers, so keep an eye out for the upgrade! (More details: Kubernetes 1.27: HorizontalPodAutoscaler ContainerResource type metric moves to beta | Kubernetes)
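For example, a ContainerResource metric can target the application container specifically, so the sidecar’s usage no longer skews the calculation; a sketch, with the container name assumed:

metrics:
  - type: ContainerResource
    containerResource:
      name: cpu
      container: my-app              # hypothetical container name; istio-proxy usage is ignored
      target:
        type: Utilization
        averageUtilization: 60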

We were finally down to 2 pods (our minimum configured replicas) with both Istio and our primary application looking much more comfortable in their newly configured bounds.

A bit more about Istio Proxy

Requests and limits

Istio Proxy deviates from the recommendation to set memory requests and limits to the same value so that we can be more efficient with overall resource utilisation in the cluster. Istio Proxy is injected into a variety of different workload profiles, and it is practically impossible to set a generic memory request and limit that works well for all of them while still maximising resource utilisation on nodes. It is worth noting, therefore, that metrics should be carefully monitored after overriding the defaults to ensure memory utilisation does not become an issue.

Why we use it

  • Improved observability: more insight is available for traffic flowing into and out of pods through metrics, traces and logs.
  • Extra traffic management capabilities
  • Extra security through mutual TLS, OIDC integrations, etc.

Therefore, consider skipping Istio Proxy injection if the above benefits of a service mesh are not required and the service is well covered by other telemetry data, as this will simplify the configuration of requests and limits.

For worker services (such as those dedicated to message processing), there’s no need to configure Istio since there is no API traffic flowing to the pods.
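If injection is enabled by default in the namespace, it can be switched off per workload. A sketch using Istio’s injection annotation on the pod template (names and image are placeholders; newer Istio releases prefer the equivalent label, so check your version’s documentation):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-worker                          # hypothetical worker deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-worker
  template:
    metadata:
      labels:
        app: my-worker
      annotations:
        sidecar.istio.io/inject: "false"   # skip Istio sidecar injection for this pod
    spec:
      containers:
        - name: worker
          image: example/worker:latest     # placeholder image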

Want to come work with us at Just Eat Takeaway.com? Check out our open roles.
