Ben Boral
Feb 8, 2017 · 3 min read

My group at RetailMeNot is experimenting with Kubernetes for container management, and I recently spent a day pushing pods to their limits.

Memory Test

To test what happens when a pod’s memory gets maxed out, we ran a simple python program:

***Max-Out Memory***
giant_string = ''
while True:
    with open('long_file.txt', 'r') as f:
        for line in f:
            giant_string += line
The program reads a very large text file and continuously appends it to a growing string. The outcome we observed was that memory consumption grew until the node transitioned to a “Node Not Ready” state. The offending pod and its neighbor pods on that node were then rescheduled elsewhere.

The idea of a rogue pod knocking out a node from a cluster is not so nice. To avoid this, we can add a resource memory limit to each Kubernetes Deployment.

***Deployment Configuration YAML***
apiVersion: extensions/v1beta1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - resources:
          limits:
            memory: {nodeMemoryLimit} # for example, 1Gi

When we reran the same test with a memory limit, we observed that when the memory limit was hit, the offending pod was killed, rescheduled, and restarted. The other pods on the node were safe.

The graph below shows memory consumption by our node. You see two spikes on the graph. The first spike shows when we started up our hungry pod. The valley shows when our hungry pod got killed by Kubernetes, and the second spike shows how our pod was immediately restarted and began hogging memory again.

CPU Test

To test maxing out CPU in a pod, we load tested a website whose performance is CPU bound. In our load test, the CPU for the entire node got pegged to 100%. However, the node did not fail; it just got slow.
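For reference, a minimal CPU-bound worker in the same spirit as our memory test looks like the sketch below. This is a hypothetical stand-in, not our actual load test, which hit a real website; the function name is ours.

```python
def burn_cpu(iterations):
    """Busy-loop that keeps one core pegged doing pure arithmetic.

    Running this in a pod with a large iteration count drives CPU
    toward 100% for as long as the loop runs.
    """
    x = 0
    for i in range(iterations):
        x += i * i
    return x
```

Unlike the memory test, this never grows the pod's memory footprint, so the only symptom on the node is CPU saturation.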

To overcome this, one option would be to use the same strategy as our memory problem: set a cpu resource limit in the Deployment configuration. The problem with this strategy is that CPU, unlike memory, is a compressible resource: a pod that exceeds its CPU limit is throttled rather than killed, so the application would still be slow — just confined to its own pod instead of dragging down the whole node.

Luckily, Kubernetes offers horizontal pod autoscaling based on CPU consumption.

To take advantage of this, we add a HorizontalPodAutoscaler to our Kubernetes configuration.

***Horizontal Pod Autoscaling (HPA) Configuration YAML***
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: {projectName}
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: {projectName}
  minReplicas: {minReplicas} # for example, 1
  maxReplicas: {maxReplicas} # for example, 2
  targetCPUUtilizationPercentage: {cpuScalingThreshold} # a percentage, for example 80

When we reran the load test using horizontal pod autoscaling, we saw that when the CPU threshold was crossed, Kubernetes scheduled additional pods across our cluster.
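The rule HPA applies here is roughly: desired replicas = ceil(current replicas × observed CPU utilization ÷ target utilization), clamped to the configured min/max. A small sketch of that arithmetic (the function name and clamping are ours, not a Kubernetes API):

```python
import math

def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct,
                     min_replicas, max_replicas):
    """Sketch of the HPA proportional scaling rule: scale replica count
    by the ratio of observed to target CPU utilization, then clamp to
    the configured minReplicas/maxReplicas bounds."""
    desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, desired))
```

With a target of 80%, a single pod averaging 160% of its requested CPU scales out to two pods; once average utilization falls back under the target, the count scales back down to the minimum.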

Concluding Thoughts

It is great that Kubernetes HPA allows us to scale based on CPU usage. In the future, we would love to see Kubernetes offer autoscaling based on memory consumption (not yet available at the time of writing). That has been discussed in the Kubernetes community. In its absence, we will use memory resource limits in our Deployments to ensure that a rogue pod doesn’t take down its node.

It’s also important to understand how your application will fail. One way to do this is by running a simple load test and observing which resource spikes (CPU, memory, or something else). With that knowledge, you can decide if horizontal pod autoscaling will work for you.

RetailMeNot Engineering

Saving The World Money Since ‘09

Thanks to Tom Dickman

Written by Ben Boral, Software Engineer at RetailMeNot Inc. working on mobile and web applications.
