Kubernetes: Horizontal Pod Scaling

Jonathan Campos
Google Cloud - Community
6 min read · Jun 5, 2018

With Pod Autoscaling, your Kubernetes Cluster can monitor the load on your existing Pods and determine whether more Pods are needed. This is one of the biggest benefits of using Kubernetes: you save yourself from overloading individual Pods, which leads to unexpected code behavior and various faults. There are ways for you to control this Pod Autoscaling, along with best practices around it. That is the purpose of this article.

Replicating Pods!

If you haven't gone through, or even read, the first part of this series, you might be lost, wondering where the code is, or unsure what was done previously. Remember, this assumes you're using GCP and GKE. I will always provide the code and show how to test that the code is working as intended.

Pod Autoscaling

As we've discussed before, Kubernetes runs Docker containers inside a unit called a Pod, the smallest object Kubernetes schedules and manages in the cluster. With Autoscaling, Kubernetes watches the resource metrics of each Pod and determines whether we need more or fewer Pods. I am being vague on purpose by saying “resource metrics” since you can create custom metrics based on your application’s needs. The most common is CPU utilization.

In other words, Kubernetes watches the average CPU utilization over a window of seconds and, based on that utilization, adds or removes Pods. We use the average CPU utilization to reduce the noise from spikes.
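If you want to see the raw numbers the autoscaler works from, you can peek at per-Pod resource usage yourself. A quick sketch, assuming a metrics source (metrics-server or Heapster) is available in the cluster, which GKE clusters provide by default:

# show current CPU and memory usage per Pod
$ kubectl top pods
# and per node, to see cluster-wide headroom
$ kubectl top nodes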

Adding Horizontal Pod Autoscaling To Your Cluster

After a Kubernetes Cluster is ready you can add a Horizontal Pod Autoscaler (also referred to as an HPA) so that your Cluster adds and removes Pods as necessary based on resource metrics. Adding an HPA is really simple with the following snippet.

echo "sets autoscale logic"
kubectl autoscale deployment endpoints --cpu-percent=50 --min=1 --max=10
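If you prefer a declarative setup, the same HPA can be expressed as a manifest and applied with kubectl. Here is a minimal sketch, assuming your Deployment is named endpoints as in the command above; note that CPU-based autoscaling also needs CPU requests set on the Deployment’s containers so the utilization percentage has something to be measured against.

$ cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: endpoints
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: endpoints
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF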

Now that we know what an HPA is, we can focus on making our own and testing our Cluster on GCP.

Don’t Be This Guy

Test With Load

You can be one of two developers right now.

  1. You can push everything out and just trust that it works.
  2. You can verify that things are set up properly before you get in trouble for not verifying your work.

That’s right. We are going to be developer number 2 right now and test that our Kubernetes Cluster scales as it is supposed to when things start to hit the fan. To do this we will create a whole other Kubernetes Cluster that simulates load from a totally different GCP Zone. We are going to test this way because it adequately simulates load from another location while staying focused on code that we can actually affect, without introducing too many test contaminants.

Note: This solution using Locust is detailed in this post. I just put it into a nice script for you.

First, let’s get our environment running. You can either create the cluster with autoscaling enabled by default or run the two alternative commands to add autoscaling after the fact. If you are curious about the Cluster Scaler, I recommend checking out another post I put out on the Kubernetes Cluster Scaler.

$ git clone https://github.com/jonbcampos/kubernetes-series.git
$ cd ~/kubernetes-series/autoscaling/scripts
$ sh startup.sh # with autoscaling
$ # sh startup_wo_autoscaling.sh # without autoscaling
$ # sh add_autoscaling.sh # add autoscaling after creation
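Before moving on, it is worth a quick sanity check that the startup script left you with running Pods and an HPA attached to the Deployment. Something along these lines should do it (the deployment name endpoints matches the autoscale command shown earlier; adjust it if your setup differs):

# confirm the Pods are running
$ kubectl get pods
# confirm the HPA exists and is reporting a CPU target
$ kubectl get hpa
$ kubectl describe hpa endpoints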

Wow, those startup commands took care of a lot, didn’t they!? Now it is time to create our load-testing Kubernetes Cluster, build the Docker image that contains our runner tests, and finally deploy our load-testing code. To make this easy I put all of that into one script that you can dive into to see the magic.

You’ll notice that we added an argument to our script so that the load runner knows what address to test.

$ cd ~/kubernetes-series/autoscaling/scripts # if necessary
$ # the argument tells the load runner what address to load test
$ sh startup_load_runner.sh 100.101.102.103
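Not sure which address to hand the load runner? The external IP of the Service fronting your Pods is what you want. A hedged example, assuming a LoadBalancer Service named endpoints in the autoscaling cluster:

# look up the external IP of the (assumed) endpoints Service
$ kubectl get service endpoints -o jsonpath='{.status.loadBalancer.ingress[0].ip}'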

When this process is done you should see a prompt telling you where you can access your load-testing Cluster. It is time to go to that link and start your load runner.

Available At: [your cluster ip address]:8089

This is the fun part. In the Locust UI you can enter how many users you want to simulate and how quickly they ramp up, and then set one Kubernetes Cluster loose attacking the other.

As you are playing with the load runner you might want to add a watcher to your Pods. This way you can see when the load starts to get too great and the autoscaling begins.

# fetch credentials so kubectl points at the autoscaling cluster
$ gcloud container clusters get-credentials autoscaling-cluster --zone=us-central1-a
# show horizontal pod autoscaling details
$ watch kubectl get hpa # ctrl+c to stop
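Alongside the HPA view, it can be satisfying to watch the Pods themselves come and go. A simple companion command using nothing beyond kubectl:

# watch Pods get added and removed as the HPA reacts to load
$ watch kubectl get pods # ctrl+c to stop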

If you really want to strain your cluster, you might also want to add more workers with the following replication script I’ve set up for you.

$ cd ~/kubernetes-series/autoscaling/scripts # if necessary
$ sh scale_load_runner.sh X # <-- number of replicas to make
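Under the hood, scaling the load runner is just an ordinary kubectl scale on the Locust worker Deployment. A sketch of the equivalent command; the Deployment name locust-worker is an assumption, so check the load runner manifests in the repo for the real one:

# hypothetical equivalent of scale_load_runner.sh with 5 replicas
$ kubectl scale deployment locust-worker --replicas=5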

With this you’re effectively done! You’ve created a cluster, you’ve set the cluster to autoscale, and finally you’ve tested the scaling on your cluster by hitting it with a load runner. Seriously, it is amazing.

Bonus: If you want to enjoy watching the Pods go back down, you can clear out the load runner while still watching the HPA.

$ cd ~/kubernetes-series/autoscaling/scripts # if necessary
$ sh teardown_load_runner.sh

Extra Reading: If you are a bit more curious about Locust, I recommend looking at my other post where I give some more details about what it takes to edit Locust files.

Teardown

Before you leave, make sure to clean up your project so you aren’t charged for the VMs running your cluster. Return to the Cloud Shell and run the teardown script. This will delete your cluster and the containers that we’ve built.

$ cd ~/kubernetes-series/autoscaling/scripts # if necessary
$ sh teardown.sh
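If you want to double-check that nothing billable is left behind, listing the clusters in your project afterwards should come back empty. This is just a quick verification, not part of the teardown script itself:

# confirm no clusters remain in the project
$ gcloud container clusters list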

Closing

This post goes hand-in-hand with another post about the Cluster Autoscaler. If this piques your interest in scaling, I recommend heading that way next.

Other Posts In This Series

Jonathan Campos is an avid developer and fan of learning new things. I believe that we should always keep learning and growing and failing. I am always a supporter of the development community and always willing to help. So if you have questions or comments on this story please add them below. Connect with me on LinkedIn or Twitter and mention this story.
