Optimise Costs for Google Kubernetes Engine: Google Cloud Challenge Lab Walkthrough

Dazbo (Darren Lester)
Google Cloud - Community
9 min read · Jul 30, 2024

This is a walkthrough of the challenge lab from the course Optimise Costs for Google Kubernetes Engine. I’ll show you how to complete the lab, but also help explain concepts along the way.

In this lab we will optimise a Google Kubernetes Engine cluster, looking for ways to reduce costs whilst ensuring continued performance and availability.

The lab tests your ability to:

  • Create a GKE cluster.
  • Create namespaces.
  • Create a new node pool, and migrate workloads from an existing node pool.
  • Update a deployment.
  • Create a pod disruption budget (PDB).
  • Configure the horizontal pod autoscaler and cluster autoscaler.
  • Use the Locust open source load generator.

Intro to Challenge Labs

Google provides an online learning platform called Google Cloud Skills Boost, formerly known as Qwiklabs. On this platform, you can follow training courses aligned to learning paths, particular products, or particular solutions.

One type of learning experience on this platform is called a quest. This is where you complete a number of guided hands-on labs, and then finally complete a Challenge Lab. The challenge lab differs from the other labs in that goals are specified, but very little guidance on how to achieve the goals is given.

I occasionally create walkthroughs of these challenge labs. The goal is not to help you cheat your way through the challenge labs! But rather:

  • To show you what I believe to be an ideal route through the lab.
  • To help you with particular gotchas or blockers that are preventing you from completing the lab on your own.

If you’re looking for help with this challenge lab, then you’ve come to the right place. But I strongly urge you to work your way through the quest first, and to try the lab on your own, before reading further!

With all these labs, there are always many ways to go about solving the problem. I generally like to solve them using the Cloud Shell, since I can then document a more repeatable and programmatic approach. But of course, you can use the Cloud Console too.

An Overview of Autoscaling in GKE

In GKE, we have the concept of 4-way autoscaling: we can scale both workloads and infrastructure, and in each case we can scale horizontally (scaling out and in) and vertically (scaling up and down).

These are the components that are available for us for 4-way autoscaling:

4-way autoscaling (from Google’s documentation)

Workload Autoscaling

  • The horizontal pod autoscaler (HPA) adds or removes pods in response to metrics, such as CPU utilisation. It is the fastest way to autoscale and is a good way to respond to sudden spikes in demand.
  • The vertical pod autoscaler (VPA) adjusts the CPU and memory requests of newly created pods, to match observed demand. (See the manifest sketch after this list.)
  • We should set a pod disruption budget (PDB) to limit how many pods can be taken down at any one time during voluntary disruptions, and so minimise downtime.
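
As a quick illustration, here's a minimal sketch of a VPA manifest. This isn't required for the lab; the deployment name and update mode are placeholder assumptions, not values from the lab:

# Illustrative VPA sketch (not required by the lab)
# Assumes a Deployment named "frontend" exists in the current namespace
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: frontend-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  updatePolicy:
    updateMode: "Auto"   # let the VPA recreate pods with updated resource requests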

Infrastructure Autoscaling

Of course, there’s no point setting up workload autoscaling if we don’t have sufficient infrastructure for the scaled-out workloads to run on. And, on the flip side, we don’t want our infrastructure overprovisioned and unable to scale back.

  • The Cluster Autoscaler (CA) adds or removes nodes, as required. If additional pods can’t be scheduled onto existing nodes, the CA will create new nodes. Consequently, we should always run the CA alongside the HPA.
  • Node Autoprovisioning (NAP) automatically provisions new node pools, using compute instances that are sized to meet the demand of the workloads. Without NAP, the CA can only provision new nodes using existing node pools, and this can be wasteful. Note that NAP takes a relatively long time, compared to the CA. (See the command sketch after this list.)
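
For reference, node autoprovisioning can be enabled with a command along these lines. It isn't needed for this lab, and the resource ceilings shown are placeholder values:

# Illustrative sketch only; NAP is not required in this lab
# The CPU and memory ceilings below are placeholder values
gcloud container clusters update ${CLUSTER} \
  --enable-autoprovisioning \
  --min-cpu 1 --max-cpu 20 \
  --min-memory 1 --max-memory 64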

Mitigating Autoscaling Latency

If we’re worried that our spikes in demand might be very rapid and that we can’t autoscale our infrastructure fast enough, then there are strategies we can employ to mitigate the autoscaling latency. For example, we can use pause pods, which are low priority pods that can be replaced by higher priority pods. The pause pods will force the creation of additional nodes, if there is insufficient existing capacity. In this way, they help us to scale “ahead”.
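
To make this more concrete, here's a rough sketch of what a pause-pod setup could look like. None of this is needed for the lab; the names, priority value, replica count, image and resource requests are all illustrative assumptions:

# Illustrative pause-pod sketch (not part of the lab tasks)
# All names and values below are assumptions for illustration
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10                  # lower than the default pod priority of 0
globalDefault: false
description: "Low-priority class for placeholder (pause) pods"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
spec:
  replicas: 2
  selector:
    matchLabels:
      run: overprovisioning
  template:
    metadata:
      labels:
        run: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
      - name: reserve-resources
        image: registry.k8s.io/pause
        resources:
          requests:
            cpu: 500m
            memory: 512Mi

When real, higher-priority pods need the capacity, the scheduler evicts these placeholders, and the cluster autoscaler then adds nodes to reschedule them, keeping a buffer of spare capacity ready.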

My Solution to the Lab

Initial Setup

Let’s start by defining some variables we can use throughout this challenge. The actual values will be provided to you when you start the lab.

# Confirm we are authenticated as the lab user
gcloud auth list

PRJ=$DEVSHELL_PROJECT_ID
ZONE=<ENTER ZONE>
CLUSTER=<ENTER CLUSTER NAME>
MACH_TYPE=e2-standard-2
REL_CHANNEL=rapid
ENV_DEV=dev
ENV_PRD=prod

# Set the default zone, so we don't have to keep specifying it in our gcloud commands
gcloud config set compute/zone ${ZONE}

Task 1 — Create a Cluster and Deploy Your App

First we’re asked to create a GKE cluster, with the specified name. We’re told to:

  • Create a zonal cluster.
  • With only 2 nodes.

We’re also told to start with the e2-standard-2 machine type, which has 2 vCPUs and 8GB of RAM.

We’ll use the gcloud container clusters create command to create our cluster. Check the command reference here for specific syntax.

# Create the zonal cluster - this will take a few minutes
gcloud container clusters create ${CLUSTER} \
--num-nodes=2 \
--machine-type=${MACH_TYPE} \
--release-channel=${REL_CHANNEL} \
--enable-vertical-pod-autoscaling # just in case we need it!

Eventually, the cluster is ready to go:

Cluster created

Now we need to create two namespaces, for the dev and prod environments.

kubectl create namespace ${ENV_DEV}
kubectl create namespace ${ENV_PRD}

We’re given the command to download the Online Boutique application and deploy it to the dev namespace, by applying a pre-created manifest.

# Clone the application
git clone https://github.com/GoogleCloudPlatform/microservices-demo.git

# Deploy to the Dev namespace
cd microservices-demo
kubectl apply -f ./release/kubernetes-manifests.yaml --namespace dev

The result looks like this:

Application deployed

Let’s have a look at what we’ve deployed:

# Look at the nodes
kubectl get nodes

Our nodes

# Look at the namespaces
kubectl get namespace

Namespaces

# Set our namespace to dev, so we don't have to specify it with each kubectl command
kubectl config set-context --current --namespace=${ENV_DEV}

# Look at the pods
kubectl get pods

Our pods

# Inspect the deployment
kubectl get deployment

We can also take a look in the Cloud Console:

Viewing nodes in the Cloud Console

Task 2 — Migrate to an Optimised Node Pool

We’re now going to create a new node pool. The instructions show us that the current machines are wasteful of CPU and RAM, and hint that we can use a smaller machine type. We’re currently using about 2.7 CPUs of our available 4 CPUs, and about 2.55 GB of roughly 12 GB of allocatable memory.
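
If you want to check this utilisation yourself (optional; the metrics server is enabled by default on GKE), you can query it from the Cloud Shell:

# Show current CPU and memory usage per node
kubectl top nodes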

We’re supplied with a name for our new node pool, a custom machine type to use (custom-2-3584 gives us 2 vCPUs and 3584 MB of memory), and the number of nodes required.


DEFAULT_POOL=default-pool
NEW_POOL=<ENTER POOL NAME>
CUSTOM_TYPE=custom-2-3584

# Create the new node pool, using the custom machine type
gcloud container node-pools create ${NEW_POOL} \
--cluster=${CLUSTER} \
--machine-type=${CUSTOM_TYPE} \
--num-nodes=2

After a couple of minutes, the new node pool has been created:

New node pool created

Now we’re going to migrate our application to the new node pool. To do this, we have to cordon off the default-pool node pool, drain it, and migrate our workloads to the new node pool.

# Cordon off the existing node pool, to stop new pods being scheduled to it
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=${DEFAULT_POOL} -o=name); do
  kubectl cordon "$node";
done

# Drain the existing node pool, evicting its pods
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=${DEFAULT_POOL} -o=name); do
  kubectl drain --force --ignore-daemonsets --delete-emptydir-data --grace-period=10 "$node";
done

We can see the pods being evicted. They will automatically be scheduled onto the new node pool, since the old pool has been cordoned off.

Evicting pods from our nodes

# Delete the old node pool
gcloud container node-pools delete ${DEFAULT_POOL} \
  --cluster ${CLUSTER}

It takes a couple of minutes to delete the old node pool.

Node pool deleted
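
Before moving on, we can optionally confirm that every pod is now running on a node from the new pool:

# The NODE column should only show nodes from the new node pool
kubectl get pods -o wide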

Task 3 — Apply a Frontend Update

We need to do a rolling update of our frontend deployment, and we’re told we need to minimise disruption. To achieve this, we set a pod disruption budget, so that one pod must always be available.

PDB=onlineboutique-frontend-pdb

kubectl create poddisruptionbudget ${PDB} \
--selector=run=frontend --min-available 1
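
We can quickly check that the PDB exists:

# Verify the pod disruption budget
kubectl get poddisruptionbudget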

We’re told to edit our frontend to point to:

gcr.io/qwiklabs-resources/onlineboutique-frontend:v2.1

Let’s edit our file ./release/kubernetes-manifests.yaml and update our image. We can do this from the Cloud Shell Editor:

Edit kubernetes-manifests.yaml in the Cloud Shell Editor

Whilst you’re there, don’t forget to change the imagePullPolicy to Always, as requested in the instructions. (This forces the image to always be pulled from the registry, rather than relying on a cached copy.)
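
For reference, the relevant part of the frontend Deployment in kubernetes-manifests.yaml should end up looking roughly like this (the container name is as it appears in the Online Boutique manifests; other fields are omitted):

# Excerpt from the frontend Deployment spec after editing
    spec:
      containers:
        - name: server
          image: gcr.io/qwiklabs-resources/onlineboutique-frontend:v2.1
          imagePullPolicy: Always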

Now we can redeploy the manifests:

kubectl apply -f ./release/kubernetes-manifests.yaml --namespace dev

Okay, so far, so good!

Task 4 — Autoscale from Estimated Traffic

We need to plan for a large spike in traffic. We’re told to set up the horizontal pod autoscaler (HPA) as follows:

  • CPU target of 50%.
  • Between 1 and <specified> number of replicas.

MAX_REPLICAS=<ENTER VALUE>

# Setup the HPA, to maintain 50% CPU across all pods
kubectl autoscale deployment frontend \
--cpu-percent=50 --min=1 --max=${MAX_REPLICAS}

# check HPA status
kubectl get hpa
HPA configured

We also need to configure the cluster autoscaler, so that we can provision and destroy nodes as necessary, to accommodate our autoscaling pods. We’re told to set up the cluster autoscaler to scale between 1 and 6 nodes, inclusive.

# Enable cluster (node) autoscaling
gcloud beta container clusters update ${CLUSTER} \
--enable-autoscaling --min-nodes 1 --max-nodes 6

Now we’ll run a load test to simulate the traffic surge and test our load balancing. We need the external IP address of our frontend-external service:
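
# Get the external IP of the frontend-external service (shown in the EXTERNAL-IP column)
kubectl get service frontend-external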

Now we’ll generate some load using the Locust open source load testing framework. If you haven’t used it before, you should check it out. It’s very cool. You can pass the parameters on the command line (as we’ll do here), but it also has a very sophisticated web UI.

Load testing with Locust

Here, we’re given a command to load test with Locust:

FRONTEND_EXTERNAL_IP=<enter IP>

# Run Locust from inside the loadgenerator pod, simulating 8000 users
kubectl exec $(kubectl get pod --namespace=dev | grep 'loadgenerator' | cut -f1 -d ' ') \
  -it --namespace=dev \
  -- bash -c "export USERS=8000; locust --host=\"http://${FRONTEND_EXTERNAL_IP}\" --headless -u 8000 2>&1"
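
While the load test runs, it's worth watching the autoscalers react. In a separate Cloud Shell tab, for example:

# Watch the HPA scale out the frontend replicas
kubectl get hpa frontend --watch

# And (in another tab) watch the cluster autoscaler add nodes
kubectl get nodes --watch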

And that’s all we have to do! Lab done!


Before You Go

  • Please share this with anyone that you think will be interested. It might help them, and it really helps me!
  • Please give me claps! You know you can clap more than once, right?
  • Feel free to leave a comment 💬.
  • Follow and subscribe, so you don’t miss my content. Go to my Profile Page, and click on these icons:
Follow and Subscribe
