GKE Autopilot cost efficiency

Sam Gallagher
4 min read · Mar 2, 2023

What if I told you that GKE Autopilot is a more cost-effective solution than GKE Standard? You might tell me that I’m wrong; I certainly thought so. I wanted to check the math and give it some thought to see whether that was actually the case.

GKE Pricing System

I want to briefly touch on the pricing differences between GKE Autopilot vs. Standard to provide some background on this article.

GKE Standard is your classic managed Kubernetes platform. Google manages the control plane, which is invisible to operators, and the operator provisions Google Compute Engine instances to run the Kubernetes workloads.

GKE Autopilot is a little different. Google still manages the control plane, but operators define resource requirements (think CPU, memory, disk) for each pod instead of defining a Compute Engine instance. This makes it a “serverless” Kubernetes implementation, as operators do not need to manage any VMs.
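To make “resource requirements for each pod” concrete, here is a minimal sketch of the per-pod resource requests that Autopilot bills on, expressed as a plain Python mapping mirroring a pod spec’s resources.requests block. The specific values are placeholders chosen for illustration, not figures from this article.

```python
# Minimal sketch of the per-pod resource requests that Autopilot bills on.
# The values are illustrative placeholders, not a recommendation.
pod_resource_requests = {
    "cpu": "500m",               # half a vCPU requested by the container
    "memory": "2Gi",             # 2 GiB of memory requested
    "ephemeral-storage": "1Gi",  # local disk requested by the container
}
```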

Cost Differences

On paper, GKE Autopilot is without a doubt more expensive than GKE Standard. A simplistic spreadsheet comparison of the two platforms shows Autopilot coming in at roughly 191% of the Standard price.

This directly compares Compute Engine instance types to their Autopilot resource equivalents. As an example, an e2-standard-2 instance has 2 vCPU and 8GB of memory, so it is compared with the price of 2 vCPU and 8GB of memory on Autopilot.

Autopilot costs roughly 191% of the matching Compute Engine instances
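As a quick sanity check on that ratio, here is a back-of-the-envelope calculation using the daily figures quoted later in this article ($1.86 per day for an e2-standard-2 node, $0.89 per day for a 0.5 vCPU / 2GB pod on Autopilot). The assumption that the Autopilot price scales linearly up to 2 vCPU / 8GB is mine.

```python
# Rough comparison of an e2-standard-2 node (2 vCPU, 8GB) with the
# equivalent resources on Autopilot, using this article's daily figures.
# Linear scaling of the Autopilot price is an assumption.
E2_STANDARD_2_PER_DAY = 1.86           # USD/day for one e2-standard-2 node
AUTOPILOT_HALF_CPU_2GB_PER_DAY = 0.89  # USD/day for 0.5 vCPU + 2GB on Autopilot

autopilot_2cpu_8gb_per_day = AUTOPILOT_HALF_CPU_2GB_PER_DAY * 4  # 2 vCPU, 8GB
ratio = autopilot_2cpu_8gb_per_day / E2_STANDARD_2_PER_DAY

print(f"Autopilot equivalent: ${autopilot_2cpu_8gb_per_day:.2f}/day")
print(f"Autopilot / Standard: {ratio:.0%}")  # roughly 191%
```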

That’s a big difference. Big enough to make us shy away from Autopilot in hopes that GCP eventually lowers the price. I mean, it’s one of those new fancy “serverless” options that doesn’t need to exist. Let’s just continue to manage our node pools, right? Maybe not.

This type of comparison isn’t actually helpful. Instead, we need to look at a real-world example of a GKE deployment and calculate the difference.

Simplistic Workload Example

Let’s create a deployment scenario and compare the cost of running it in a highly-available GKE Standard cluster vs. running it on GKE Autopilot.

This workload will consist of a single application that we want to run as 3 pods for high availability. The application requires minimal resources: 0.5 vCPU and 2GB of memory per pod.
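Here is a sketch of what that deployment might look like, written with the official kubernetes Python client. The application name and image are placeholders, and I’ve omitted the topology spread constraint or anti-affinity rule you would add to spread the replicas across zones on a Standard cluster.

```python
from kubernetes import client

# Sketch of the example workload: 3 replicas, each requesting 0.5 vCPU and 2GB.
# "my-app" and its image are placeholders.
container = client.V1Container(
    name="my-app",
    image="example.com/my-app:latest",
    resources=client.V1ResourceRequirements(
        requests={"cpu": "500m", "memory": "2Gi"},
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="my-app"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # one pod per zone for high availability
        selector=client.V1LabelSelector(match_labels={"app": "my-app"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "my-app"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)
```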

GKE Standard

The GKE Standard cluster will need to consist of 3 e2-standard-2 nodes, each in a different zone. The Kubernetes control plane will schedule one pod on each node to maximize uptime.

Based on the resource requirements we have defined in this scenario, each pod will use 25% of its node’s vCPU and memory.

Because we are using e2-standard-2 nodes, standard Compute Engine pricing applies and each node will cost $1.86 per day. This comes to a total of $5.58 per day for all 3 nodes.

GKE Autopilot

The GKE Autopilot cluster will deploy the 3 pods with the requested vCPU and memory allocated per pod, and Google will provide the nodes for them to run on.

With Autopilot pricing, we are charged per pod based on its resource requests. For 0.5 vCPU and 2GB of memory, this price is $0.89 per day. This comes to a total of $2.67 per day for all 3 pods.
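Putting both sides of the comparison in one place, here is a short calculation using the daily prices quoted above.

```python
# Daily cost of the example workload on each platform, using the
# figures quoted in this article.
NODE_PER_DAY = 1.86  # USD/day, one e2-standard-2 node
POD_PER_DAY = 0.89   # USD/day, one 0.5 vCPU / 2GB pod on Autopilot

standard_total = 3 * NODE_PER_DAY   # 3 nodes -> $5.58/day
autopilot_total = 3 * POD_PER_DAY   # 3 pods  -> $2.67/day

print(f"Standard:  ${standard_total:.2f}/day")
print(f"Autopilot: ${autopilot_total:.2f}/day")
print(f"Autopilot is {autopilot_total / standard_total:.0%} of the Standard cost")
```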

Verdict

In this simplistic scenario, Autopilot costs only 48% of what the Standard cluster costs. While this isn’t the best example, it does show that Autopilot’s cost efficiency is tied to the shape of the workload we are deploying.

“Shape” here refers to the amount of resources requested by each pod and the number of nodes the pods are scheduled across.

Break-even point

The next natural question is: “What percentage of standard node resources need to be consumed by my workload to make it more cost-effective than Autopilot?”.

The quick answer is 53.5% CPU and 50% Memory.

As node utilization increases, the cost efficiency of Autopilot gets worse

Digging into the above graph: we are charting the cost of running pods on a single node, and seeing how the cost of Autopilot (orange) vs. a Standard node (yellow) behaves as the utilization (blue) of that node increases.

The cost of the Standard node always remains constant: we pay $1.86 per day no matter what workloads we have scheduled on the node.

The cost of Autopilot grows as we increase the resource requirements of the workload. It starts out lower than the cost of the Standard node, but beyond roughly 50% node utilization it becomes more expensive.
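Here is a small sketch of those two curves using the same daily figures as before, treating CPU and memory utilization together (the 53.5% / 50% break-even figures above treat them separately). Under this simplification the crossover lands at roughly 52%.

```python
# Cost of one e2-standard-2 node vs. the Autopilot price for the same share
# of its resources (2 vCPU, 8GB) as node utilization grows. Treating CPU and
# memory together is a simplification of the article's separate break-evens.
NODE_PER_DAY = 1.86                     # flat daily cost of the Standard node
AUTOPILOT_FULL_NODE_PER_DAY = 0.89 * 4  # Autopilot price for 2 vCPU + 8GB

for utilization in (0.25, 0.50, 0.55, 0.75):
    autopilot_cost = AUTOPILOT_FULL_NODE_PER_DAY * utilization
    cheaper = "Autopilot" if autopilot_cost < NODE_PER_DAY else "Standard"
    print(f"{utilization:.0%} utilized: Autopilot ${autopilot_cost:.2f}/day "
          f"vs Standard ${NODE_PER_DAY:.2f}/day -> {cheaper} is cheaper")

break_even = NODE_PER_DAY / AUTOPILOT_FULL_NODE_PER_DAY
print(f"Break-even utilization: {break_even:.1%}")  # roughly 52%
```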

In Practice

Kubernetes best practice states that each node should have spare resources available to handle pods that fail over from an unresponsive node. This gives Kubernetes the ability to move pods from one node to another if there are issues, and gives the operator peace of mind that their pods will always (or almost always) be available.

With this in mind, it’s not typical to see a node sitting at 50% utilization or higher, as this free space needs to be built into the resource plan to provide space for a fail-over.

In GKE Autopilot we do not have this constraint. If one of Google's managed nodes were to fail, Google would automatically reschedule the pod to another available managed node.

With this philosophy in mind, I can fairly reasonably say that, in certain circumstances, GKE Autopilot will be more cost-effective than GKE Standard.
