Best Practices for Understanding Kubernetes Costs

Eric Knauer
The Spot to be in the Cloud
10 min read · Sep 1, 2022

Kubernetes is the go-to orchestration platform for running large-scale systems of containerized workloads. However, understanding your Kubernetes costs — particularly in a cloud environment — is complex.

Kubernetes does not directly expose the cost of the infrastructure it utilizes but instead exposes the various resources it allocates and uses. Gathering and tracking the costs mapped to these resources is a time-consuming extra step, often delegated to a DevOps engineer (who has many other things to do). In addition, it’s not only the DevOps team that needs this cost information. Providing useful and actionable cost information to teams outside of DevOps makes the task even more burdensome.

Before you can optimize your Kubernetes costs, you’ll need a solid understanding of where those costs come from. You can also take some concrete steps to organize your Kubernetes setup to make cost visibility easier for you and your organization.

In this article, we’ll lay out the relationship in Kubernetes between costs and resource usage. We’ll look at the main areas of Kubernetes that incur costs, examining where those costs come from. Then, we’ll tie it all together with a list of best practices to put in place for ensuring cost visibility.

Let’s start by looking at the key areas in Kubernetes that incur costs.

Understanding cluster costs

Even an empty Kubernetes cluster has costs. The control plane typically runs on multiple nodes for high availability and manages an etcd data store for tracking the cluster’s desired state. Google Kubernetes Engine (GKE) and Amazon’s Elastic Kubernetes Service (EKS) charge $0.10 per hour per cluster, which comes to about $73 per month. Azure Kubernetes Service (AKS) currently doesn’t charge separately for the control plane. If you are not using a managed Kubernetes service, then you are responsible for the costs of the control plane nodes yourself.

The control plane cost is fixed and reasonable unless your development process involves creating many clusters (for example, for testing) and leaving them running for extended periods of time.
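The arithmetic is simple enough to sketch. Here is a minimal Python calculator using the $0.10/hour GKE/EKS fee cited above (hours per month is an average; the exact bill varies with month length):

```python
# Control-plane fee for managed clusters, using the GKE/EKS rate cited above.
HOURLY_FEE = 0.10      # USD per cluster per hour
HOURS_PER_MONTH = 730  # average hours in a month (365 * 24 / 12)

def control_plane_cost(num_clusters: int) -> float:
    """Approximate monthly control-plane fee for a fleet of managed clusters."""
    return num_clusters * HOURLY_FEE * HOURS_PER_MONTH
```

One cluster costs roughly $73 per month, but twenty forgotten test clusters cost about $1,460 per month, which is exactly the long-running-cluster trap described above.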

Understanding compute costs

Compute costs are the costs of the worker nodes of your cluster. This is by far the most complicated and dominant aspect of your cloud costs.

Node costs

There are multiple factors that impact the cost of a running node.

Instance types

Cloud providers offer many instance types with different profiles based on the ratio of performance to price. You need to understand the requirements of your workloads to choose the correct instance types you will use for different workloads. Cloud providers frequently upgrade and update their instance types. This is not a one-time decision. You need to monitor the offerings and be ready to move to better instance types when they become available.

On-demand instances

On-demand (or pay-as-you-go) instances are the typical choice for workloads that need stable, continuously running nodes. Note that Kubernetes is designed for environments in which nodes come and go, and it will automatically move your pods to other nodes if a node becomes unavailable or unhealthy. On-demand nodes come in a variety of sizes for each instance type, and the price per unit of CPU or memory is the same: different sizes are simply multiples of the base unit. These nodes keep running until you shut them down.

Reserved instances

If you commit to purchasing a certain number of instances for a long term (one to three years), you can receive significantly lower prices (60%-80%). One caveat is that you may be stuck with large quantities of outdated instance types. Make sure you can change the instance types of your reserved instances.

Spot/preemptible instances

Many stateless workloads can operate within their SLA even if their nodes disappear every now and then. This is a great opportunity to benefit from spot or preemptible instances. These instances have lower priority than on-demand instances and may be evicted or reclaimed at any time when the cloud provider needs the capacity back. In exchange, this type of instance is significantly less expensive (60%-90% off).
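The savings compound quickly over a month. A rough sketch, where the hourly price and discount are hypothetical figures for illustration:

```python
def spot_savings(on_demand_hourly: float, discount: float, hours: float) -> float:
    """Dollars saved by running a node on spot instead of on-demand."""
    return on_demand_hourly * discount * hours

# A node at a hypothetical $0.20/hour with a 70% spot discount,
# running a full month (730 h), saves about $102 per node.
```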

Considering discounts

Cloud providers offer various discounts. It’s important to understand how these discounts will impact your costs when planning your cloud spend strategy. If your organization plans to spend millions of dollars in the cloud, then you should be able to negotiate special terms and discounts.

Region pricing

The cloud is not completely consistent and uniform. Different geographic regions may have significant differences in pricing. For example, instances in the GKE us-west2 region are 14%-20% more expensive than those in the us-central1 region.

Node DaemonSets

Kubernetes itself needs to run some software components on each node. These can be deployed as a native Kubernetes object, the DaemonSet. Whenever a new node joins the cluster, a pod from each DaemonSet is scheduled on it. This means that a certain portion of every node’s resources is always taken by DaemonSets before your workloads can utilize it.
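This overhead matters much more on small nodes than large ones. A quick sketch, where the per-node agent CPU figure is hypothetical:

```python
def daemonset_overhead(agents_cpu: float, node_cpu: float) -> float:
    """Fraction of a node's CPU consumed by per-node agents (DaemonSets)."""
    return agents_cpu / node_cpu

# 0.5 CPU of per-node agents is 25% of a 2-CPU node,
# but only about 1.6% of a 32-CPU node.
```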

Autoscaling

The most prominent aspect of cloud computing is its elasticity. From a user perspective, the cloud has infinite capacity, and you pay for what you request. Kubernetes supports pod autoscaling, which means that if a workload operates at capacity, Kubernetes can create more pods to handle the demand. This requires that the cluster has nodes with available resources to schedule the new pods. If there are no suitable nodes, then the open-source Cluster Autoscaler can add nodes to your cluster.

Of course, autoscaling incurs costs, and you should make sure that you understand the autoscaling behavior of your workloads. You can configure autoscaling at the node pool level, which is a pool of nodes of the same instance type. You can configure the minimum and maximum number of instances. The minimum number of nodes will be always allocated, and you pay for them even if they are empty. The maximum number of nodes will not be exceeded, which is a good way to prevent runaway costs.
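For reference, pod-level autoscaling is configured with a HorizontalPodAutoscaler; the names and numbers below are hypothetical:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                   # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                     # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10                 # caps pod count, as the node pool max caps node count
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # scale out above 70% average CPU
```

Together, the maxReplicas here and the node pool maximum act as your guardrails against runaway costs.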

Bin packing

Bin packing is the practice of packing pods into nodes. If your nodes are half-empty, then you are paying for resources that you don’t need. There are different mechanisms that can lead to inefficient bin-packing, such as misconfigured scheduling criteria or autoscaling minimums.

Utilization

Utilization is the ratio of resources in use to the total amount of requested resources. For example, consider a pod that requests four CPU cores while only using two at runtime. Bin packing can be 100% efficient, with Kubernetes placing the pod on a node with four CPU cores; however, you pay for four CPUs when you need only two. If the pod requested only two CPUs, then you could fit two such pods into each four-CPU node and save 50%.
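In code, the relationship is just a ratio. Using the pod from the example:

```python
def utilization(used: float, requested: float) -> float:
    """Fraction of requested resources actually in use."""
    return used / requested

def rightsizing_headroom(used: float, requested: float) -> float:
    """Fraction of spend wasted by over-requesting."""
    return 1 - utilization(used, requested)

# Requesting 4 CPUs while using 2 means 50% utilization:
# half of what you pay for this pod is headroom you could reclaim.
```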

Node-workload shape

Node-workload shape is the ratio between CPU and memory requests. For example, if a node has two CPUs and 40 GiB of memory, then it can fit exactly one pod that requests two CPUs and 10 GiB of memory. In this case, there is a misalignment in shape, and you pay for 30 GiB of unused memory. There are different strategies to minimize the impact of shape misalignment, from multi-tenancy to dedicated node pools.
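The stranded memory in that example can be computed directly. A sketch that assumes identical pods packed onto one node:

```python
def stranded_memory(node_cpu: int, node_mem_gib: int,
                    pod_cpu: int, pod_mem_gib: int) -> int:
    """GiB of memory left unusable once CPU (or memory) runs out
    when packing identical pods onto a node."""
    pods_that_fit = min(node_cpu // pod_cpu, node_mem_gib // pod_mem_gib)
    return node_mem_gib - pods_that_fit * pod_mem_gib

# A 2-CPU / 40 GiB node fits one 2-CPU / 10 GiB pod: 30 GiB stranded.
```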

Understanding networking costs

Large-scale systems are often deployed across multiple Kubernetes clusters, geographical regions, and availability zones (AZ). The workloads running in these various locations often need to communicate with one another and to other systems, as well as expose APIs and access data. There are multiple types of costs associated with cloud networking.

Cross-AZ and cross-region costs

There are different ways to connect networks. Transferring data within the same data center (that is, in the same AZ) is typically free, but cross-AZ and cross-region traffic within the same cloud will cost you. Traffic between different regions can have different costs. It’s important to understand these costs in order to design the network topology of your system.

Ingress costs in the cloud

Kubernetes clusters in the cloud interact with other systems, including Kubernetes clusters on other clouds as well as end users. Ingress (getting data into the cloud) is free because cloud providers prefer that you manage your data in their cloud. However, there are costs associated with network components that process ingress traffic. These components include the NAT gateway, VPN gateways, and load balancers.

Egress costs in the cloud

Egress traffic (sending data out of the cloud) will cost you a lot. This is by design. More recently, cloud providers have been pressured to pass some savings on to customers: AWS announced that the first 100 GB of egress per month is free. This may be significant for smaller companies, but for organizations that use the cloud at enterprise scale, it makes little difference.
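To see why the free tier matters little at scale, here is an illustrative calculation. The 100 GB free tier is from the announcement above; the $0.09/GB rate is an assumed list price for illustration, not an official quote:

```python
FREE_TIER_GB = 100    # monthly free egress (per the AWS announcement)
RATE_PER_GB = 0.09    # assumed rate for illustration; check current pricing

def egress_cost(gb_out: float) -> float:
    """Monthly egress bill after the free tier."""
    return max(0.0, gb_out - FREE_TIER_GB) * RATE_PER_GB

# 80 GB/month: $0. 10 TB/month: ~$891, with the free tier shaving off only $9.
```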

There are different solutions with different pricing profiles to consider for getting large amounts of data out of the cloud, from CDNs to physical storage appliances.

Service mesh costs

The major networking cost drivers are typically cross-region and egress traffic. However, if you use a service mesh for connectivity within your Kubernetes cluster or among a set of clusters, then there are costs you should be mindful of.

Service meshes usually attach a sidecar container to every pod, and each sidecar knows about all the other pods in the mesh. Each sidecar requires CPU and memory, which drives up your costs: memory to store the global mesh configuration (the endpoints of every other sidecar) so it knows where to send traffic, and CPU to process updates whenever pods come and go. The resources needed to handle these updates, especially in large meshes, are significant and add further cost. Note that for small pods that use little CPU and memory, the sidecar’s resource usage can be the dominant share.

Network policy costs

Kubernetes has a lot of optional components that an administrator can choose to enable or disable. When network policies are enabled, the cluster’s network plugin runs an agent on every node (deployed via a DaemonSet) that monitors network traffic to enforce policies at the network level. Each agent, therefore, requires resources (CPU and memory), and the additional cost is proportional to the cluster size. In large clusters with many small nodes, the network policy agents can consume a significant portion of the node resources, leaving fewer resources available to your workloads.

Understanding storage costs

Storage is another core piece of infrastructure. Kubernetes provides abstraction layers in the form of storage classes, persistent volumes (PV), and persistent volume claims (PVC). However, as far as costs go, there isn’t much that impacts the cost at the Kubernetes level.

Kubernetes stores its own state in etcd. Developers may be tempted to use etcd to store some of their application data alongside the cluster state, but this is a dangerous anti-pattern. When provisioning nodes, you should also consider the size and type of the local storage (local spinning disks, local SSDs, and so on).

In general, storage is relatively inexpensive when compared to compute or networking. You should consider the spectrum of storage options and find the mix that works for you from expensive in-memory caches all the way to inexpensive slow archives.

Best practices: concrete steps

Understanding and managing costs on Kubernetes is not trivial. First, Kubernetes doesn’t directly capture the costs of the system. Second, Kubernetes is designed to support autoscaling and auto-healing dynamic systems, which makes it difficult to control costs. Let’s look at some best practices that can help.

Cost allocation and namespaces mapped to teams

Ensure you can associate costs with specific teams. This is critical for cost visibility and for working with teams to manage the costs they are responsible for.

Kubernetes supports namespaces that contain resources. Administrators can control resource usage in a cluster by defining resource quotas on a per-namespace basis. By assigning teams to namespaces, it’s easy to map costs per namespaces to the corresponding team. If namespaces alone are not sufficient, then use labels in conjunction.
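As a sketch, a per-namespace quota looks like the following; the namespace name and amounts are hypothetical:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-payments-quota
  namespace: team-payments   # one namespace per team
spec:
  hard:
    requests.cpu: "20"       # total CPU the team's pods may request
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
```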

Resource labeling

Every Kubernetes resource has a metadata section that includes labels, which are key-value pairs. Assign labels for accounting purposes to keep track of changes in costs that are related to changes in resources.
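For example, accounting labels on a resource’s metadata might look like this; the keys and values are illustrative, and the important part is picking a scheme and applying it consistently everywhere:

```yaml
metadata:
  labels:
    team: payments          # maps spend to an owner
    environment: production # separates prod from dev/test costs
    cost-center: cc-1234    # ties resources to a billing code
```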

Resource requests and limits

Containers in pods can specify requests and limits for resources such as CPU and memory. A pod is scheduled on a node only if the total of its containers’ requests can be satisfied, and those requested resources are then reserved for that pod.

The limit is a cap on the quantity of resources a container may consume. If a container exceeds its CPU limit, it is throttled. If it exceeds its memory limit, it may be terminated by the out-of-memory (OOM) killer.
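A minimal example of a container declaring both; the pod name, image, and amounts are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api                       # hypothetical name
spec:
  containers:
    - name: api
      image: example.com/api:1.0  # hypothetical image
      resources:
        requests:
          cpu: 500m               # reserved at scheduling time
          memory: 256Mi
        limits:
          cpu: "1"                # throttled beyond this
          memory: 512Mi           # OOM-killed beyond this
```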

Utilize cloud provider cost tracking and reporting

Although Kubernetes itself is not cost-aware, cloud providers offer excellent cost tracking and reporting facilities that allow you to slice and dice your cost data. When you practice proper resource labeling, you can match changes in Kubernetes resources to cost at a very granular level (for example, by region, environment, team, day of the week, or even time of day).

Efficiency calculation

Calculate the efficiency of the resources you use. Empty, idle, or underutilized nodes and other unused resources are low-hanging fruit that you can detect and eliminate.
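As a sketch, flagging underutilized nodes takes only a ratio of requested to allocatable resources; the fleet below is made up for illustration:

```python
def node_efficiency(requested_cpu: float, allocatable_cpu: float) -> float:
    """Share of a node's allocatable CPU actually requested by pods."""
    return requested_cpu / allocatable_cpu

# (requested CPU, allocatable CPU) per node; hypothetical values.
nodes = {"node-a": (7.5, 8), "node-b": (0.2, 8)}

# Nodes under 10% efficiency are candidates for draining and removal.
idle = [name for name, (req, alloc) in nodes.items()
        if node_efficiency(req, alloc) < 0.10]
```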

Anomaly detection

Kubernetes makes it easy to scale your resources. Make sure to have proper controls in the form of resource quotas, limits, and alerts that notify you before you run into a wall. This is important both for general operation as well as for preventing spikes in cost.

Infrastructure as code

Among the many benefits of infrastructure as code is the window of time you get to review infrastructure changes before they take effect. Make cost-awareness part of the review process to catch unjustified infrastructure changes early, before they significantly increase your costs.

Governance and audit

Carefully consider who can make changes to your infrastructure that can impact the bottom line. Minimizing access to modify infrastructure will help you to avoid runaway costs as well as narrow down the culprit when investigating unexpected changes. Make sure to enable and tune Kubernetes auditing. Your cloud provider may provide additional tooling to make this easier, or even have it configured by default.

Conclusion

Kubernetes is a sophisticated orchestration platform. It is very suitable for large-scale distributed systems, possibly across multiple cloud providers, regions, and clusters. It is not simple to understand the considerable costs that such systems accrue. It takes expertise, experience, and attention to detail to make correct tradeoffs and ensure you don’t leave a lot of money on the table.
