Kubernetes Patterns, like Design Patterns, abstract Kubernetes primitives into repeatable solutions to common problems. See the previous post for a detailed introduction to Kubernetes Patterns.
What is Predictable Demands Pattern?
The Predictable Demands pattern is about how an application's requirements should be declared.
The requirements of a container running in Kubernetes mostly consist of runtime dependencies (like file storage) and resource profiles (CPU, memory, etc.).
File storage is one of the typical runtime dependencies of applications for saving states. In a container, the file system is not persistent and the files are lost when the container is shut down or restarted.
Compute resources in the context of Kubernetes are defined as something that can be requested by, allocated to, and consumed from a container.
There are two categories of resources:
- Compressible resources, such as CPU: a container consuming too much of a compressible resource can be throttled or degraded, which impacts the performance of the application but not its survival.
- Incompressible resources, such as memory: a container consuming too much of an incompressible resource must be killed. For example, if a container tries to use more memory than its limit, it gets OOM-killed.
Why Use It?
As a developer of cloud-native applications, it is important to understand the runtime requirements of the container running your application.
Leverage Kubernetes Orchestration Feature
Well-defined runtime dependencies and declared resource demands let Kubernetes decide wisely where to place a container in the cluster for the most efficient hardware utilization.
Capacity planning is another reason: it is based on the resource profiles of containers, and it is important for managing and organizing Kubernetes clusters efficiently.
How to Use it?
How do you apply the Predictable Demands pattern in a Kubernetes environment? Runtime dependencies and resource profiles are declared in slightly different ways.
As mentioned before, the life cycle of the container filesystem is the same as that of the container itself. However, Kubernetes provides Pod-level storage: volumes, which share the lifecycle of the Pod.
Keep in mind that a Pod can include multiple containers, which will be introduced in other Kubernetes Patterns, for example the Init Container pattern and the Sidecar pattern.
The most straightforward type of volume is emptyDir, which lives as long as the Pod lives. Even if a container inside the Pod crashes or restarts, the data in an emptyDir is safe.
Note: A container crashing does not remove a Pod from a node.
Here is an example of how to use emptyDir in a Pod definition.
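A minimal sketch of a Pod with an emptyDir volume (the Pod, container, and volume names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-demo
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: cache-volume
      mountPath: /cache      # the emptyDir is mounted here inside the container
  volumes:
  - name: cache-volume
    emptyDir: {}             # created when the Pod is assigned to a node, deleted with the Pod
```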
emptyDir can be used for caching data or sharing data between containers inside a Pod.
If you want a volume with a longer lifecycle (maybe longer than Pods), PersistentVolume is the one in Kubernetes.
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
This is an example of PersistentVolumeClaim (PVC).
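A sketch of a PVC requesting storage from a StorageClass (the claim name, class name, and size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-claim
spec:
  accessModes:
    - ReadWriteOnce          # the volume can be mounted read-write by a single node
  storageClassName: standard # must match a StorageClass available in the cluster
  resources:
    requests:
      storage: 8Gi           # minimum capacity the claim needs
```

A Pod then references the claim by name in its `volumes` section, and Kubernetes binds the claim to a matching PersistentVolume.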
The scheduler evaluates the kind of volume a Pod requires, which affects where the Pod gets placed. If the Pod needs a volume that is not provided by any node on the cluster, the Pod is not scheduled at all.
ConfigMap is another kind of runtime dependency: it holds configuration data for the application.
It will be well explained in the Kubernetes pattern of Configuration Resource.
Secret is similar to ConfigMap but provides a more secure way to store configuration.
The usage of a Secret is essentially the same as that of a ConfigMap.
Compute resources are also a critical concern in Kubernetes; they can be requested by, allocated to, and consumed from a container. As the application developer, you have to specify the minimum amount of resources a container needs (the requests) and the maximum amount it may grow to (the limits).
The most common resources to specify are CPU and memory (RAM).
- CPU represents compute processing and is specified in units of Kubernetes CPUs: 1 means "one vCPU/core", and 0.1 is equal to 100m, which means "one hundred millicpu" or "one hundred millicores".
- Memory is specified in units of bytes, such as 65536, 64K, or 64Ki.
The Kubernetes scheduler uses the specified request amounts to decide which node to place the Pod on.
The kubelet enforces the limits of a container: the container is not allowed to use more resources than its limits.
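A sketch of a Pod specifying both requests and limits (the names and amounts are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: 250m          # scheduler places the Pod on a node with this much free CPU
        memory: 64Mi
      limits:
        cpu: 500m          # CPU is compressible: usage beyond this is throttled
        memory: 128Mi      # memory is incompressible: the container is killed beyond this
```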
A Pod's containers declare requests and limits for compute resources (CPU and memory), and Kubernetes assigns the Pod one of several Quality of Service (QoS) classes based on those settings.
- Best-Effort: no requests or limits are set for any container in the Pod. The Pod has the lowest priority, which means it is killed first when the node runs out of incompressible resources.
- Burstable: requests and limits are set but are not equal for the containers in the Pod. Such a Pod is likely to be killed when no Best-Effort Pods remain.
- Guaranteed: every container has equal requests and limits, which gives the Pod the highest priority compared with Best-Effort and Burstable Pods.
The QoS class of a Pod is handled by the kubelet and is different from the Pod priority described below.
Pod priority, expressed through a PriorityClass, became generally available in Kubernetes v1.14. It indicates the weight of a Pod relative to other Pods and affects the order in which Pods are scheduled.
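A sketch of a PriorityClass and a Pod referencing it (the names and value are illustrative):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000                  # higher value means higher scheduling priority
globalDefault: false         # do not apply to Pods without a priorityClassName
description: "For latency-critical workloads"
---
apiVersion: v1
kind: Pod
metadata:
  name: important-app
spec:
  priorityClassName: high-priority   # resolves to priority value 1000
  containers:
  - name: app
    image: nginx
```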
When a new Pod is created with a priorityClassName, the priority value of that class is populated into the Pod. If multiple Pods are waiting to be scheduled, the Pods with the highest priority value are placed first.
Pod Priority vs. Pod QoS
- The kubelet on each node respects Pod QoS first and then the priority value when evicting Pods.
- The Kubernetes scheduler only cares about Pod priority values.
When to Use?
As a foundational pattern, it is important for a cloud-native application to comply with it at all times by identifying and declaring its resource requirements and runtime dependencies, which makes the application a good cloud-native citizen.
We have seen how a container or a Pod constrains its own resource usage with a resource profile. But as a Kubernetes administrator, how do you control Pods that have no requests or limits set?
There are two ways to apply resource-usage constraints to all the Pods or containers in a namespace:
- ResourceQuota limits the total usage of compute resources across a namespace.
- LimitRange sets default, minimum, and maximum usage per container or Pod within a namespace.
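A sketch of both mechanisms applied to one namespace (the namespace name and amounts are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"        # sum of all CPU requests in team-a may not exceed 4 cores
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
---
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: team-a
spec:
  limits:
  - type: Container
    defaultRequest:          # applied as requests when a container specifies none
      cpu: 100m
      memory: 128Mi
    default:                 # applied as limits when a container specifies none
      cpu: 500m
      memory: 256Mi
```

With a LimitRange in place, Pods that declare no resource profile still get sensible defaults instead of landing in the Best-Effort QoS class.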
Capacity planning is much more important in production Kubernetes clusters because we always want to keep the production environment as stable as possible.
Therefore, we prefer most containers in the production environment to be Guaranteed and some to be Burstable. If a container gets killed, that is most likely a sign that the capacity of the cluster should be increased.
In my working practice, the Cloud Infra team requires all application owners or teams to provide the requests and limits of the applications deployed in our Kubernetes clusters.
Some applications need extremely high CPU or memory at runtime; for example, some computing jobs are CPU-intensive while some caching jobs consume more memory.
Our strategy is to provide special Nodes with either more powerful CPUs or a larger amount of memory for them. Guaranteeing that those applications are allocated to the right Node is a different story, which is elaborated in the Kubernetes Pattern: Automated Placement. Please stay tuned for the following posts.