Does your application kill your AWS EKS worker nodes? Set Node Allocatable

Tarun Prakash
Jul 30 · 3 min read
EKS cluster — Node crash due to overcommitment

Introduction

This article assumes that you have a basic understanding of the AWS EKS service, or Kubernetes in general: what it is, why we need it, and so on. If you are new to Kubernetes or EKS, you can refer to the official AWS documentation here.

In this blog, we are going to discuss the EKS worker node stability issue, the factors that affect node stability, and how they can be addressed by correctly configuring Node Allocatable in the worker node configuration. While the configuration shown here is specific to EKS worker nodes, Node Allocatable applies to all Kubernetes clusters in general.

Problem statement

By default, Kubernetes nodes can be scheduled up to their full capacity, and pods can consume all the available resources on a node. This is an issue because nodes typically run quite a few system daemons that power the OS and Kubernetes itself. Unless resources are set aside for these system daemons, pods compete with them for resources, leading to resource starvation. This usually happens when a worker node is running close to capacity, which can potentially cause the worker node to disjoin the cluster.
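If you suspect this is happening on your cluster, a quick check (a minimal sketch; <NODE_NAME> is a placeholder) is to look for NotReady nodes and for memory or disk pressure conditions on a node:

kubectl get nodes
kubectl describe node <NODE_NAME> | grep -i pressure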

Set Node Allocatables

We need a way to make scheduling more reliable and minimize node resource overcommitment. Fortunately, the kubelet (a service that runs on each Kubernetes worker node) exposes a feature named Node Allocatable. Allocatable on a Kubernetes node is defined as the amount of compute resources (memory, CPU, ephemeral-storage) available for pods, after reserving resources for the system's services and for the services that power the Kubernetes cluster itself. In short, the scheduler treats Allocatable as the available capacity for pods.
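You can compare a node's total capacity with its allocatable resources directly (a quick check; <NODE_NAME> is a placeholder):

kubectl get node <NODE_NAME> -o jsonpath='{.status.capacity}{"\n"}{.status.allocatable}{"\n"}'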

The following formula depicts the relationship between Node Capacity and Allocatable:

Node Allocatable = Node Capacity - (kube-reserved) - (system-reserved) - (eviction-hard)
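As a worked example, using the reservation values configured later in this article on a 4-core node with a nominal 16 GB of memory:

CPU: 4000m - 500m (kube-reserved) - 500m (system-reserved) = 3000m allocatable
Memory: ~15.7 Gi reported capacity - 1 Gi (kube-reserved) - 1 Gi (system-reserved) - 0.5 Gi (eviction-hard) ≈ 13.2 Gi allocatable

(The reported memory capacity is already slightly below 16 GB because the kernel reserves some memory for itself.)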

While Node Allocatable can be set on any kind of Kubernetes cluster node, in this tutorial we are going to cover the configuration specific to an EKS cluster.

Configure Node allocatable for EKS worker nodes

AWS provides a bootstrap script that runs on each worker node during its initial boot and helps the worker node register itself with the EKS cluster.

#!/bin/bash -xe
sudo /etc/eks/bootstrap.sh --apiserver-endpoint 'CLUSTER-ENDPOINT' --b64-cluster-ca 'CERTIFICATE_AUTHORITY_DATA' 'CLUSTER_NAME'

By default, the bootstrap script mentioned above is part of the official AWS EKS worker node AMIs. The above configuration is the bare minimum you need to successfully provision the worker nodes.

We can get the values for CLUSTER-ENDPOINT, CERTIFICATE_AUTHORITY_DATA, and CLUSTER_NAME from the AWS EKS console.
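Alternatively, the endpoint and certificate authority data can be fetched with the AWS CLI (shown here as a convenience; <CLUSTER_NAME> is a placeholder):

aws eks describe-cluster --name <CLUSTER_NAME> --query 'cluster.endpoint' --output text
aws eks describe-cluster --name <CLUSTER_NAME> --query 'cluster.certificateAuthority.data' --output text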

Update bootstrap script for EKS worker node

Add --kubelet-extra-args to the bootstrap script to set Allocatable.

#!/bin/bash -xe
sudo /etc/eks/bootstrap.sh --apiserver-endpoint 'CLUSTER-ENDPOINT' --b64-cluster-ca 'CERTIFICATE_AUTHORITY_DATA' 'CLUSTER_NAME' \
--kubelet-extra-args "--kube-reserved cpu=500m,memory=1Gi,ephemeral-storage=1Gi --system-reserved cpu=500m,memory=1Gi,ephemeral-storage=1Gi --eviction-hard memory.available<0.5Gi,nodefs.available<10%"
Node Allocatables — Arguments
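Once a node launched with this user data is up, you can sanity-check that the kubelet actually received these flags by inspecting the running process on the node over SSH (assuming the kubelet runs as a systemd service, as it does on the EKS optimized AMIs):

sudo systemctl status kubelet --no-pager
ps aux | grep [k]ubelet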

Verifying Allocatables

Once worker nodes are launched with the above changes, Allocatable should be reflected in the worker node configuration, as shown in the output below.

Capacity:
 cpu:                4
 ephemeral-storage:  104845292Ki
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             16424204Ki
 pods:               44
Allocatable:
 cpu:                3
 ephemeral-storage:  92330453652
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             13815052Ki

For example, a t2.xlarge node (4 cores, 16 GB memory) was used for this configuration. After reserving 500m CPU each for kube-reserved and system-reserved, the CPU left for pods is 3 cores. You can verify this in the output given above, which is available via the following command.

kubectl describe node <NODE_NAME>

Summary

We have been using an EKS cluster in production for quite some time now, and configuring Allocatable has been very effective in avoiding node termination due to inadequate resource availability. In this article, we have discussed the Allocatable configuration for EKS worker nodes and why it is a must-have from a worker node stability point of view.

Please let us know your feedback in the comments below, and if you liked the article, don't forget to clap.
