Karpenter Mastery: NodePools & NodeClasses for Workload Nirvana

Gajanan Chandgadkar
8 min read · Feb 13, 2024


In our previous post, we explored the magic of Karpenter, the innovative autoscaler for your EKS clusters. In this post, we dive into the world of NodePools and NodeClasses.

Imagine your EKS cluster as a bustling city, teeming with diverse residents — your pods. But managing them all efficiently requires organization and structure. Enter NodePools and NodeClasses, your powerful tools for creating neighborhoods within your cluster, each tailored to specific needs.

NodePools: Your Neighborhood Managers

Think of NodePools as dedicated neighborhoods responsible for provisioning and managing groups of worker nodes. Each neighborhood has its own distinct rules and characteristics.

NodeClasses: Your Neighborhood Blueprints

Now, imagine having blueprints for your neighborhoods, defining their underlying infrastructure and settings. That’s where NodeClasses come in. Each NodeClass acts as a reusable blueprint that NodePools reference when launching nodes.

Ready to Go Beyond the Basics? Let’s dive headfirst into deploying a NodePool, NodeClass, and a sample deployment to unpack the whole workflow!

Before we embark on this adventure, make sure you have the following prerequisites:

1. A Running EKS Cluster: You’ll need a functioning Amazon EKS cluster ready to accommodate your new NodePools.

2. Karpenter Installed: Following the official “Getting Started with Karpenter” guide, ensure Karpenter is up and running within your EKS cluster.

Refer to the files in this repo for testing!

Karpenter NodePools: Orchestrating Your Cost-Optimized Spot Fleet:

  • Think “worker pools”: NodePools act like pools of worker nodes with similar characteristics and configurations.
  • Dynamic provisioning: They automatically provision and deprovision nodes based on your cluster’s needs and your defined settings.
  • Cost-effective Spot Instances: Your NodePool can target Spot Instances for cost optimization, using specific instance categories or flexible choices within those categories.
  • Fine-grained control: You can define requirements like architecture (e.g., AMD64), capacity type (Spot), and even taint nodes for specific workloads.
  • Resource limits: Set limits on CPU and memory usage to control the overall size and resource consumption of the NodePool.
  • Disruption policy: Configure consolidation strategies for idle nodes and set expiration times to manage node lifecycles and maintain node health.
  • Prioritization: Assign weights to NodePools to influence which pool the scheduler selects for new pods, ensuring critical workloads get priority access.

Imagine NodePools as your construction foremen: They orchestrate the creation and management of your worker nodes based on your blueprints (NodeClasses) and ensure your cluster has the right resources at the right time, all while keeping costs in check.

Example: Cost-Optimized Spot Pool:

Here’s an example of a NodePool configuration showcasing diverse settings:

Name: cost-optimized-spot-pool

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: cost-optimized-spot-pool
spec:
  template:
    metadata:
      # Labels are arbitrary key-values that are applied to all nodes in the pool
      labels:
        cost-optimized: "true"
    spec:
      # References the Cloud Provider's NodeClass resource, see cloud provider specific documentation
      nodeClassRef:
        name: default
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        # - key: node.kubernetes.io/instance-type
        #   operator: In
        #   values: ["m5a.large", "c5a.4xlarge", "t3.medium"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r", "t"]
      # Provisioned nodes will have these taints
      # Taints may prevent pods from scheduling if they are not tolerated by the pod.
      taints:
        - key: deployment
          effect: NoSchedule
          value: cost-optimized-spot-pool
  disruption:
    # If using `WhenEmpty`, Karpenter will only consider nodes for consolidation that contain no workload pods
    consolidationPolicy: WhenEmpty
    # The amount of time Karpenter should wait after discovering a consolidation decision
    # This value can currently only be set when the consolidationPolicy is 'WhenEmpty'
    # You can choose to disable consolidation entirely by setting the string value 'Never' here
    consolidateAfter: 2m
    # The amount of time a Node can live on the cluster before being removed
    # Avoiding long-running Nodes helps to reduce security vulnerabilities as well as to reduce the chance of issues that can plague Nodes with long uptimes such as file fragmentation or memory leaks from system processes
    # You can choose to disable expiration entirely by setting the string value 'Never' here
    expireAfter: 48h
  # Resource limits constrain the total size of the cluster.
  # Limits prevent Karpenter from creating new instances once the limit is exceeded.
  limits:
    cpu: "500"
    memory: 500Gi
  # Priority given to the NodePool when the scheduler considers which NodePool
  # to select. Higher weights indicate higher priority when comparing NodePools.
  # Specifying no weight is equivalent to specifying a weight of 0.
  weight: 2
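
To make the weight field more concrete, here is a minimal sketch (not a prescription) of a hypothetical companion pool: an on-demand pool with a lower weight of 1, which Karpenter would only fall back to once the higher-weight spot pool above cannot satisfy a pod, for example when its limits are exhausted. The pool name and limits below are assumptions for illustration.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: on-demand-fallback-pool    # hypothetical name for this sketch
spec:
  template:
    metadata:
      labels:
        cost-optimized: "true"     # same label, so the sample deployment still matches
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]    # on-demand capacity instead of spot
      taints:
        - key: deployment
          effect: NoSchedule
          value: cost-optimized-spot-pool   # matches the toleration in the sample deployment
  limits:
    cpu: "100"                     # illustrative, smaller than the spot pool's limit
  # Lower weight than the spot pool (2), so the spot pool is considered first
  weight: 1

Keeping the same cost-optimized label and taint means the sample deployment later in this post could still land on this fallback pool without any changes.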

NodeClasses:

Karpenter NodeClasses are like blueprints for your AWS worker nodes, defining their configuration and behavior:

  • Cloud-provider specific: They are specific to your cloud provider, like EC2NodeClasses for AWS.
  • Define details: They specify aspects like AMI family (OS), security groups, subnets, and IAM roles.
  • Multiple per cluster: You can have several NodeClasses for different node types within your cluster.
  • Referenced by NodePools: NodePools reference NodeClasses to determine the specific configuration of nodes they create.

Think of them as recipes for your nodes, ensuring they have the right ingredients (configuration) to fit their role in the cluster.

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2 # Amazon Linux 2
  role: "karpenter-demo-workers" # IAM role to use for the node identity & for Karpenter to launch nodes
  subnetSelectorTerms:
    - tags:
        Name: "karpenter-demo-subnet-private*" # subnets to attach to instances
  securityGroupSelectorTerms:
    - tags:
        Name: "eks-cluster-sg-karpenter-demo" # security group that has to be attached to nodes
  amiSelectorTerms:
    - id: ami-0937d28aa1608c597 # AMI to be used for the instances
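
The selector terms do not have to filter on the Name tag. A common alternative, used in the official Getting Started guide, is to tag your subnets and security groups with karpenter.sh/discovery and select on that tag instead. The sketch below assumes a tag value of karpenter-demo; adjust it to whatever your resources are actually tagged with.

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: discovery-tagged           # hypothetical name for this sketch
spec:
  amiFamily: AL2
  role: "karpenter-demo-workers"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "karpenter-demo"   # subnets tagged for this cluster (assumed tag value)
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "karpenter-demo"   # security groups tagged for this cluster (assumed tag value)
  # No amiSelectorTerms: Karpenter then resolves the latest AMI for the amiFamily

Leaving out amiSelectorTerms is convenient for test clusters, but pinning an AMI ID as in the example above gives you more predictable node images.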

Example Deployment File with Node Selector:

Here’s a simple example of a Deployment that targets the cost-optimized-spot-pool:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cost-optimized-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: karpenter-demo
  template:
    metadata:
      labels:
        app: karpenter-demo
    spec:
      nodeSelector: # <-- NodeSelector specified here
        cost-optimized: "true" # Matches the label in the NodePool template
      tolerations: # <-- Tolerations added here
        - key: "deployment"
          operator: "Equal"
          value: "cost-optimized-spot-pool"
          effect: "NoSchedule"
      containers:
        - name: karpenter-demo
          image: nginx:latest
          resources:
            requests:
              cpu: "10m"
              memory: "1Gi"

Explanation:

Here’s how Karpenter will react to your deployment file in its current configuration:

1. Deployment Analysis:

  • Karpenter analyzes the specified pod requirements: 1 replica, nginx:latest image, 10m CPU and 1Gi memory requests.
  • It observes the nodeSelector with the cost-optimized: true label, indicating desired placement on nodes with that label.
  • It notices the tolerations for the “deployment” taint with the value “cost-optimized-spot-pool”, allowing scheduling on nodes with that taint.

2. Node Pool Check:

  • Karpenter looks for a NodePool whose node template satisfies the pod’s nodeSelector and whose taints the pod tolerates; here, that is cost-optimized-spot-pool.
  • It verifies that the NodePool configuration aligns with the deployment’s requirements: NodeClass compatibility (architecture, resources, etc.) and sufficient available capacity within the NodePool limits.

3. Provisioning or Selection:

  • If an existing node in the cost-optimized-spot-pool has spare capacity and no conflicting pods, the pod is scheduled onto it and Karpenter takes no further action.
  • If no matching node exists, Karpenter provisions a new one based on the NodePool’s referenced EC2NodeClass (default): launching a Spot Instance from the specified AMI and instance category, attaching it to the appropriate subnet and security group, applying the “deployment” taint with the “cost-optimized-spot-pool” value, and joining the node to the Kubernetes cluster.
  • Once the new node is ready, the pending pod is scheduled onto it.

4. Pod Lifecycle Management:

  • Karpenter continues monitoring the pod and the NodePool.
  • If the pod terminates and no other pods require the node, Karpenter might decide to terminate the Spot Instance based on the NodePool’s consolidation policy and expiration time.
  • If the deployment scales up (replicas > 1), Karpenter repeats the process of finding or provisioning suitable nodes within the cost-optimized-spot-pool.

Key Points:

  • Your nodeSelector and tolerations guide Karpenter to utilize the cost-optimized Spot Instances managed by the cost-optimized-spot-pool.
  • Karpenter dynamically provisions or selects nodes based on pod requirements and NodePool configuration.
  • This enables efficient, cost-optimized deployments leveraging Spot Instances while adhering to your placement preferences.

Additional Notes:

  • You can adjust the nodeSelector and tolerations for granular control over pod placement, for example by using node affinity instead (see the sketch after this list).
  • Consider scaling up your NodePool’s limits if you anticipate higher deployment demands.
  • Monitor the Karpenter controller logs and metrics for insights into provisioning and management activities.
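
If you want softer placement rules than a plain nodeSelector, the same constraint can be written as node affinity in the Deployment’s pod spec. The fragment below is a sketch that would replace the nodeSelector block in the sample Deployment above: it keeps the hard requirement on the cost-optimized label and adds a soft preference for nodes labelled as spot capacity.

# Drop-in replacement for the nodeSelector block in the sample Deployment's pod spec
affinity:
  nodeAffinity:
    # Hard requirement: only nodes carrying the NodePool's cost-optimized label
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: cost-optimized
              operator: In
              values: ["true"]
    # Soft preference: favor spot capacity when both spot and on-demand nodes exist
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
            - key: karpenter.sh/capacity-type
              operator: In
              values: ["spot"]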

Here’s an excerpt of Karpenter logs demonstrating instance provisioning:

{"level":"INFO","time":"2024-02-12T09:02:15.512Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"17d6c05","pods":"default/cost-optimized-app-85d8cb5846-79wqm","duration":"26.2188ms"}
{"level":"INFO","time":"2024-02-12T09:02:15.512Z","logger":"controller.provisioner","message":"computed new nodeclaim(s) to fit pod(s)","commit":"17d6c05","nodeclaims":1,"pods":1}
{"level":"INFO","time":"2024-02-12T09:02:15.949Z","logger":"controller.provisioner","message":"created nodeclaim","commit":"17d6c05","nodepool":"cost-optimized-spot-pool","nodeclaim":"cost-optimized-spot-pool-hrqcz","requests":{"cpu":"220m","memory":"1264Mi","pods":"6"},"instance-types":"c3.large, c3.xlarge, c4.large, c4.xlarge, c5.large and 95 other(s)"}
{"level":"INFO","time":"2024-02-12T09:02:19.878Z","logger":"controller.nodeclaim.lifecycle","message":"launched nodeclaim","commit":"17d6c05","nodeclaim":"cost-optimized-spot-pool-hrqcz","provider-id":"aws:///us-east-1c/i-0df7422a964ab2837","instance-type":"t3.small","zone":"us-east-1c","capacity-type":"spot","allocatable":{"cpu":"1930m","ephemeral-storage":"17Gi","memory":"1418Mi","pods":"11"}}
{"level":"INFO","time":"2024-02-12T09:03:10.264Z","logger":"controller.nodeclaim.lifecycle","message":"initialized nodeclaim","commit":"17d6c05","nodeclaim":"cost-optimized-spot-pool-hrqcz","provider-id":"aws:///us-east-1c/i-0df7422a964ab2837","node":"ip-172-31-xx-xx.ec2.internal"}

Key Points from the Karpenter Logs:

Pod Discovery:

  • The controller identifies a provisionable pod named cost-optimized-app-85d8cb5846-79wqm.

NodeClaim Creation:

  • Karpenter computes a new NodeClaim named cost-optimized-spot-pool-hrqcz to accommodate the pod’s requirements.

Node Launching:

  • The controller launches the NodeClaim, resulting in an AWS Spot Instance:
  • Provider ID: aws:///us-east-1c/i-0df7422a964ab2837
  • Instance type: t3.small (despite listing other options, Karpenter chose this based on availability and resource fit)
  • Zone: us-east-1c
  • Capacity type: spot (confirming our cost-optimized approach)
  • Allocatable resources: CPU, memory, pods (indicating capacity for additional pods)

Node Initialization:

  • Karpenter initializes the node, which registers with the cluster as ip-172-31-xx-xx.ec2.internal.
  • This node is now ready to schedule your cost-optimized-app pod.

When the workload is deleted, Karpenter detects the now-empty node and cleans it up:

{"level":"INFO","time":"2024-02-12T09:06:28.316Z","logger":"controller.disruption","message":"disrupting via emptiness delete, terminating 1 candidates ip-172-31-xx-xx.ec2.internal/t3.small/spot","commit":"17d6c05","command-id":"94ecd020-1545-4e0a-8d59-2c4413c9dfe1"}         
{"level":"INFO","time":"2024-02-12T09:06:29.141Z","logger":"controller.disruption.queue","message":"command succeeded","commit":"17d6c05","command-id":"94ecd020-1545-4e0a-8d59-2c4413c9dfe1"}
{"level":"INFO","time":"2024-02-12T09:06:29.201Z","logger":"controller.node.termination","message":"tainted node","commit":"17d6c05","node":"ip-172-31-xx-xx.ec2.internal"}
{"level":"INFO","time":"2024-02-12T09:06:29.617Z","logger":"controller.node.termination","message":"deleted node","commit":"17d6c05","node": "ip-172-31-xx-xx.ec2.internal"}
{"level":"INFO","time":"2024-02-12T09:06:29.962Z","logger":"controller.nodeclaim.termination","message":"deleted nodeclaim","commit":"17d6c05","nodeclaim":"cost-optimized-spot-pool-hrqcz","node":"ip-172-31-xx-xx.ec2.internal","provider-id":"aws:///us-east-1c/i-0df7422a964ab2837 "}

Here is the explanation of the above logs:

Empty Node Detection:

  • The first log (level:INFO, time:2024-02-12T09:06:28.316Z) mentions “disrupting via emptiness delete,” indicating Karpenter identified an empty node (no pods scheduled) eligible for termination.
  • This node is identified as ip-172-31-xx-xx.ec2.internal/t3.small/spot, confirming it’s a Spot Instance within your cost-optimized pool.

Termination and Cleanup:

  • The second log (level:INFO, time:2024-02-12T09:06:29.141Z) shows the termination command succeeding, confirming the Spot Instance deletion process initiated.
  • The third log (level:INFO, time:2024-02-12T09:06:29.201Z) mentions “tainted node,” indicating Karpenter tainted the node so no new pods could land on it while it drained.
  • The fourth log (level:INFO, time:2024-02-12T09:06:29.617Z) confirms the deletion of the node object (ip-172-31-xx-xx.ec2.internal) from the cluster.

NodeClaim Deletion:

  • The fifth log (level:INFO, time:2024-02-12T09:06:29.962Z) indicates the deletion of the associated NodeClaim (cost-optimized-spot-pool-hrqcz).

Overall:

These logs demonstrate Karpenter’s efficient management of Spot Instances based on workload demands. When a node becomes empty, Karpenter gracefully terminates the Spot Instance and cleans up associated resources like the NodeClaim, optimizing costs and resource utilization.

The above example is just a glimpse of the possibilities with NodePools & NodeClasses. You can tailor settings to your specific needs, creating dedicated environments for:

  • High-performance computing: Define a pool with powerful instances like c5.xlarge for resource-intensive tasks.
  • Development and testing: Utilize a cost-effective pool with t3a instances for isolated testing environments (see the sketch after this list).
  • Batch processing: Create a pool with specific instance types and labels for optimized batch processing jobs.
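
As a taste of the development-and-testing case above, here is an illustrative sketch of a small pool pinned to t3a instance types that consolidates whenever nodes are underutilized. Every name and value in it is an assumption, not something taken from the example cluster.

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: dev-test-pool              # hypothetical name for this sketch
spec:
  template:
    metadata:
      labels:
        environment: "dev"         # assumed label for dev/test workloads to select on
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["t3a.medium", "t3a.large"]   # keep dev/test on small, cheap instances
  disruption:
    # Reclaim capacity as soon as nodes are underutilized, not only when they are empty
    consolidationPolicy: WhenUnderutilized
  limits:
    cpu: "50"                      # illustrative cap for a small test environment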

By harnessing the power of NodePools & NodeClasses, you transform your EKS cluster from a monolithic entity into a vibrant city with diverse districts, each catering to specific needs and residents. You gain unparalleled control over resource allocation, security, and cost, ultimately empowering you to build and manage your EKS clusters with unmatched efficiency and flexibility.

We’ve just embarked on our journey into the fascinating world of NodePools and NodeClasses, experiencing the power they bring to managing your EKS cluster’s resources. But our exploration doesn’t end there! Buckle up, because in our next post, we’ll delve deeper into the advanced features Karpenter offers to further enhance your experience:

  • Workload Consolidation: Discover how Karpenter intelligently evicts workloads from underutilized nodes, reclaiming resources and minimizing costs.
  • Spot Interruption Handler: Learn how Karpenter gracefully handles interruptions when using Spot Instances, ensuring your workloads experience minimal disruption.

Useful links:
