Karpenter: AWS Flying Magic Carpet

Žygimantas Čijunskis
4 min read · Sep 15, 2023

What is Karpenter?

Karpenter, created by AWS, is an open-source Kubernetes node autoscaler and an alternative to Cluster Autoscaler. It automatically provisions additional EC2 instances when the cluster does not have enough capacity for pending pods.

While it currently lacks support for multiple cloud vendors, we can foresee it becoming a crucial tool for companies pursuing cloud-agnostic solutions, using its flexibility to improve both reliability and cost-effectiveness.

How does Karpenter operate?

  1. HPA triggers an additional pod replica after the CPU utilisation of a pod exceeds a specific threshold.
  2. The pod enters the Pending state and fails to schedule because of insufficient resources. The Kubernetes scheduler sets the pod condition PodScheduled=False with the reason Unschedulable.
  3. Karpenter detects the unschedulable pod by watching the Kubernetes API.
  4. Karpenter evaluates the pod's scheduling constraints (resource requests, nodeSelector, node affinities, etc.).
  5. The Karpenter Provisioner creates an additional node that meets the requirements of the unschedulable pod.
  6. The pod is scheduled on the new node.
  7. The Karpenter Deprovisioner removes the node after it becomes empty and its TTL expires.
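
To make steps 4, 5 and 7 more concrete, here is a minimal Python sketch of the decision logic. It is not Karpenter's actual code: the catalog, the prices and the names pick_instance and should_remove are all made up for illustration, and it ignores details such as allocatable capacity and daemonset overhead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpu: float          # vCPUs
    memory_gib: float
    arch: str           # "amd64" or "arm64"
    hourly_usd: float   # illustrative price, not a real quote


@dataclass(frozen=True)
class PendingPod:
    cpu: float                  # requested vCPUs
    memory_gib: float           # requested memory
    arch: Optional[str] = None  # stands in for a nodeSelector on kubernetes.io/arch


# Hypothetical catalog: the prices are invented for this example.
CATALOG = [
    InstanceType("m5.large", 2, 8, "amd64", 0.10),
    InstanceType("m5.xlarge", 4, 16, "amd64", 0.20),
    InstanceType("m6g.large", 2, 8, "arm64", 0.08),
]


def pick_instance(pod, catalog=CATALOG):
    """Steps 4-5: keep instance types that satisfy the pod, take the cheapest."""
    fits = [
        it for it in catalog
        if it.cpu >= pod.cpu
        and it.memory_gib >= pod.memory_gib
        and (pod.arch is None or it.arch == pod.arch)
    ]
    # Returns None when nothing in the catalog can host the pod.
    return min(fits, key=lambda it: it.hourly_usd, default=None)


def should_remove(pod_count, seconds_empty, ttl_seconds):
    """Step 7: a node is removable once it is empty and the TTL has elapsed."""
    return pod_count == 0 and seconds_empty >= ttl_seconds
```

For example, pick_instance(PendingPod(cpu=3, memory_gib=4, arch="amd64")) returns the m5.xlarge entry, because it is the only amd64 type in this toy catalog with at least 3 vCPUs.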

Why Karpenter?

While Karpenter shares similarities with Cluster Autoscaler, its main distinction is that it launches autoscaled EC2 instances directly, outside of any Node Group. This gives us more precise control over the instance preferences Karpenter uses when creating a new node. Here are the key preferences we can configure:

EC2 Compute related settings:

  • Instance Category / CPU / Hypervisor / Generation / Architecture (amd64, arm64) (Cost savings by utilizing specialized hardware. Performance optimization for specific workloads.)
  • Capacity type (Reduced costs with Spot EC2 instances for non-critical workloads. Enhanced reliability with On-Demand EC2 instances for critical workloads.)
  • Taints / startupTaints (Stable environment by preventing pod scheduling on unready nodes. Controlled deployment and scaling of applications.)
  • Availability Zones / Topology Spreads (Increased availability and resilience against node and AZ failures. Improved workload distribution for better performance.)
  • AMIs / Custom Block Devices / UserData (Consistent operating system environments. Simplified management and troubleshooting of operating systems.)

EC2 Security related settings:

  • SecurityGroups (Controlled network traffic for pods accessing external resources.)
  • Instance Profiles (Secure access to specific AWS services without exposing credentials. Simplified management of IAM roles for instances.)
  • Subnets (Network isolation for improved security. Efficient resource allocation and management.)

In short, Karpenter lets us use only the resources we actually need while maintaining stability, security, and flexibility.
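
As a rough illustration of how a few of these preferences might be expressed, the sketch below builds a Provisioner and an AWSNodeTemplate as Python dictionaries and renders them as JSON that kubectl can apply. It assumes the v1alpha5 / v1alpha1 APIs that were current when this article was written; later Karpenter releases changed these resources, so check the documentation for your version. The cluster name "my-cluster", the resource name "default" and the specific values are placeholders.

```python
import json

# Assumption: Karpenter v1alpha5 Provisioner API; names and values are placeholders.
provisioner = {
    "apiVersion": "karpenter.sh/v1alpha5",
    "kind": "Provisioner",
    "metadata": {"name": "default"},
    "spec": {
        "requirements": [
            # Capacity type: allow both Spot and On-Demand.
            {"key": "karpenter.sh/capacity-type", "operator": "In",
             "values": ["spot", "on-demand"]},
            # Architecture: allow x86 and Arm.
            {"key": "kubernetes.io/arch", "operator": "In",
             "values": ["amd64", "arm64"]},
            # Instance category: compute, general purpose, memory optimized.
            {"key": "karpenter.k8s.aws/instance-category", "operator": "In",
             "values": ["c", "m", "r"]},
        ],
        # Cap the total CPU this Provisioner may launch.
        "limits": {"resources": {"cpu": "100"}},
        # Remove a node 30 seconds after it becomes empty.
        "ttlSecondsAfterEmpty": 30,
        # Link to the AWS-specific settings below.
        "providerRef": {"name": "default"},
    },
}

# Assumption: v1alpha1 AWSNodeTemplate; resources are discovered by tag.
node_template = {
    "apiVersion": "karpenter.k8s.aws/v1alpha1",
    "kind": "AWSNodeTemplate",
    "metadata": {"name": "default"},
    "spec": {
        "subnetSelector": {"karpenter.sh/discovery": "my-cluster"},
        "securityGroupSelector": {"karpenter.sh/discovery": "my-cluster"},
    },
}


def render_manifests():
    """Wrap both objects in a v1 List and return it as a JSON string."""
    bundle = {"apiVersion": "v1", "kind": "List",
              "items": [provisioner, node_template]}
    return json.dumps(bundle, indent=2)


if __name__ == "__main__":
    print(render_manifests())
```

Writing the output to a file and running kubectl apply -f on it would be one way to try this, although most teams keep such manifests as plain YAML.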

Karpenter main concepts:

Provisioners: Control and manage the creation of nodes with specified constraints, startup taints, tolerations, consolidation settings, weights and priorities.
Node Templates: AWS-specific settings that help Provisioners configure a node.
Scheduler: Considers a pod's scheduling preferences and selects the best-fitting node, for example by processor type or availability zone.
Deprovisioner: Removes empty, expired, overprovisioned or interrupted (Spot) nodes, using TTLs where applicable.
Settings: You can configure Karpenter with environment variables, CLI flags, ConfigMap settings and feature gates.
Control Pod Density: Limits how many pods can be deployed on the same node. The main consideration is the AWS ENI limit on the number of IP addresses per instance.
Metrics: Karpenter exposes a handful of metrics in Prometheus format, e.g. controller runtime, consistency, deprovisioning, interruption and node metrics.
Threat models: Karpenter's documentation also outlines potential security threats and their mitigations.
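
To show why the ENI limit matters for pod density, here is a small sketch of the commonly used max-pods formula for the AWS VPC CNI in its default mode (without prefix delegation). The function name max_pods is ours, and the ENI and IP counts are the published limits for the two example instance types.

```python
def max_pods(enis: int, ipv4_per_eni: int) -> int:
    """Max pods per node with the AWS VPC CNI, default mode.

    Each ENI keeps one IP for itself, so it offers (ipv4_per_eni - 1) pod IPs.
    The "+ 2" accounts for pods that use host networking.
    """
    return enis * (ipv4_per_eni - 1) + 2


# m5.large: 3 ENIs with 10 IPv4 addresses each
print(max_pods(3, 10))  # 29
# t3.medium: 3 ENIs with 6 IPv4 addresses each
print(max_pods(3, 6))   # 17
```

So a node can run out of pod IPs well before it runs out of CPU or memory, which is the situation the pod density settings help us manage.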

Karpenter vs Cluster Autoscaler — which one is faster?

Although there are no official benchmarks, Reddit users generally agree that Karpenter and Cluster Autoscaler take almost the same amount of time to create nodes and schedule pods.

Can you use Karpenter alongside Managed Node Groups?

Yes, but Karpenter doesn’t actually scale out nodes within the Managed Node Group. It scales out using individual EC2 instances.

The kube-scheduler will still place pods on existing nodes in a Managed Node Group if they have room. But when the cluster needs to scale, Karpenter spins up new nodes that are not part of any Auto Scaling Group or Managed Node Group.
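
In a mixed cluster it can help to see which nodes belong to whom. The sketch below classifies a node by its labels. It assumes that Managed Node Group nodes carry the eks.amazonaws.com/nodegroup label and that Karpenter nodes carry karpenter.sh/provisioner-name, which is the label used by the v1alpha5 API and may differ in other versions. The function name node_owner is hypothetical.

```python
def node_owner(labels: dict) -> str:
    """Guess who launched a node from its labels (label keys are assumptions)."""
    if "karpenter.sh/provisioner-name" in labels:
        return "karpenter"
    if "eks.amazonaws.com/nodegroup" in labels:
        return "managed-node-group"
    return "other"


# Example label sets, shaped like a node's metadata.labels
print(node_owner({"karpenter.sh/provisioner-name": "default"}))  # karpenter
print(node_owner({"eks.amazonaws.com/nodegroup": "system"}))     # managed-node-group
```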

Why doesn’t Karpenter work as the default scheduler?

Karpenter tries to stay in line with kube-scheduler and avoid non-standard behavior. Replacing the default scheduler would likely be a more intrusive approach than working alongside kube-scheduler, so Karpenter is unlikely to pursue that path unless substantial benefits become evident. Source.

In the next article, we will look at how to deploy Karpenter on an existing or new EKS cluster.

Thank you! 🙏
