Scaling Simplified: Integrating Karpenter into Your EKS Cluster

Tanmay Varade
6 min read · Nov 3, 2024


Karpenter is a modern open-source Kubernetes cluster autoscaler designed to optimize the management of containerized applications. Developed by AWS, Karpenter automatically provisions and scales nodes in response to the real-time needs of your workloads. Unlike traditional scaling solutions, Karpenter makes intelligent decisions about the type and size of instances to deploy based on current demand, ensuring that your applications have the resources they need to perform optimally.

In the rapidly evolving landscape of cloud-native applications, efficient resource management is crucial. Karpenter provides a streamlined approach to autoscaling that allows developers and operations teams to focus on building and maintaining applications rather than managing infrastructure. With its ability to rapidly adjust resources, Karpenter helps ensure that applications remain responsive and cost-effective, no matter how demand fluctuates.

Advantages Over the Traditional Cluster Autoscaler

  1. Dynamic Provisioning: Karpenter automatically creates new nodes based on real-time workload requirements rather than relying on predefined node groups and scaling rules, which leads to better resource utilization.
  2. Faster Scaling: Karpenter provisions capacity directly through the EC2 Fleet API instead of working through Auto Scaling groups, so it can react to pending pods much faster than the traditional autoscaler — essential for workloads that experience sudden spikes in traffic.
  3. Cost Efficiency: By intelligently selecting appropriate instance types and sizes, and by making effective use of Spot Instances, Karpenter reduces over-provisioning and ultimately saves costs (see the NodePool fragment after this list).
  4. Simplified Management: With Karpenter, you don’t need to manage multiple node groups manually; it handles node provisioning based on the specific requirements of your workloads.
  5. Better Support for Diverse Workloads: Karpenter reads the scheduling constraints of your pending pods (resource requests, node selectors, affinities, and tolerations) and makes provisioning decisions that match those workload profiles.
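
For example, a NodePool can declare flexible requirements and let Karpenter choose the cheapest instance that satisfies them. The fragment below is an illustrative sketch only (the instance types are arbitrary examples; a full manifest appears later in this guide):

requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot", "on-demand"]
  - key: node.kubernetes.io/instance-type
    operator: In
    values: ["m5.large", "m5.xlarge", "c5.large"]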

Step-by-Step Guide to Adding Karpenter in EKS

Now that you understand what Karpenter is and why it’s beneficial, let’s go through the steps to integrate Karpenter into your Amazon EKS cluster.

Prepare Your EKS Cluster

Ensure you have a functioning EKS cluster. If you need to create one, you can do so via the AWS Management Console or AWS CLI.

We need the following variables for the Terraform configuration:

variable "KARPENTER_NAMESPACE" {
type = string
default = "karpenter"
}

variable "CLUSTER_NAME" {
type = string
default = "DemoCluster"
}

variable "AWS_REGION" {
type = string
default = "us-east-1"
}

variable "AWS_ACCOUNT_ID" {
type = string
default = "456125790758"
}
variable "AWS_AMI_ID" {
type = string
default = "ami-999f24d6ec63084c5"
}

variable "KARPENTER_VERSION" {
type = string
default = "1.0.7"
}
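
With the variables in place, we also need provider and data-source wiring, since later resources reference data.aws_eks_cluster.cluster for the OIDC issuer, subnet IDs, and security group IDs. The following is a minimal sketch; the provider versions are assumptions, and the helm and kubectl provider blocks assume token-based authentication via aws_eks_cluster_auth:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # assumed version constraint
    }
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.12" # assumed version constraint
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = ">= 1.14" # assumed version constraint
    }
  }
}

provider "aws" {
  region = var.AWS_REGION
}

# Referenced later for the OIDC issuer URL and VPC configuration.
data "aws_eks_cluster" "cluster" {
  name = var.CLUSTER_NAME
}

# Short-lived token so the helm and kubectl providers can authenticate.
data "aws_eks_cluster_auth" "cluster" {
  name = var.CLUSTER_NAME
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.cluster.token
  }
}

provider "kubectl" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}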

Create the Necessary IAM Role for Nodes and Attach the Required Permissions

Karpenter requires permissions to manage AWS resources. Follow these steps to set up the required IAM roles:

Create IAM Role for Karpenter nodes

resource "aws_iam_role" "KarpenterNodeRole" {
name = "KarpenterNodeRole-${var.CLUSTER_NAME}"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Sid = ""
Principal = {
Service = "ec2.amazonaws.com"
}
},
]
})

tags = {
cluster = var.CLUSTER_NAME
}
}
resource "aws_iam_role_policy_attachment" "AmazonEKSWorkerNodePolicy" {
role = aws_iam_role.KarpenterNodeRole.name
policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
}
resource "aws_iam_role_policy_attachment" "AmazonEKS_CNI_Policy" {
role = aws_iam_role.KarpenterNodeRole.name
policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
}
resource "aws_iam_role_policy_attachment" "AmazonEC2ContainerRegistryReadOnly" {
role = aws_iam_role.KarpenterNodeRole.name
policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}
resource "aws_iam_role_policy_attachment" "AmazonSSMManagedInstanceCore" {
role = aws_iam_role.KarpenterNodeRole.name
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

Create IAM Role for the Karpenter Controller

Now we need to create an IAM role that the Karpenter controller will use to provision new instances. The controller uses IAM Roles for Service Accounts (IRSA), which requires the cluster to have an associated OIDC provider; the trust policy further below reads the issuer URL from the aws_eks_cluster data source declared earlier. First, define the policy the controller needs:

resource "aws_iam_policy" "KarpenterControllerRole" {
name = "KarpenterControllerRole"
description = "IAM policy for Karpenter to manage EC2 instances and IAM roles"

policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "Karpenter"
Effect = "Allow"
Action = [
"ssm:GetParameter",
"ec2:DescribeImages",
"ec2:RunInstances",
"ec2:DescribeSubnets",
"ec2:DescribeSecurityGroups",
"ec2:DescribeLaunchTemplates",
"ec2:DescribeInstances",
"ec2:DescribeInstanceTypes",
"ec2:DescribeInstanceTypeOfferings",
"ec2:DeleteLaunchTemplate",
"ec2:CreateTags",
"ec2:CreateLaunchTemplate",
"ec2:CreateFleet",
"ec2:DescribeSpotPriceHistory",
"pricing:GetProducts"
]
Resource = "*"
},
{
Sid = "ConditionalEC2Termination"
Effect = "Allow"
Action = "ec2:TerminateInstances"
Resource = "*"
Condition = {
StringLike = {
"ec2:ResourceTag/karpenter.sh/nodepool" = "*"
}
}
},
{
Sid = "PassNodeIAMRole"
Effect = "Allow"
Action = "iam:PassRole"
Resource = "arn:aws:iam::${var.AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${var.CLUSTER_NAME}"
},
{
Sid = "EKSClusterEndpointLookup"
Effect = "Allow"
Action = "eks:DescribeCluster"
Resource = "arn:aws:eks:${var.AWS_REGION}:${var.AWS_ACCOUNT_ID}:cluster/${var.CLUSTER_NAME}"
},
{
Sid = "AllowScopedInstanceProfileCreationActions"
Effect = "Allow"
Action = ["iam:CreateInstanceProfile"]
Resource = "*"
Condition = {
StringEquals = {
"aws:RequestTag/kubernetes.io/cluster/${var.CLUSTER_NAME}" = "owned"
"aws:RequestTag/topology.kubernetes.io/region" = "${var.AWS_REGION}"
}
StringLike = {
"aws:RequestTag/karpenter.k8s.aws/ec2nodeclass" = "*"
}
}
},
{
Sid = "AllowScopedInstanceProfileTagActions"
Effect = "Allow"
Action = ["iam:TagInstanceProfile"]
Resource = "*"
Condition = {
StringEquals = {
"aws:ResourceTag/kubernetes.io/cluster/${var.CLUSTER_NAME}" = "owned"
"aws:ResourceTag/topology.kubernetes.io/region" = "${var.AWS_REGION}"
"aws:RequestTag/kubernetes.io/cluster/${var.CLUSTER_NAME}" = "owned"
"aws:RequestTag/topology.kubernetes.io/region" = "${var.AWS_REGION}"
}
StringLike = {
"aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass" = "*"
"aws:RequestTag/karpenter.k8s.aws/ec2nodeclass" = "*"
}
}
},
{
Sid = "AllowScopedInstanceProfileActions"
Effect = "Allow"
Action = [
"iam:AddRoleToInstanceProfile",
"iam:RemoveRoleFromInstanceProfile",
"iam:DeleteInstanceProfile"
]
Resource = "*"
Condition = {
StringEquals = {
"aws:ResourceTag/kubernetes.io/cluster/${var.CLUSTER_NAME}" = "owned"
"aws:ResourceTag/topology.kubernetes.io/region" = "${var.AWS_REGION}"
}
StringLike = {
"aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass" = "*"
}
}
},
{
Sid = "AllowInstanceProfileReadActions"
Effect = "Allow"
Action = "iam:GetInstanceProfile"
Resource = "*"
}
]
})
}


resource "aws_iam_role" "KarpenterControllerRole" {
name = "KarpenterControllerRole-${var.CLUSTER_NAME}"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = {
Federated = "arn:aws:iam::${var.AWS_ACCOUNT_ID}:oidc-provider/${replace(data.aws_eks_cluster.cluster.identity[0].oidc[0].issuer, "https://", "")}"
}
Action = "sts:AssumeRoleWithWebIdentity"
Condition = {
StringEquals = {
"${replace(data.aws_eks_cluster.cluster.identity[0].oidc[0].issuer, "https://", "")}:aud" = "sts.amazonaws.com"
"${replace(data.aws_eks_cluster.cluster.identity[0].oidc[0].issuer, "https://", "")}:sub" = "system:serviceaccount:${var.KARPENTER_NAMESPACE}:karpenter"
}
}
}
]
})
}

resource "aws_iam_role_policy_attachment" "karpenter_policy_attachment" {
role = aws_iam_role.KarpenterControllerRole.name
policy_arn = aws_iam_policy.KarpenterControllerRole.arn
}

resource "aws_iam_role_policy_attachment" "AWSBudgetsReadOnlyAccess" {
role = aws_iam_role.KarpenterControllerRole.name
policy_arn = "arn:aws:iam::aws:policy/AWSBudgetsReadOnlyAccess"
}

Add tags to subnets, security groups, and AMIs so Karpenter can discover them

Karpenter uses tags on subnets to discover where it can provision new nodes. By tagging your subnets with the appropriate key-value pairs (e.g., karpenter.sh/discovery), Karpenter can automatically identify which subnets are eligible for node provisioning. This allows it to make informed decisions about where to launch instances based on resource availability and workload requirements.

The same discovery-tag mechanism applies to EC2 security groups and AMIs.


resource "aws_ec2_tag" "karpenter_subnet_tag" {
for_each = toset(data.aws_eks_cluster.cluster.vpc_config[0].subnet_ids)
resource_id = each.value
key = "karpenter.sh/discovery"
value = var.CLUSTER_NAME
}

resource "aws_ec2_tag" "karpenter_sg_tag" {
for_each = toset(data.aws_eks_cluster.cluster.vpc_config[0].security_group_ids)
resource_id = each.value
key = "karpenter.sh/discovery"
value = var.CLUSTER_NAME
}
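
On the consuming side, Karpenter’s EC2NodeClass selects subnets and security groups by matching these tags. Here is a fragment of the selector terms (shown in full in the manifest sketch later in this guide):

subnetSelectorTerms:
  - tags:
      karpenter.sh/discovery: ${CLUSTER_NAME}
securityGroupSelectorTerms:
  - tags:
      karpenter.sh/discovery: ${CLUSTER_NAME}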

Add Access Entry to EKS cluster for Karpenter Node Role

Creating an access entry for the node role is what allows instances launched by Karpenter to actually join the cluster: the access entry maps the node IAM role to the Kubernetes permissions a worker node needs to register itself and run workloads. Without it, new instances would authenticate to AWS successfully but never become Ready nodes in your EKS cluster.

resource "aws_eks_access_entry" "KarpenterNodeRole" {
cluster_name = var.CLUSTER_NAME
principal_arn = aws_iam_role.KarpenterNodeRole.arn
type = "EC2_LINUX"
}

Create Service Linked Role for Spot

The service-linked role for EC2 Spot (AWSServiceRoleForEC2Spot) must exist in your account before Spot Instances can be launched; the EC2 Spot service uses it to manage instances on your behalf. Creating it once enables Karpenter’s cost-effective Spot-based scaling without any additional permissions management. Note that this resource will fail if the role already exists in your account; in that case, import it into Terraform state or skip this step.

resource "aws_iam_service_linked_role" "spot_service_role" {
aws_service_name = "spot.amazonaws.com"

lifecycle {
ignore_changes = [
aws_service_name,
]
}
}

Install Helm Chart

resource "helm_release" "karpenter" {
name = "karpenter"
repository = "oci://public.ecr.aws/karpenter/karpenter"
chart = "karpenter"
version = var.KARPENTER_VERSION

set {
name = "settings.clusterName"
value = var.CLUSTER_NAME
}

set {
name = "serviceAccount.annotations.eks.amazonaws.com/role-arn"
value = aws_iam_role.KarpenterControllerRole.arn
}

set {
name = "controller.resources.requests.cpu"
value = "1"
}
set {
name = "controller.resources.requests.memory"
value = "1Gi"
}
}

Create Sample NodePool and EC2 Node Class for Karpenter

You will find the YAML template file in the GitHub repository.

data "kubectl_path_documents" "karpenter_objects" {
pattern = "./templates/karpenter_objects.yaml.tpl"
vars = {
KarpenterNodeRole = "test" #aws_iam_role.KarpenterNodeRole.name
CLUSTER_NAME = var.CLUSTER_NAME
AWS_AMI_ID = var.AWS_AMI_ID
KARPENTER_NAMESPACE = var.KARPENTER_NAMESPACE
}
}

resource "kubectl_manifest" "karpenter_objects" {
count = length(data.kubectl_path_documents.karpenter_objects.documents)
yaml_body = element(data.kubectl_path_documents.karpenter_objects.documents, count.index)
}
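
The repository contains the actual template; purely as an illustration, a minimal karpenter_objects.yaml.tpl for Karpenter v1 could look like the sketch below. The CPU limit, architecture, and capacity types are assumptions, not the repository’s real contents:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  limits:
    cpu: 100 # assumed cap on total CPU Karpenter may provision
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  role: ${KarpenterNodeRole}
  amiSelectorTerms:
    - id: ${AWS_AMI_ID}
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}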

For a complete example of setting up Karpenter in your EKS cluster, including IAM roles, security group configurations, and provisioning settings, you can refer to this GitHub repository.

You can customize node specifications for your application by utilizing labels and taints within the NodePool configuration. For instance, in your NodePool template, you can define metadata labels to specify the characteristics of the nodes, like this:

template:
  metadata:
    labels:
      tier: cluster-level

In addition to labels, you can apply taints to control which pods can be scheduled on these nodes. For example, you might taint nodes designated for frontend applications so that only pods with a matching toleration can be scheduled on them:

taints:
  - key: tier
    value: frontend
    effect: NoSchedule

By leveraging labels and taints in your NodePool, you can ensure that specific applications are deployed on the appropriate nodes, optimizing resource allocation and application performance.
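
To land a pod on one of these nodes, the pod must tolerate the taint and, if you also want to pin it there, select the label. A minimal sketch, assuming a NodePool whose template sets the label tier: frontend alongside the taint above (the image and resource requests are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: frontend-app
spec:
  nodeSelector:
    tier: frontend # matches the assumed NodePool template label
  tolerations:
    - key: tier
      operator: Equal
      value: frontend
      effect: NoSchedule # allows scheduling despite the taint
  containers:
    - name: app
      image: nginx # placeholder image
      resources:
        requests:
          cpu: 250m
          memory: 256Mi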

Conclusion

Karpenter provides a powerful alternative to the legacy Cluster Autoscaler, offering more flexibility, speed, and cost efficiency for managing your Kubernetes workloads. With this guide, you should be well on your way to integrating Karpenter into your EKS cluster.

Feel free to ask if you have any questions or need further clarification!
