Deploying Karpenter on EKS Using CDKTF

Stevosjt · Dec 31, 2023

This article demonstrates how to add Karpenter to an EKS cluster in AWS. It is the second article on setting up EKS; if you have not been through how to set up EKS using CDKTF, please review and follow along with that article here: https://medium.com/@stevosjt88/creating-an-eks-cluster-using-cdktf-ed6cf28599c9. I will be adding Karpenter to this cluster, but the code here can be used to add Karpenter to any cluster. This article assumes you are somewhat familiar with EKS and with how IRSA (IAM Roles for Service Accounts) works.

What is Karpenter?

Karpenter is a Kubernetes node autoscaler that automatically adjusts the size of a Kubernetes cluster based on resource requirements. It helps optimize the allocation of resources in a Kubernetes environment by dynamically scaling the number of nodes in the cluster.

You can find more information and keep up to date on the Karpenter website: https://karpenter.sh

Getting Started

We will be using the existing infrastructure that was built in the previous article, but as I mentioned before, this can be applied to existing clusters as well.

To start, let's create our karpenter.ts file. We will also need to add a few more providers to support it, and this is where it gets a little trickier. We want to add the @cdktf/provider-kubernetes and @cdktf/provider-helm packages (plus @cdktf/provider-random and @cdktf/provider-null, which the code below uses), but we also want to add a provider that does not have an NPM package. First, let's install the npm packages, skipping any that you already installed in the previous article.

npm install @cdktf/provider-kubernetes
npm install @cdktf/provider-helm
npm install @cdktf/provider-random
npm install @cdktf/provider-null

Then we need to update the cdktf.json file in our root directory to include the alekc/kubectl provider. The reason we want this kubectl provider is that the Terraform kubernetes provider cannot apply custom resources unless their custom resource definitions already exist during the plan phase, so resource creation will fail unless you use this other provider. Your cdktf.json file will look like this when you are done.

{
"language": "typescript",
"app": "npx ts-node main.ts",
"projectId": "33664a48-9c40-435b-88f9-db9116274600",
"sendCrashReports": "false",
"terraformProviders": [
"alekc/kubectl@~>2.0.3"
],
"terraformModules": [],
"context": {

}
}

We can then run the command “cdktf get” in the terminal. This will download the kubectl provider and generate bindings for it in a folder named .gen. We will then need to initialize both the kubernetes and kubectl providers, as well as a few others. Let's go ahead and create the following files:

  1. addons.ts
  2. addons folder
  3. karpenter.ts in the addons folder.

We should have a file structure like this now.
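
(The names here follow the previous article; the .gen folder is generated by “cdktf get”.)

.
├── cdktf.json
├── main.ts
├── .gen/
│   └── providers/
│       └── kubectl/
└── src/
    ├── network.ts
    ├── eks.ts
    ├── addons.ts
    └── addons/
        └── karpenter.ts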

I really like this structure because we can centralize all providers referenced in the addons.ts file and then just add a file in the addons folder for each additional addon that we want present in the cluster. We can now start initializing our providers in addons.ts and have it call our karpenter.ts addon.

import { AwsProvider } from "@cdktf/provider-aws/lib/provider";
import { Creds } from "../main";
import { Karpenter } from "./addons/karpenter";
import { Eks } from "./eks";
import { Network } from "./network";
import { Fn, TerraformStack } from "cdktf";
import { DataAwsEksCluster } from "@cdktf/provider-aws/lib/data-aws-eks-cluster";
import { KubernetesProvider } from "@cdktf/provider-kubernetes/lib/provider";
import { DataAwsEksClusterAuth } from "@cdktf/provider-aws/lib/data-aws-eks-cluster-auth";
import { KubectlProvider } from "../.gen/providers/kubectl/provider";
import { HelmProvider } from "@cdktf/provider-helm/lib/provider";
import { RandomProvider } from "@cdktf/provider-random/lib/provider";
import { NullProvider } from "@cdktf/provider-null/lib/provider";
import { EksAddon } from "@cdktf/provider-aws/lib/eks-addon";
import { ConfigMapV1 } from "@cdktf/provider-kubernetes/lib/config-map-v1";

export class Addons extends TerraformStack {

constructor(scope: any, creds: Creds, network: Network, eks: Eks) {
super(scope, 'addons');

// Load providers
new AwsProvider(this, 'aws', creds);

const clusterData = new DataAwsEksCluster(this, 'cluster-data', {
name: eks.cluster.name,
});

const clusterAuth = new DataAwsEksClusterAuth(this, 'cluster-auth', {
name: eks.cluster.name,
});

new KubernetesProvider(this, 'kubernetes', {
host: clusterData.endpoint,
clusterCaCertificate: Fn.base64decode(clusterData.certificateAuthority.get(0).data),
token: clusterAuth.token,
});

new KubectlProvider(this, 'kubectl', {
applyRetryCount: 5,
host: clusterData.endpoint,
clusterCaCertificate: Fn.base64decode(clusterData.certificateAuthority.get(0).data),
token: clusterAuth.token,
loadConfigFile: false,
});

new HelmProvider(this, 'helm', {
kubernetes: {
host: clusterData.endpoint,
clusterCaCertificate: Fn.base64decode(clusterData.certificateAuthority.get(0).data),
token: clusterAuth.token,
},
});

new RandomProvider(this, 'random', {});
new NullProvider(this, 'null', {});

// Install VPC CNI addon
new EksAddon(this, 'karpenter-addon-vpc-cni', {
clusterName: eks.cluster.name,
addonName: 'vpc-cni',
});

// Install Kube Proxy addon
new EksAddon(this, 'karpenter-addon-kube-proxy', {
clusterName: eks.cluster.name,
addonName: 'kube-proxy',
});

// Install CoreDNS addon
new EksAddon(this, 'karpenter-addon-coredns', {
clusterName: eks.cluster.name,
addonName: 'coredns',
});

// Install Karpenter
const karpenter = new Karpenter(this, network, eks);

// Create the aws-auth configmap
const iamRoles = Fn.yamlencode([
{
rolearn: karpenter.nodeRole.arn,
username: 'system:node:{{EC2PrivateDNSName}}',
groups: ['system:bootstrappers', 'system:nodes'],
},
{
rolearn: karpenter.fpRole.arn,
username: 'system:node:{{SessionName}}',
groups: [
'system:bootstrappers',
'system:nodes',
'system:node-proxier',
],
},
]);

new ConfigMapV1(this, 'eks-iam-role-permissions', {
metadata: {
name: 'aws-auth',
namespace: 'kube-system',
},
data: {
mapRoles: iamRoles,
},
});
}
}

In the above code, we are initializing a few providers to use when building our addons.

  1. Kubernetes Provider — Used to build kubernetes resources like deployments, service accounts, services, etc.
  2. Kubectl Provider — Used to build custom resources inside Kubernetes like the EC2NodeClass and NodePool resources.
  3. Helm Provider — Can deploy helm charts against the EKS cluster.
  4. Random Provider — Used to generate random strings. Required for the Null resource provider.
  5. Null Provider — Allows you to execute code locally like validation or waiting tasks.

We are also installing some default EKS addons.

  1. VPC CNI — Manages pod networking
  2. Kube Proxy — A network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept.
  3. CoreDNS — Manages DNS inside the cluster

These are all essential to running an EKS cluster, and it is good to add them here so they can be managed by Terraform.
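
If you want tighter control over these addons, aws_eks_addon also lets you pin a specific version. As a sketch, the CoreDNS block above could be written like this; the version string is only an example, so check what your cluster supports with “aws eks describe-addon-versions --addon-name coredns” first.

// Example of pinning an addon version instead of taking the default.
new EksAddon(this, 'karpenter-addon-coredns', {
  clusterName: eks.cluster.name,
  addonName: 'coredns',
  addonVersion: 'v1.10.1-eksbuild.6', // illustrative version only
});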

We are also creating the aws-auth ConfigMap. This should be created and stored in state so that you can add to it later if needed. It maps IAM roles to Kubernetes identities at the cluster level, and the node role and Fargate profile role must be registered here for nodes to join and work properly.

Now we can update our main.ts to call the addons.ts file so that it is built with the proper dependencies. Here is an example of our updated main.ts file.

import { App } from "cdktf";
import { Network } from "./src/network";
import { Eks } from "./src/eks";
import { Addons } from "./src/addons";

export interface Creds {
accessKey: string;
secretKey: string;
region: string;
}

const creds: Creds = {
accessKey: '',
secretKey: '',
region: 'us-east-1',
}

const app = new App();
const network = new Network(app, creds);
const eks = new Eks(app, creds, network);
const addons = new Addons(app, creds, network, eks);

eks.addDependency(network);
addons.addDependency(eks);

app.synth();

We are now ready to start building our Karpenter addon. We will start by building just the base structure in the karpenter.ts file.

import { TerraformStack } from "cdktf";
import { IamRole } from "@cdktf/provider-aws/lib/iam-role";
import { Eks } from "../eks";
import { Network } from "../network";

export class Karpenter {

public readonly nodeRole: IamRole;
public readonly fpRole: IamRole;

constructor(scope: TerraformStack, network: Network, cluster: Eks) {

}
}

Notice that we are just passing the scope into this class. Referencing the same scope as the Addons stack ensures that these resources are all stored in the same Terraform state file and can use the providers that were initialized in addons.ts.

We can now start building the resources that are needed to support running a container in Fargate. We will start with the IAM Role.

const assumeRolePolicy = new DataAwsIamPolicyDocument(
scope,
`eks-fargate-assumeRolePolicy`,
{
statement: [
{
effect: 'Allow',
actions: ['sts:AssumeRole'],
principals: [
{
identifiers: [
'eks-fargate-pods.amazonaws.com',
],
type: 'Service',
},
],
},
],
}
);

this.fpRole = new IamRole(scope, `eks-karpenter-fp`, {
name: `eks-karpenter-fp-role`,
assumeRolePolicy: assumeRolePolicy.json,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-AmazonEKSFargatePodExecutionRolePolicy`, {
role: this.fpRole.name,
policyArn: `arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy`,
});

We then need to build the namespace where the Karpenter pods will live.

const namespace = new NamespaceV1(scope, 'karpenter-namespace', {
metadata: {
name: 'karpenter',
labels: {
name: 'karpenter',
},
},
});

We will then create the Fargate Profile that Karpenter will use. Note that Karpenter must run on capacity that it does not manage itself, so you either need to add a base autoscaling group with a couple of EC2 instances to run Karpenter, or you can run it on Fargate. I chose Fargate because there is less management overhead to run the pods there.
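
If you would rather go the static EC2 route instead of Fargate, a minimal managed node group sketch might look like the following. This is only an illustration: the construct id, sizes, and instance type are arbitrary, and nodeGroupRole is a hypothetical IAM role that already has the standard EKS worker node policies attached.

import { EksNodeGroup } from "@cdktf/provider-aws/lib/eks-node-group";

// A small static managed node group to host the Karpenter pods themselves.
new EksNodeGroup(scope, 'karpenter-system-nodes', {
  clusterName: cluster.cluster.name,
  nodeGroupName: 'karpenter-system',
  nodeRoleArn: nodeGroupRole.arn, // hypothetical role with the standard worker node policies
  subnetIds: network.privateSubnetIds,
  instanceTypes: ['t3.medium'],
  scalingConfig: {
    desiredSize: 2,
    maxSize: 2,
    minSize: 2,
  },
});

Otherwise, continue with the Fargate Profile below.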

const profile = new EksFargateProfile(scope, 'karpenter-fp', {
clusterName: cluster.cluster.name,
fargateProfileName: 'karpenter-fp',
podExecutionRoleArn: this.fpRole.arn,
subnetIds: network.privateSubnetIds,
selector: [
{
namespace: namespace.metadata.name,
},
],
});

Note that the Fargate profile is tied to a “namespace” selector. That means any pod launched in the “karpenter” namespace will run on Fargate by default.

We now need to create the resources specific to running Karpenter. We start by creating a role and security group that will be attached to the nodes that Karpenter launches, making sure the role has the proper AWS permissions as well as an IAM Instance Profile.

const assumeRolePolicyNodes = new DataAwsIamPolicyDocument(
scope,
`eks-node-assumeRolePolicy`,
{
statement: [
{
effect: 'Allow',
actions: ['sts:AssumeRole'],
principals: [
{
identifiers: [
'ec2.amazonaws.com',
],
type: 'Service',
},
],
},
],
}
);

this.nodeRole = new IamRole(scope, `eks-karpenter-node-role`, {
name: `eks-karpenter-node-role`,
assumeRolePolicy: assumeRolePolicyNodes.json,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-AmazonEKSWorkerNodePolicy`, {
role: this.nodeRole.name,
policyArn: `arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy`,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-AmazonEKS_CNI_Policy`, {
role: this.nodeRole.name,
policyArn: `arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy`,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-AmazonEC2ContainerRegistryReadOnly`, {
role: this.nodeRole.name,
policyArn: `arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly`,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-CloudWatchAgentServerPolicy`, {
role: this.nodeRole.name,
policyArn: `arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy`,
});

new IamInstanceProfile(scope, 'node-iam-instance-profile', {
name: `dev-eks-node-role`,
role: this.nodeRole.name,
});

const nodeSg = new SecurityGroup(scope, `eks-node-security-group`, {
name: `dev-eks-node-security-group`,
vpcId: network.vpcId,
description: `eks node security group`,
ingress: [
{
fromPort: 0,
toPort: 0,
protocol: '-1',
selfAttribute: true,
},
{
fromPort: 0,
toPort: 0,
protocol: '-1',
securityGroups: [cluster.securityGroup.id]
}
],
egress: [
{
fromPort: 0,
toPort: 0,
protocol: '-1',
cidrBlocks: ['0.0.0.0/0'],
},
],
tags: {
Name: `dev-eks-node-security-group`,
},
lifecycle: {
ignoreChanges: ['ingress', 'egress'],
},
});

Next we will build the node termination handling resources. Karpenter now handles interruption events for spot instances itself, so if you are currently running the node termination handler Helm chart, Karpenter can replace it. We will need to build the SQS queue and the EventBridge rules to handle this.

const queue = new SqsQueue(
scope,
'karpenter-sqs-queue',
{
name: `${cluster.cluster.name}-karpenter`,
messageRetentionSeconds: 300,
},
);

const eventRules = [
{
name: 'SpotTermRule',
eventPattern: Fn.jsonencode({
source: ['aws.ec2'],
'detail-type': ['EC2 Spot Instance Interruption Warning'],
}),
},
{
name: 'RebalanceRule',
eventPattern: Fn.jsonencode({
source: ['aws.ec2'],
'detail-type': ['EC2 Instance Rebalance Recommendation'],
}),
},
{
name: 'InstanceStateChangeRule',
eventPattern: Fn.jsonencode({
source: ['aws.ec2'],
'detail-type': ['EC2 Instance State-change Notification'],
}),
},
{
name: 'ScheduledChangeRule',
eventPattern: Fn.jsonencode({
source: ['aws.health'],
'detail-type': ['AWS Health Event'],
}),
},
];

eventRules.forEach((e) => {
const eventRule = new CloudwatchEventRule(
scope,
`${e.name}-er`,
{
name: `${cluster.cluster.name}-${e.name}`,
eventPattern: e.eventPattern,
},
);
new CloudwatchEventTarget(scope, `${e.name}-et`, {
rule: eventRule.id,
arn: queue.arn,
});
});

In the code above, if one of the events is triggered, it sends a message to the SQS queue that we created. We will pass the SQS queue name to Karpenter (via the interruptionQueue Helm setting) so it knows which queue to poll.
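
One thing to be aware of: for EventBridge to actually deliver messages to the queue, the queue needs a resource policy that allows it, which the official Karpenter getting-started templates create. A minimal CDKTF sketch, assuming the same scope and queue variables plus the SqsQueuePolicy class from @cdktf/provider-aws/lib/sqs-queue-policy, could look like this:

import { SqsQueuePolicy } from "@cdktf/provider-aws/lib/sqs-queue-policy";

// Allow EventBridge to deliver interruption events to the queue.
const queuePolicyDoc = new DataAwsIamPolicyDocument(scope, 'karpenter-sqs-policy-doc', {
  statement: [
    {
      sid: 'EC2InterruptionPolicy',
      effect: 'Allow',
      principals: [
        {
          type: 'Service',
          identifiers: ['events.amazonaws.com', 'sqs.amazonaws.com'],
        },
      ],
      actions: ['sqs:SendMessage'],
      resources: [queue.arn],
    },
  ],
});

new SqsQueuePolicy(scope, 'karpenter-sqs-policy', {
  queueUrl: queue.url,
  policy: queuePolicyDoc.json,
});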

Next we need to create the permissions that allow Karpenter to access AWS resources so it can do its job.

const account = new DataAwsCallerIdentity(scope, 'account', {});
const karpenterPolicy = new DataAwsIamPolicyDocument(scope, 'karpenter-policy', {
statement: [
{
sid: 'AllowInterruptionQueueActions',
effect: 'Allow',
actions: ['sqs:DeleteMessage', 'sqs:GetQueueUrl', 'sqs:GetQueueAttributes', 'sqs:ReceiveMessage'],
resources: [queue.arn],
},
{
sid: 'AllowScopedEC2InstanceActions',
effect: 'Allow',
resources: [
`arn:aws:ec2:us-east-1::image/*`,
`arn:aws:ec2:us-east-1::snapshot/*`,
`arn:aws:ec2:us-east-1:*:spot-instances-request/*`,
`arn:aws:ec2:us-east-1:*:security-group/*`,
`arn:aws:ec2:us-east-1:*:subnet/*`,
`arn:aws:ec2:us-east-1:*:launch-template/*`,
],
actions: ['ec2:RunInstances', 'ec2:CreateFleet'],
},
{
sid: 'AllowScopedEC2InstanceActionsWithTags',
effect: 'Allow',
resources: [
`arn:aws:ec2:us-east-1:*:fleet/*`,
`arn:aws:ec2:us-east-1:*:instance/*`,
`arn:aws:ec2:us-east-1:*:volume/*`,
`arn:aws:ec2:us-east-1:*:network-interface/*`,
`arn:aws:ec2:us-east-1:*:launch-template/*`,
`arn:aws:ec2:us-east-1:*:spot-instances-request/*`,
],
actions: ['ec2:RunInstances', 'ec2:CreateFleet', 'ec2:CreateLaunchTemplate'],
condition: [
{
test: 'StringEquals',
variable: `aws:RequestTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringLike',
variable: 'aws:RequestTag/karpenter.sh/nodepool',
values: ['*'],
},
],
},
{
sid: 'AllowScopedResourceCreationTagging',
effect: 'Allow',
resources: [
`arn:aws:ec2:us-east-1:*:fleet/*`,
`arn:aws:ec2:us-east-1:*:instance/*`,
`arn:aws:ec2:us-east-1:*:volume/*`,
`arn:aws:ec2:us-east-1:*:network-interface/*`,
`arn:aws:ec2:us-east-1:*:launch-template/*`,
`arn:aws:ec2:us-east-1:*:spot-instances-request/*`,
],
actions: ['ec2:CreateTags'],
condition: [
{
test: 'StringEquals',
variable: `aws:RequestTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'ec2:CreateAction',
values: ['RunInstances', 'CreateFleet', 'CreateLaunchTemplate'],
},
{
test: 'StringLike',
variable: 'aws:RequestTag/karpenter.sh/nodepool',
values: ['*'],
},
],
},
{
sid: 'AllowScopedResourceTagging',
effect: 'Allow',
resources: [`arn:aws:ec2:us-east-1:*:instance/*`],
actions: ['ec2:CreateTags'],
condition: [
{
test: 'StringEquals',
variable: `aws:ResourceTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringLike',
variable: 'aws:ResourceTag/karpenter.sh/nodepool',
values: ['*'],
},
{
test: 'ForAllValues:StringEquals',
variable: 'aws:TagKeys',
values: ['karpenter.sh/nodeclaim', 'Name'],
},
],
},
{
sid: 'AllowScopedDeletion',
effect: 'Allow',
resources: [
`arn:aws:ec2:us-east-1:*:instance/*`,
`arn:aws:ec2:us-east-1:*:launch-template/*`,
],
actions: ['ec2:TerminateInstances', 'ec2:DeleteLaunchTemplate'],
condition: [
{
test: 'StringEquals',
variable: `aws:ResourceTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringLike',
variable: 'aws:ResourceTag/karpenter.sh/nodepool',
values: ['*'],
},
],
},
{
sid: 'AllowRegionalReadActions',
effect: 'Allow',
resources: ['*'],
actions: [
'ec2:DescribeAvailabilityZones',
'ec2:DescribeImages',
'ec2:DescribeInstances',
'ec2:DescribeInstanceTypeOfferings',
'ec2:DescribeInstanceTypes',
'ec2:DescribeLaunchTemplates',
'ec2:DescribeSecurityGroups',
'ec2:DescribeSpotPriceHistory',
'ec2:DescribeSubnets',
],
condition: [
{
test: 'StringEquals',
variable: 'aws:RequestedRegion',
values: ['us-east-1'],
},
],
},
{
sid: 'AllowSSMReadActions',
effect: 'Allow',
resources: [`arn:aws:ssm:us-east-1::parameter/aws/service/*`],
actions: ['ssm:GetParameter'],
},
{
sid: `AllowPricingReadActions`,
effect: 'Allow',
resources: ['*'],
actions: ['pricing:GetProducts'],
},
{
sid: 'AllowPassingInstanceRole',
effect: 'Allow',
resources: [this.nodeRole.arn],
actions: ['iam:PassRole'],
condition: [
{
test: 'StringEquals',
variable: 'iam:PassedToService',
values: ['ec2.amazonaws.com'],
},
],
},
{
sid: 'AllowScopedInstanceProfileCreationActions',
effect: 'Allow',
resources: ['*'],
actions: ['iam:CreateInstanceProfile'],
condition: [
{
test: 'StringEquals',
variable: `aws:RequestTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'aws:RequestTag/topology.kubernetes.io/region',
values: ['us-east-1'],
},
{
test: 'StringLike',
variable: 'aws:RequestTag/karpenter.k8s.aws/ec2nodeclass',
values: ['*'],
},
],
},
{
sid: 'AllowScopedInstanceProfileTagActions',
effect: 'Allow',
resources: ['*'],
actions: ['iam:TagInstanceProfile'],
condition: [
{
test: 'StringEquals',
variable: `aws:ResourceTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'aws:ResourceTag/topology.kubernetes.io/region',
values: ['us-east-1'],
},
{
test: 'StringEquals',
variable: `aws:RequestTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'aws:RequestTag/topology.kubernetes.io/region',
values: ['us-east-1'],
},
{
test: 'StringLike',
variable: 'aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass',
values: ['*'],
},
{
test: 'StringLike',
variable: 'aws:RequestTag/karpenter.k8s.aws/ec2nodeclass',
values: ['*'],
},
],
},
{
sid: 'AllowScopedInstanceProfileActions',
effect: 'Allow',
resources: ['*'],
actions: [
'iam:AddRoleToInstanceProfile',
'iam:RemoveRoleFromInstanceProfile',
'iam:DeleteInstanceProfile',
],
condition: [
{
test: 'StringEquals',
variable: `aws:ResourceTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'aws:ResourceTag/topology.kubernetes.io/region',
values: ['us-east-1'],
},
{
test: 'StringLike',
variable: 'aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass',
values: ['*'],
},
],
},
{
sid: 'AllowInstanceProfileReadActions',
effect: 'Allow',
resources: ['*'],
actions: ['iam:GetInstanceProfile'],
},
{
sid: 'AllowAPIServerEndpointDiscovery',
effect: 'Allow',
resources: [
`arn:aws:eks:us-east-1:${account.accountId}:cluster/${cluster.cluster.name}`,
],
actions: ['eks:DescribeCluster'],
},
]
});

There is a lot to take in there for the permissions; they follow the recommended least-privilege policy for Karpenter. We can now create the IAM role for Karpenter and the service account that we bind the IAM role to.

const eksOidcIssuerUrl = cluster.oidcProvider.url;
const eksOidcProviderArn = `arn:aws:iam::${account.accountId}:oidc-provider/${eksOidcIssuerUrl}`;

const assumeRolePolicyKarpenter = new DataAwsIamPolicyDocument(
scope,
`karpenter-arp`,
{
statement: [
{
actions: ['sts:AssumeRoleWithWebIdentity'],
principals: [
{
type: 'Federated',
identifiers: [eksOidcProviderArn],
},
],
condition: [
{
test: 'StringEquals',
variable: `${eksOidcIssuerUrl}:sub`,
values: [
`system:serviceaccount:${namespace.metadata.name}:karpenter`,
],
},
],
},
],
},
);

const karpenterRole = new IamRole(scope, `eks-karpenter-role`, {
name: `dev-eks-karpenter-role`,
assumeRolePolicy: assumeRolePolicyKarpenter.json,
tags: {
Name: `dev-eks-karpenter-role`,
},
});

new IamRolePolicy(scope, 'karpenter-policy-attachment', {
role: karpenterRole.name,
policy: karpenterPolicy.json,
name: 'karpenter-policy',
});

const serviceAccount = new ServiceAccountV1(
scope,
`karpenter-service-account`,
{
metadata: {
name: 'karpenter',
namespace: namespace.metadata.name,
annotations: {
'eks.amazonaws.com/role-arn': karpenterRole.arn,
},
},
},
);

Once we have this created, I also prefer to keep the aws-auth ConfigMap in state. It is normally auto-generated once a node joins the cluster, but then it is not managed in state and cannot be updated through Terraform unless it is imported later, so I like to build it myself to keep control over that resource. In this walkthrough it ends up in addons.ts (shown earlier) because it references the roles exported by this class, but you could just as well create it here:

const iamRoles = Fn.yamlencode([
{
rolearn: this.nodeRole.arn,
username: 'system:node:{{EC2PrivateDNSName}}',
groups: ['system:bootstrappers', 'system:nodes'],
},
{
rolearn: this.fpRole.arn,
username: 'system:node:{{SessionName}}',
groups: [
'system:bootstrappers',
'system:nodes',
'system:node-proxier',
],
},
]);

const awsAuthConfigMap = new ConfigMapV1(scope, 'eks-iam-role-permissions', {
metadata: {
name: 'aws-auth',
namespace: 'kube-system',
},
data: {
mapRoles: iamRoles,
},
});

We are now finally ready to deploy Karpenter using its Helm chart. We will also install the Karpenter Kubernetes custom resource definitions via a separate Helm chart so that they are managed in state as well.

const version = 'v0.33.1';
const repository = 'oci://public.ecr.aws/karpenter';

const crdRelease = new Release(scope, 'karpenter-crd', {
namespace: namespace.metadata.name,
createNamespace: false,
name: `${serviceAccount.metadata.name}-crd`,
repository,
chart: 'karpenter-crd',
description: 'Karpenter CRD Helm Chart',
version,
});

const release = new Release(scope, 'karpenter', {
namespace: namespace.metadata.name,
createNamespace: false,
name: serviceAccount.metadata.name,
repository,
chart: 'karpenter',
description: 'Karpenter Helm Chart',
version,
set: [
{
name: 'serviceAccount.create',
value: 'false',
},
{
name: 'serviceAccount.name',
value: serviceAccount.metadata.name,
},
{
name: 'settings.clusterName',
value: cluster.cluster.name,
},
{
name: 'settings.clusterEndpoint',
value: cluster.cluster.endpoint,
},
{
name: 'settings.interruptionQueue',
value: queue.name,
},
],
dependsOn: [profile, awsAuthConfigMap, crdRelease],
timeout: 300,
});

const wait = new Resource(scope, 'wait-for-karpenter', {
dependsOn: [release],
});
wait.addOverride('provisioner', [
{
'local-exec': [
{
command: 'sleep 90',
interpreter: ['/bin/sh', '-c'],
},
],
},
]);

Notice that we are adding some dependsOn references. We want to ensure those resources are created before the Karpenter Helm chart gets applied so that the proper dependencies are installed first. We also created a “wait” resource that sleeps for 90 seconds, which gives Karpenter time to come up before we apply resources that depend on it. Now that we have Karpenter installed, we can create the resources it needs to launch nodes. We will need to build two resources: an EC2NodeClass and a NodePool. This is just an example configuration and should be adjusted to fit your needs.

const nodeClass = new Manifest(scope, `node-class-linux`, {
yamlBody: Fn.yamlencode({
apiVersion: 'karpenter.k8s.aws/v1beta1',
kind: 'EC2NodeClass',
metadata: {
name: `linux`,
},
spec: {
amiFamily: 'AL2',
subnetSelectorTerms: network.privateSubnetIds.map((subnetId) => ({
id: subnetId,
})),
securityGroupSelectorTerms: [
{
id: nodeSg.id,
},
],
blockDeviceMappings: [
{
deviceName: '/dev/xvda',
ebs: {
volumeSize: '50Gi',
volumeType: 'gp3',
encrypted: true,
deleteOnTermination: true,
},
},
],
role: this.nodeRole.name,
tags: {
os: 'linux',
},
detailedMonitoring: false,
},
}),
dependsOn: [wait],
});

new Manifest(scope, `node-pool-linux`, {
yamlBody: Fn.yamlencode({
apiVersion: 'karpenter.sh/v1beta1',
kind: 'NodePool',
metadata: {
name: `linux`,
},
spec: {
template: {
spec: {
nodeClassRef: {
name: `linux`,
apiVersion: 'karpenter.k8s.aws/v1beta1',
kind: 'EC2NodeClass',
},
requirements: [
{
key: 'karpenter.sh/capacity-type',
operator: 'In',
values: ['on-demand'],
},
{
key: 'kubernetes.io/os',
operator: 'In',
values: ['linux'],
},
{
key: 'kubernetes.io/arch',
operator: 'In',
values: ['amd64'],
},
{
key: 'node.kubernetes.io/instance-type',
operator: 'In',
values: ['t3.medium', 't3a.medium'],
}
],
},
},
limits: {
cpu: '20',
memory: '100Gi',
},
disruption: {
consolidationPolicy: 'WhenUnderutilized',
expireAfter: '168h',
},
},
}),
dependsOn: [nodeClass],
});

This is where a lot of the power of Karpenter gets defined. The requirements field can be as specific or as broad as you would like. We can also specify spot instances, on-demand instances, or allow both so that Karpenter prefers spot and falls back to on-demand, saving cost while maintaining uptime in our clusters. I highly encourage you to read the Karpenter documentation to get more familiar with these resources.
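
For example, to let Karpenter prefer spot capacity and fall back to on-demand, you could widen the capacity-type requirement in the NodePool above to something like this (see the sidenote at the end about the spot service-linked role):

// Inside the NodePool requirements array:
{
  key: 'karpenter.sh/capacity-type',
  operator: 'In',
  // Karpenter generally prefers the cheaper spot capacity and falls back
  // to on-demand when spot is unavailable.
  values: ['spot', 'on-demand'],
},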

To see it all together, here is the entire karpenter.ts file:

import { Fn, TerraformStack } from "cdktf";
import { Eks } from "../eks";
import { Network } from "../network";
import { EksFargateProfile } from "@cdktf/provider-aws/lib/eks-fargate-profile";
import { DataAwsIamPolicyDocument } from "@cdktf/provider-aws/lib/data-aws-iam-policy-document";
import { IamRole } from "@cdktf/provider-aws/lib/iam-role";
import { IamRolePolicyAttachment } from "@cdktf/provider-aws/lib/iam-role-policy-attachment";
import { NamespaceV1 } from "@cdktf/provider-kubernetes/lib/namespace-v1";
import { SqsQueue } from "@cdktf/provider-aws/lib/sqs-queue";
import { CloudwatchEventRule } from "@cdktf/provider-aws/lib/cloudwatch-event-rule";
import { CloudwatchEventTarget } from "@cdktf/provider-aws/lib/cloudwatch-event-target";
import { DataAwsCallerIdentity } from "@cdktf/provider-aws/lib/data-aws-caller-identity";
import { IamInstanceProfile } from "@cdktf/provider-aws/lib/iam-instance-profile";
import { ServiceAccountV1 } from "@cdktf/provider-kubernetes/lib/service-account-v1";
import { IamRolePolicy } from "@cdktf/provider-aws/lib/iam-role-policy";
import { Release } from "@cdktf/provider-helm/lib/release";
import { Manifest } from '../../.gen/providers/kubectl/manifest';
import { SecurityGroup } from "@cdktf/provider-aws/lib/security-group";
import { Resource } from "@cdktf/provider-null/lib/resource";

export class Karpenter {

public readonly nodeRole: IamRole;
public readonly fpRole: IamRole;

constructor(scope: TerraformStack, network: Network, cluster: Eks) {

// Build Karpenter Fargate Profile role
const assumeRolePolicy = new DataAwsIamPolicyDocument(
scope,
`eks-fargate-assumeRolePolicy`,
{
statement: [
{
effect: 'Allow',
actions: ['sts:AssumeRole'],
principals: [
{
identifiers: [
'eks-fargate-pods.amazonaws.com',
],
type: 'Service',
},
],
},
],
}
);

this.fpRole = new IamRole(scope, `eks-karpenter-fp`, {
name: `eks-karpenter-fp-role`,
assumeRolePolicy: assumeRolePolicy.json,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-AmazonEKSFargatePodExecutionRolePolicy`, {
role: this.fpRole.name,
policyArn: `arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy`,
});

const namespace = new NamespaceV1(scope, 'karpenter-namespace', {
metadata: {
name: 'karpenter',
labels: {
name: 'karpenter',
},
},
});

// EksFargateProfile
const profile = new EksFargateProfile(scope, 'karpenter-fp', {
clusterName: cluster.cluster.name,
fargateProfileName: 'karpenter-fp',
podExecutionRoleArn: this.fpRole.arn,
subnetIds: network.privateSubnetIds,
selector: [
{
namespace: namespace.metadata.name,
},
],
});

// Node role
const assumeRolePolicyNodes = new DataAwsIamPolicyDocument(
scope,
`eks-node-assumeRolePolicy`,
{
statement: [
{
effect: 'Allow',
actions: ['sts:AssumeRole'],
principals: [
{
identifiers: [
'ec2.amazonaws.com',
],
type: 'Service',
},
],
},
],
}
);

this.nodeRole = new IamRole(scope, `eks-karpenter-node-role`, {
name: `eks-karpenter-node-role`,
assumeRolePolicy: assumeRolePolicyNodes.json,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-AmazonEKSWorkerNodePolicy`, {
role: this.nodeRole.name,
policyArn: `arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy`,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-AmazonEKS_CNI_Policy`, {
role: this.nodeRole.name,
policyArn: `arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy`,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-AmazonEC2ContainerRegistryReadOnly`, {
role: this.nodeRole.name,
policyArn: `arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly`,
});

new IamRolePolicyAttachment(scope, `eks-master-policy-attachment-CloudWatchAgentServerPolicy`, {
role: this.nodeRole.name,
policyArn: `arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy`,
});

new IamInstanceProfile(scope, 'node-iam-instance-profile', {
name: `dev-eks-node-role`,
role: this.nodeRole.name,
});

const nodeSg = new SecurityGroup(scope, `eks-node-security-group`, {
name: `dev-eks-node-security-group`,
vpcId: network.vpcId,
description: `eks node security group`,
ingress: [
{
fromPort: 0,
toPort: 0,
protocol: '-1',
selfAttribute: true,
},
{
fromPort: 0,
toPort: 0,
protocol: '-1',
securityGroups: [cluster.securityGroup.id]
}
],
egress: [
{
fromPort: 0,
toPort: 0,
protocol: '-1',
cidrBlocks: ['0.0.0.0/0'],
},
],
tags: {
Name: `dev-eks-node-security-group`,
},
lifecycle: {
ignoreChanges: ['ingress', 'egress'],
},
});

// SQS and Events for node termination handler
const queue = new SqsQueue(
scope,
'karpenter-sqs-queue',
{
name: `${cluster.cluster.name}-karpenter`,
messageRetentionSeconds: 300,
},
);

const eventRules = [
{
name: 'SpotTermRule',
eventPattern: Fn.jsonencode({
source: ['aws.ec2'],
'detail-type': ['EC2 Spot Instance Interruption Warning'],
}),
},
{
name: 'RebalanceRule',
eventPattern: Fn.jsonencode({
source: ['aws.ec2'],
'detail-type': ['EC2 Instance Rebalance Recommendation'],
}),
},
{
name: 'InstanceStateChangeRule',
eventPattern: Fn.jsonencode({
source: ['aws.ec2'],
'detail-type': ['EC2 Instance State-change Notification'],
}),
},
{
name: 'ScheduledChangeRule',
eventPattern: Fn.jsonencode({
source: ['aws.health'],
'detail-type': ['AWS Health Event'],
}),
},
];

eventRules.forEach((e) => {
const eventRule = new CloudwatchEventRule(
scope,
`${e.name}-er`,
{
name: `${cluster.cluster.name}-${e.name}`,
eventPattern: e.eventPattern,
},
);
new CloudwatchEventTarget(scope, `${e.name}-et`, {
rule: eventRule.id,
arn: queue.arn,
});
});

// IAM permissions for Karpenter
const account = new DataAwsCallerIdentity(scope, 'account', {});
const karpenterPolicy = new DataAwsIamPolicyDocument(scope, 'karpenter-policy', {
statement: [
{
sid: 'AllowInterruptionQueueActions',
effect: 'Allow',
actions: ['sqs:DeleteMessage', 'sqs:GetQueueUrl', 'sqs:GetQueueAttributes', 'sqs:ReceiveMessage'],
resources: [queue.arn],
},
{
sid: 'AllowScopedEC2InstanceActions',
effect: 'Allow',
resources: [
`arn:aws:ec2:us-east-1::image/*`,
`arn:aws:ec2:us-east-1::snapshot/*`,
`arn:aws:ec2:us-east-1:*:spot-instances-request/*`,
`arn:aws:ec2:us-east-1:*:security-group/*`,
`arn:aws:ec2:us-east-1:*:subnet/*`,
`arn:aws:ec2:us-east-1:*:launch-template/*`,
],
actions: ['ec2:RunInstances', 'ec2:CreateFleet'],
},
{
sid: 'AllowScopedEC2InstanceActionsWithTags',
effect: 'Allow',
resources: [
`arn:aws:ec2:us-east-1:*:fleet/*`,
`arn:aws:ec2:us-east-1:*:instance/*`,
`arn:aws:ec2:us-east-1:*:volume/*`,
`arn:aws:ec2:us-east-1:*:network-interface/*`,
`arn:aws:ec2:us-east-1:*:launch-template/*`,
`arn:aws:ec2:us-east-1:*:spot-instances-request/*`,
],
actions: ['ec2:RunInstances', 'ec2:CreateFleet', 'ec2:CreateLaunchTemplate'],
condition: [
{
test: 'StringEquals',
variable: `aws:RequestTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringLike',
variable: 'aws:RequestTag/karpenter.sh/nodepool',
values: ['*'],
},
],
},
{
sid: 'AllowScopedResourceCreationTagging',
effect: 'Allow',
resources: [
`arn:aws:ec2:us-east-1:*:fleet/*`,
`arn:aws:ec2:us-east-1:*:instance/*`,
`arn:aws:ec2:us-east-1:*:volume/*`,
`arn:aws:ec2:us-east-1:*:network-interface/*`,
`arn:aws:ec2:us-east-1:*:launch-template/*`,
`arn:aws:ec2:us-east-1:*:spot-instances-request/*`,
],
actions: ['ec2:CreateTags'],
condition: [
{
test: 'StringEquals',
variable: `aws:RequestTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'ec2:CreateAction',
values: ['RunInstances', 'CreateFleet', 'CreateLaunchTemplate'],
},
{
test: 'StringLike',
variable: 'aws:RequestTag/karpenter.sh/nodepool',
values: ['*'],
},
],
},
{
sid: 'AllowScopedResourceTagging',
effect: 'Allow',
resources: [`arn:aws:ec2:us-east-1:*:instance/*`],
actions: ['ec2:CreateTags'],
condition: [
{
test: 'StringEquals',
variable: `aws:ResourceTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringLike',
variable: 'aws:ResourceTag/karpenter.sh/nodepool',
values: ['*'],
},
{
test: 'ForAllValues:StringEquals',
variable: 'aws:TagKeys',
values: ['karpenter.sh/nodeclaim', 'Name'],
},
],
},
{
sid: 'AllowScopedDeletion',
effect: 'Allow',
resources: [
`arn:aws:ec2:us-east-1:*:instance/*`,
`arn:aws:ec2:us-east-1:*:launch-template/*`,
],
actions: ['ec2:TerminateInstances', 'ec2:DeleteLaunchTemplate'],
condition: [
{
test: 'StringEquals',
variable: `aws:ResourceTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringLike',
variable: 'aws:ResourceTag/karpenter.sh/nodepool',
values: ['*'],
},
],
},
{
sid: 'AllowRegionalReadActions',
effect: 'Allow',
resources: ['*'],
actions: [
'ec2:DescribeAvailabilityZones',
'ec2:DescribeImages',
'ec2:DescribeInstances',
'ec2:DescribeInstanceTypeOfferings',
'ec2:DescribeInstanceTypes',
'ec2:DescribeLaunchTemplates',
'ec2:DescribeSecurityGroups',
'ec2:DescribeSpotPriceHistory',
'ec2:DescribeSubnets',
],
condition: [
{
test: 'StringEquals',
variable: 'aws:RequestedRegion',
values: ['us-east-1'],
},
],
},
{
sid: 'AllowSSMReadActions',
effect: 'Allow',
resources: [`arn:aws:ssm:us-east-1::parameter/aws/service/*`],
actions: ['ssm:GetParameter'],
},
{
sid: `AllowPricingReadActions`,
effect: 'Allow',
resources: ['*'],
actions: ['pricing:GetProducts'],
},
{
sid: 'AllowPassingInstanceRole',
effect: 'Allow',
resources: [this.nodeRole.arn],
actions: ['iam:PassRole'],
condition: [
{
test: 'StringEquals',
variable: 'iam:PassedToService',
values: ['ec2.amazonaws.com'],
},
],
},
{
sid: 'AllowScopedInstanceProfileCreationActions',
effect: 'Allow',
resources: ['*'],
actions: ['iam:CreateInstanceProfile'],
condition: [
{
test: 'StringEquals',
variable: `aws:RequestTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'aws:RequestTag/topology.kubernetes.io/region',
values: ['us-east-1'],
},
{
test: 'StringLike',
variable: 'aws:RequestTag/karpenter.k8s.aws/ec2nodeclass',
values: ['*'],
},
],
},
{
sid: 'AllowScopedInstanceProfileTagActions',
effect: 'Allow',
resources: ['*'],
actions: ['iam:TagInstanceProfile'],
condition: [
{
test: 'StringEquals',
variable: `aws:ResourceTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'aws:ResourceTag/topology.kubernetes.io/region',
values: ['us-east-1'],
},
{
test: 'StringEquals',
variable: `aws:RequestTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'aws:RequestTag/topology.kubernetes.io/region',
values: ['us-east-1'],
},
{
test: 'StringLike',
variable: 'aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass',
values: ['*'],
},
{
test: 'StringLike',
variable: 'aws:RequestTag/karpenter.k8s.aws/ec2nodeclass',
values: ['*'],
},
],
},
{
sid: 'AllowScopedInstanceProfileActions',
effect: 'Allow',
resources: ['*'],
actions: [
'iam:AddRoleToInstanceProfile',
'iam:RemoveRoleFromInstanceProfile',
'iam:DeleteInstanceProfile',
],
condition: [
{
test: 'StringEquals',
variable: `aws:ResourceTag/kubernetes.io/cluster/${cluster.cluster.name}`,
values: ['owned'],
},
{
test: 'StringEquals',
variable: 'aws:ResourceTag/topology.kubernetes.io/region',
values: ['us-east-1'],
},
{
test: 'StringLike',
variable: 'aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass',
values: ['*'],
},
],
},
{
sid: 'AllowInstanceProfileReadActions',
effect: 'Allow',
resources: ['*'],
actions: ['iam:GetInstanceProfile'],
},
{
sid: 'AllowAPIServerEndpointDiscovery',
effect: 'Allow',
resources: [
`arn:aws:eks:us-east-1:${account.accountId}:cluster/${cluster.cluster.name}`,
],
actions: ['eks:DescribeCluster'],
},
]
});

// Build Karpenter IAM role and service account. Attach the karpenter policy to the role.
const eksOidcIssuerUrl = cluster.oidcProvider.url;
const eksOidcProviderArn = `arn:aws:iam::${account.accountId}:oidc-provider/${eksOidcIssuerUrl}`;

const assumeRolePolicyKarpenter = new DataAwsIamPolicyDocument(
scope,
`karpenter-arp`,
{
statement: [
{
actions: ['sts:AssumeRoleWithWebIdentity'],
principals: [
{
type: 'Federated',
identifiers: [eksOidcProviderArn],
},
],
condition: [
{
test: 'StringEquals',
variable: `${eksOidcIssuerUrl}:sub`,
values: [
`system:serviceaccount:${namespace.metadata.name}:karpenter`,
],
},
],
},
],
},
);

const karpenterRole = new IamRole(scope, `eks-karpenter-role`, {
name: `dev-eks-karpenter-role`,
assumeRolePolicy: assumeRolePolicyKarpenter.json,
tags: {
Name: `dev-eks-karpenter-role`,
},
});

new IamRolePolicy(scope, 'karpenter-policy-attachment', {
role: karpenterRole.name,
policy: karpenterPolicy.json,
name: 'karpenter-policy',
});

const serviceAccount = new ServiceAccountV1(
scope,
`karpenter-service-account`,
{
metadata: {
name: 'karpenter',
namespace: namespace.metadata.name,
annotations: {
'eks.amazonaws.com/role-arn': karpenterRole.arn,
},
},
},
);

// Deploy Karpenter and CRD
const version = 'v0.33.1';
const repository = 'oci://public.ecr.aws/karpenter';

const crdRelease = new Release(scope, 'karpenter-crd', {
namespace: namespace.metadata.name,
createNamespace: false,
name: `${serviceAccount.metadata.name}-crd`,
repository,
chart: 'karpenter-crd',
description: 'Karpenter CRD Helm Chart',
version,
});

const release = new Release(scope, 'karpenter', {
namespace: namespace.metadata.name,
createNamespace: false,
name: serviceAccount.metadata.name,
repository,
chart: 'karpenter',
description: 'Karpenter Helm Chart',
version,
set: [
{
name: 'serviceAccount.create',
value: 'false',
},
{
name: 'serviceAccount.name',
value: serviceAccount.metadata.name,
},
{
name: 'settings.clusterName',
value: cluster.cluster.name,
},
{
name: 'settings.clusterEndpoint',
value: cluster.cluster.endpoint,
},
{
name: 'settings.interruptionQueue',
value: queue.name,
},
],
dependsOn: [profile, crdRelease],
timeout: 300,
});

const wait = new Resource(scope, 'wait-for-karpenter', {
dependsOn: [release],
});
wait.addOverride('provisioner', [
{
'local-exec': [
{
command: 'sleep 90',
interpreter: ['/bin/sh', '-c'],
},
],
},
]);

const nodeClass = new Manifest(scope, `node-class-linux`, {
yamlBody: Fn.yamlencode({
apiVersion: 'karpenter.k8s.aws/v1beta1',
kind: 'EC2NodeClass',
metadata: {
name: `linux`,
},
spec: {
amiFamily: 'AL2',
subnetSelectorTerms: network.privateSubnetIds.map((subnetId) => ({
id: subnetId,
})),
securityGroupSelectorTerms: [
{
id: nodeSg.id,
},
],
blockDeviceMappings: [
{
deviceName: '/dev/xvda',
ebs: {
volumeSize: '50Gi',
volumeType: 'gp3',
encrypted: true,
deleteOnTermination: true,
},
},
],
role: this.nodeRole.name,
tags: {
os: 'linux',
},
detailedMonitoring: false,
},
}),
dependsOn: [wait],
});

new Manifest(scope, `node-pool-linux`, {
yamlBody: Fn.yamlencode({
apiVersion: 'karpenter.sh/v1beta1',
kind: 'NodePool',
metadata: {
name: `linux`,
},
spec: {
template: {
spec: {
nodeClassRef: {
name: `linux`,
apiVersion: 'karpenter.k8s.aws/v1beta1',
kind: 'EC2NodeClass',
},
requirements: [
{
key: 'karpenter.sh/capacity-type',
operator: 'In',
values: ['on-demand'],
},
{
key: 'kubernetes.io/os',
operator: 'In',
values: ['linux'],
},
{
key: 'kubernetes.io/arch',
operator: 'In',
values: ['amd64'],
},
{
key: 'node.kubernetes.io/instance-type',
operator: 'In',
values: ['t3.medium', 't3a.medium'],
}
],
},
},
limits: {
cpu: '20',
memory: '100Gi',
},
disruption: {
consolidationPolicy: 'WhenUnderutilized',
expireAfter: '168h',
},
},
}),
dependsOn: [nodeClass],
});

}
}

Now you should be able to run cdktf deploy “*” and wait for everything to spin up. By the end you should see that 2 Fargate nodes were launched for the Karpenter pods and that at least 1 EC2 instance was launched to support the existing pods in the cluster.

You can now apply other deployments to the cluster and watch Karpenter launch more nodes to support the additional workloads.
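
To sanity-check the install, commands along these lines should work once the deploy finishes (the namespace and resource names assume the chart values used above):

kubectl get pods -n karpenter
kubectl logs -n karpenter deployment/karpenter
kubectl get ec2nodeclasses,nodepools
kubectl get nodeclaims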

Conclusion

We saw here how to install Karpenter in an EKS cluster using CDKTF. This is just a base implementation; many pieces of this setup should be abstracted and made reusable for additional addons that you might want in your cluster. I coded it this way purely for demonstration purposes. I hope this was helpful and that you have a better understanding of how to set up and run Karpenter.

Sidenote

If you run into an error in the Karpenter logs that says “The provided credentials do not have permission to create the service-linked role for EC2 Spot Instances.”, you need to run the following against your AWS account: “aws iam create-service-linked-role --aws-service-name spot.amazonaws.com”. This is only applicable if you want Karpenter to launch spot instances.
