eks upgrade: 1.25 to 1.26 .. to 1.27

John Zen
Jun 24, 2024


My upgrade from 1.25 to 1.26 follows a similar process to the upgrade from 1.24 to 1.25.

As mentioned in eks upgrade: 1.24 to 1.25, my eks cluster is set up as described in that post.

Further changes made:

  • use access entries for RBAC.
  • apply PodSecurityAdmission (PSA) to some initial namespaces and the karpenter namespace.
  • create EC2NodeClass and NodePool resources using kubernetes_manifest (see the sketch after this list).
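
For the last point, here is a minimal sketch of the two kubernetes_manifest resources, assuming Karpenter 0.37's v1beta1 APIs. The resource names, node role ("eks-node"), cluster name and discovery tags are placeholders, not my actual values.

# Sketch only: EC2NodeClass + NodePool managed from Terraform.
# Note: kubernetes_manifest needs the Karpenter CRDs to exist at plan time.
resource "kubernetes_manifest" "ec2nodeclass_default" {
  manifest = {
    apiVersion = "karpenter.k8s.aws/v1beta1"
    kind       = "EC2NodeClass"
    metadata   = { name = "default" }
    spec = {
      amiFamily = "AL2"
      role      = "eks-node" # node IAM role (assumed name)
      subnetSelectorTerms        = [{ tags = { "karpenter.sh/discovery" = "my-cluster" } }]
      securityGroupSelectorTerms = [{ tags = { "karpenter.sh/discovery" = "my-cluster" } }]
    }
  }
}

resource "kubernetes_manifest" "nodepool_default" {
  manifest = {
    apiVersion = "karpenter.sh/v1beta1"
    kind       = "NodePool"
    metadata   = { name = "default" }
    spec = {
      template = {
        spec = {
          nodeClassRef = { name = "default" }
          requirements = [
            {
              key      = "karpenter.sh/capacity-type"
              operator = "In"
              values   = ["on-demand"]
            }
          ]
        }
      }
      limits = { cpu = "100" } # illustrative limit
    }
  }
}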

Check deprecated or removed APIs using kubent and eks-insights

Running kubent found no issues.

Run

aws eks list-insights --region ap-southeast-2 --cluster-name my-cluster

aws eks describe-insight --region ap-southeast-2 --id ${insights-entry-id} --cluster-name my-cluster

One insight highlighted an HPA using outdated APIs. Upon examination, it had already been updated to the latest API version, i.e. a false detection.

Steps

Detect changes in code, statefile or resources

Run terraform plan with no code changes to detect drift and make sure the code, state file and resources are in sync.

Update terraform and provider versions

terraform {
  required_version = ">= 1.6.2"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.58.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.31.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "2.14.0"
    }
  }
}

I use terraform 1.6.2 as this is the last common version between Terraform and OpenTofu, so I can decide which way to go later without code changes.

Update terraform module versions

module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 20.19.0"
...
}
module "eks_blueprints_addons" {
source = "aws-ia/eks-blueprints-addons/aws"
version = "~> 1.16.3"
...
}

Update cluster addon versions

Update the cluster addon versions to the latest available while still on EKS 1.25. Once EKS is upgraded, higher versions may become available for update.

I encountered an issue updating the vpc-cni addon to its latest version, 1.18.2-eksbuild.1. Setting resolve_conflicts = "OVERWRITE" resolved the problem (see the sketch below).
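
For reference, a minimal sketch of the vpc-cni entry in cluster_addons at this stage, with the conflict override; the full, final map is shown further below.

cluster_addons = {
  vpc-cni = {
    addon_version     = "v1.18.2-eksbuild.1"
    # force the addon update past the conflicting field values
    resolve_conflicts = "OVERWRITE"
  }
  ...
}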

Update addons

Upgrade addons:

addons = {
  metrics_server = {
    chart_version = "3.12.1"
  }
  aws_load_balancer_controller = {
    chart_version = "1.8.1"
  }
  karpenter = {
    chart_version = "0.37.0"
  }
  karpenter_node_additional_policies = {}
}

Access entry

Add access entries for key roles:

  • build role
  • super-admin role
  • node role

access_entries = {
  build-iac = {
    principal_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/build-iac"

    policy_associations = {
      this = {
        policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
        access_scope = {
          type = "cluster"
        }
      }
    }
  }
  super-admin = {
    principal_arn = tolist(data.aws_iam_roles.superadmin.arns)[0]

    policy_associations = {
      this = {
        policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
        access_scope = {
          type = "cluster"
        }
      }
    }
  }
  managed-node-role = {
    principal_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/eks-node"
    type          = "EC2_LINUX"
  }
}

I import the aws_eks_access_entry and aws_eks_access_policy_association resources that were already created, e.g.

import {
  to = module.eks.aws_eks_access_entry.this["build-iac"]
  id = "eks-gitlabtest:arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/build-iac"
}

import {
  to = module.eks.aws_eks_access_policy_association.this["build-iac_this"]
  id = "eks-gitlabtest#arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/build-iac#arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
}

I then remove these entries from aws-auth.
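
How the remaining aws-auth entries are managed is outside the scope of this post. As a rough sketch only, if the ConfigMap were managed with the EKS module's aws-auth sub-module, removing the migrated roles would look something like this; the module path and inputs are assumptions based on the v20 module, not my exact code.

module "aws_auth" {
  source  = "terraform-aws-modules/eks/aws//modules/aws-auth"
  version = "~> 20.19.0"

  manage_aws_auth_configmap = true

  # build-iac, super-admin and the node role no longer appear here,
  # since they are now handled by access entries.
  aws_auth_roles = []
}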

PodSecurityAdmission (PSA)

I apply PSA labels to some of the initial namespaces and the karpenter namespace.

kubernetes_resource_labels = {
  # karpenter namespace
  "namespace_karpenter" = {
    api_version = "v1"
    kind        = "Namespace"
    name        = "karpenter"
    labels = {
      "pod-security.kubernetes.io/enforce"         = "baseline"
      "pod-security.kubernetes.io/enforce-version" = "v1.29"
      "pod-security.kubernetes.io/audit"           = "baseline"
      "pod-security.kubernetes.io/audit-version"   = "v1.29"
      "pod-security.kubernetes.io/warn"            = "baseline"
      "pod-security.kubernetes.io/warn-version"    = "v1.29"
    }
  }

  # label initial namespaces
  # exempt:
  # - kube-system
  # - kube-node-lease
  "namespace_default" = {
    api_version = "v1"
    kind        = "Namespace"
    name        = "default"
    labels = {
      "pod-security.kubernetes.io/enforce"         = "baseline"
      "pod-security.kubernetes.io/enforce-version" = "v1.29"
      "pod-security.kubernetes.io/audit"           = "baseline"
      "pod-security.kubernetes.io/audit-version"   = "v1.29"
      "pod-security.kubernetes.io/warn"            = "baseline"
      "pod-security.kubernetes.io/warn-version"    = "v1.29"
    }
  }

  "namespace_kube-public" = {
    api_version = "v1"
    kind        = "Namespace"
    name        = "kube-public"
    labels = {
      "pod-security.kubernetes.io/enforce"         = "baseline"
      "pod-security.kubernetes.io/enforce-version" = "v1.29"
      "pod-security.kubernetes.io/audit"           = "baseline"
      "pod-security.kubernetes.io/audit-version"   = "v1.29"
      "pod-security.kubernetes.io/warn"            = "baseline"
      "pod-security.kubernetes.io/warn-version"    = "v1.29"
    }
  }
}

Upgrade the cluster

This takes a looong time…
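
The upgrade itself is just bumping the cluster version on the eks module and applying, one minor version at a time. A sketch, assuming cluster_version is the module input in use:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.19.0"

  # bump one minor version per apply: 1.25 -> 1.26, then 1.26 -> 1.27
  cluster_version = "1.26"
  ...
}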

Update cluster addon versions

After the cluster version is upgraded, some cluster addons have additional, higher versions available to update to.

Final updated cluster addons:

cluster_addons = {
  kube-proxy = {
    addon_version     = "v1.27.12-eksbuild.5"
    resolve_conflicts = "OVERWRITE"
  }
  coredns = {
    addon_version     = "v1.10.1-eksbuild.11"
    resolve_conflicts = "OVERWRITE"
  }
  aws-ebs-csi-driver = {
    addon_version            = "v1.32.0-eksbuild.1"
    resolve_conflicts        = "OVERWRITE"
    service_account_role_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/AmazonEKS_EBS_CSI_DriverRole"
  }
  snapshot-controller = {
    addon_version     = "v8.0.0-eksbuild.1"
    resolve_conflicts = "OVERWRITE"
  }
  vpc-cni = {
    addon_version = "v1.18.2-eksbuild.1"
    preserve      = true
    # terraform not happy with PRESERVE
    resolve_conflicts        = "NONE"
    service_account_role_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/AmazonEKSVPCCNIRole"
    configuration_values = jsonencode({
      env = {
        AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG = "true"
        ENI_CONFIG_LABEL_DEF               = "failure-domain.beta.kubernetes.io/zone"
      }
    })
  }
}

Verify

  • Check karpenter by scaling up pods.
  • Check RBAC by accessing the console and running kubectl.
  • Check pod IP addresses vs node IP addresses for the secondary CIDRs.
  • Check the load balancer controller.
