Inside Doctrine

Articles from the team behind Doctrine


Upgrading EKS to 1.25


TL;DR: We just completed our Kubernetes upgrade campaign on EKS to 1.25. If you want to reproduce it at home (or at work…), here are a few pieces of information that might be useful to you.

🎉 We have an open position for a DevOps/SRE Engineer; if you would like to become the engineer who upgrades to EKS 1.26, please apply for the job!

⏰ As a reminder, you cannot upgrade more than one minor version at a time on Kubernetes (even if you’re running on EKS!). This article therefore assumes you’re already running EKS 1.24.

Prerequisites BEFORE the upgrade

Read the following prerequisites carefully, as no rollback is available once an EKS upgrade has been completed.

1️⃣ AWS LoadBalancer

You must upgrade the AWS LoadBalancer Controller to 2.4.7 BEFORE starting the control plane upgrade to EKS 1.25. The AWS EKS documentation states it here.

Update the IAM policy attached to the LoadBalancer’s IAM Role
Upgrading the AWS LoadBalancer Controller also requires an update to its attached IAM role, as the Controller now operates by tagging some resources (AWS LoadBalancers).

The updated policy statement is available in the documentation. The changes only add statements to the previous version, so you can update the policy without risking any side effects.

Update the AWS LoadBalancer’s Helm chart
You may also want to update the Helm chart deploying the AWS LoadBalancer Controller. The latest chart version available for now still points to AWS LoadBalancer Controller 2.4.6, so override the image tag value in your Helm deployment.

2️⃣ Deprecations on Kubernetes 1.25

The exhaustive change list from 1.24 to 1.25 is available here.

Pay special attention to the “Urgent Upgrade Notes”, which contain the main changes likely to affect your workloads. Whatever Kubernetes upgrade you’re doing, reading them is always a good idea.

Here are the main changes that might affect you:

⚠️ PodDisruptionBudget and CronJob are no longer served by their v1beta1 APIs

You must migrate the apiVersion in your manifests to policy/v1 for your PodDisruptionBudgets and to batch/v1 for your CronJobs, as these resources are no longer served by policy/v1beta1 and batch/v1beta1 respectively.
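As a sketch, the migrated manifests look like this (names and specs below are examples, not from our stack):

```yaml
# PodDisruptionBudget served by the stable API (was policy/v1beta1)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb          # example name
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app
---
# CronJob served by the stable API (was batch/v1beta1)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cron             # example name
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: job
              image: busybox
              command: ["true"]
```

Only the apiVersion line changes for a PodDisruptionBudget; the spec itself is unchanged between v1beta1 and v1.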

How to find the affected PodDisruptionBudgets and CronJobs that require an upgrade?
Unfortunately, there is no way to know which endpoint (deprecated or not) was used to create a Kubernetes resource, due to how the Kubernetes API stores objects.

Even if an object was created through a beta API endpoint, it will always be listed under the latest stable API endpoint (v1) when retrieved. More explanations are available here.

So you must go through all your Kubernetes manifests and Helm charts (custom or third-party) and make sure the deprecated endpoints (v1beta1) are no longer used.
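A simple grep over your repositories is enough to surface the offending manifests. A minimal sketch (MANIFEST_DIR is an assumption; point it at your manifests and charts):

```shell
# Scan a directory tree for apiVersions removed in Kubernetes 1.25
MANIFEST_DIR="${MANIFEST_DIR:-.}"
grep -rnE 'apiVersion: *(policy/v1beta1|batch/v1beta1)' "$MANIFEST_DIR" \
  || echo "No deprecated apiVersions found"
```

Remember to also render templated Helm charts (helm template) before grepping, as the apiVersion may be computed at render time.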

Specific case for Terraform-managed Kubernetes resources
The Terraform resource kubernetes_pod_disruption_budget uses the deprecated endpoint. You first need to migrate the resources in your state to kubernetes_pod_disruption_budget_v1.
The same process applies to kubernetes_cron_job, which needs to be migrated to kubernetes_cron_job_v1.

A Terraform state move is not an option, as the object types differ. You have to remove the legacy resources from the state before re-importing them as the _v1 versions.

The Terraform import is not documented in the provider documentation; however, it still works, as follows:

terraform import module.commons.kubernetes_pod_disruption_budget_v1.datadog-kube-state-metrics kube-system/datadog-kube-state-metrics
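Putting the two steps together, here is a sketch of the full state migration for one resource. The commands are printed rather than executed so they can be reviewed first; the resource addresses mirror the example above and must be adapted to your stack:

```shell
# Print the "state rm" + "import" pair needed to migrate one resource
# from the legacy type to its _v1 counterpart.
migrate_state() {
  old_addr="$1"; new_addr="$2"; k8s_id="$3"
  echo "terraform state rm $old_addr"
  echo "terraform import $new_addr $k8s_id"
}

migrate_state \
  "module.commons.kubernetes_pod_disruption_budget.datadog-kube-state-metrics" \
  "module.commons.kubernetes_pod_disruption_budget_v1.datadog-kube-state-metrics" \
  "kube-system/datadog-kube-state-metrics"
```

Run a terraform plan afterwards and check that the imported _v1 resources show no diff before moving on.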

⚠️ Mandatory PodSecurityPolicy removal

PodSecurityPolicy support is removed in favor of Pod Security Admission. You must not have any PodSecurityPolicy remaining in your EKS cluster before starting the EKS 1.25 upgrade.

How to find any remaining PodSecurityPolicy in my cluster?
You can list the existing PodSecurityPolicies (they are cluster-scoped) on your EKS cluster with:

kubectl get psp

Your kubectl client should also output a friendly deprecation warning.

Please note that the existing default PodSecurityPolicy (`eks.privileged`) will remain. You can leave it as is; it will simply disappear with the EKS 1.25 upgrade.
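To quickly spot the PSPs that actually need cleanup, you can filter the default one out of the list. A small sketch (in practice, pipe in the output of `kubectl get psp -o custom-columns=:metadata.name`):

```shell
# Filter out the default eks.privileged PSP from a list of PSP names,
# leaving only the ones you must remove before the upgrade.
non_default_psps() { grep -v '^eks.privileged$' || true; }

printf 'eks.privileged\nmy-legacy-psp\n' | non_default_psps   # → my-legacy-psp
```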

You can find a FAQ related to the PodSecurityPolicy deprecation on the Amazon EKS documentation.

❗️ 1.23 ≤ EKS addon Kube-Proxy < 1.25

You cannot have more than two minor versions of difference between Kube-Proxy and the Kubernetes control plane, as stated in the EKS documentation.

The Kube-Proxy EKS addon MUST be at version 1.23 or 1.24 before deploying the update. And don’t try to cheat: you are not allowed to run a newer Kube-Proxy version than the EKS control plane.

➡ Other EKS Addons

There is no hard requirement to upgrade the EKS addons before updating the control plane to 1.25, although it’s good practice to update these add-ons regularly.

Please note that the VPC-CNI EKS Addon cannot be upgraded more than one minor version at a time (the same way as the Kubernetes Control Plane).

The EKS Upgrade to 1.25

🎳 Upgrade the EKS Control Plane to 1.25

Example of a live node migration

We upgraded the EKS control plane via our Terraform stack (we don’t apply any kind of change through the AWS Console). The entire upgrade went well: nodes were progressively migrated to a new AMI (new nodes are spawned, older nodes are tainted NotReady,SchedulingDisabled, and pods are gently evicted to the new nodes).
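We haven’t reproduced our actual Terraform stack here, but assuming a plain aws_eks_cluster resource (names and variables below are examples), the change boils down to bumping the version attribute:

```hcl
# Sketch: bump the EKS control plane version in Terraform
resource "aws_eks_cluster" "main" {
  name     = "my-cluster"
  role_arn = aws_iam_role.cluster.arn   # example reference
  version  = "1.25"                     # was "1.24"

  vpc_config {
    subnet_ids = var.private_subnet_ids # example variable
  }
}
```

Managed node groups that track the cluster version will then roll their AMIs as described above.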

The EKS control plane version is 1.25.6.

Once you’ve completed the EKS Control Plane upgrade, there is still some remaining work to achieve.

🚀 Upgrade the Kubernetes Cluster Autoscaler to 1.25

It’s highly recommended to keep the Cluster Autoscaler minor version in sync with the control plane minor version, so we have to migrate the Cluster Autoscaler to 1.25.

Here are the Cluster Autoscaler release notes for 1.25. Among them, a few cool things:

  • Support for the latest EC2 7th generation, Graviton-based instances
  • Support for EC2 6th generation, Intel-based instances

Upgrading the Cluster Autoscaler
Pretty straightforward: update the Helm chart and force the appVersion to 1.25 if required. Deploy, and that’s it.
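A sketch of the upgrade command, assuming the official kubernetes/autoscaler Helm chart; the chart version below is an assumption, so pick the one whose appVersion matches 1.25.x. The command is printed rather than executed so you can review it first:

```shell
# Build and print the Helm upgrade command for the Cluster Autoscaler
CHART_VERSION="9.28.0"   # assumption: use the chart release matching appVersion 1.25.x
cmd="helm upgrade cluster-autoscaler autoscaler/cluster-autoscaler --namespace kube-system --version $CHART_VERSION --reuse-values"
echo "$cmd"
```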

When the Cluster Autoscaler boots up, ensure the leader election succeeds.

➡️ Upgrading the EKS Cluster Addons

The two EKS add-ons can be updated simultaneously; each upgrade should take no more than a few seconds to complete.

Please note that you should NEVER upgrade the EKS cluster addons at the same time as the control plane; do it separately.

CoreDNS 1.9

CoreDNS is now offered at version 1.9 by AWS EKS. The main change is the end of support for wildcard DNS queries: they are no longer served and will return an empty result.

This might affect you if you’re using a service-discovery mechanism that relies on such DNS queries.

Kube-Proxy 1.25

It’s recommended — but not mandatory — to upgrade Kube-Proxy to the same version as the Kubernetes cluster (1.25).

There is no AWS-specific information available yet about this new version.

💰 Upgrade your instance group to the latest generation

This upgrade is also a good opportunity to refresh the instance types backing your cluster nodes.
Here is an example of saving money and getting better performance on the same architecture (Intel x86_64).

Switching to ARM (Graviton-based) processors can have an even bigger impact. However, you must carefully ensure that all your workloads run correctly after a CPU architecture change.
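While you validate your workloads on ARM, you can pin architecture-sensitive ones to x86_64 nodes with a standard nodeSelector. A sketch of a Deployment snippet (the spec around it is elided):

```yaml
# Pin pods to x86_64 (amd64) nodes via the well-known architecture label
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64   # switch to arm64 once the workload is validated
```

This lets you run mixed amd64/arm64 node groups in the same cluster and move workloads over one by one.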

Wrap up

This upgrade went well on all our clusters. We are good to go for the next 14 months!

The upgrade procedure on the EKS-managed part is always a pleasure: it is reliable and stable…

… as long as you’ve met all the upgrade prerequisites …

… and with this article, you’ve got that covered!


Written by Ben Riou

Ben is a Senior Devops in Doctrine’s infrastructure team. In his free time, he enjoys hiking the scenic trails of Île-de-France, especially the GR1 route.
