Kubernetes Highly Available Cluster Upgrade

Published in

Platformer — A WSO2 Company

Jul 7, 2020

The upgrade method depends on how the cluster was set up. With a managed cluster, the service provider (GCP, AWS, Azure, etc.) takes care of cluster upgrades. In this scenario the cluster was set up with kubeadm, so kubeadm will be used to upgrade it as well.

Note this

Kubeadm upgrades must go from one minor version to the next. You cannot skip a minor version: upgrade 1.16 to 1.17, then 1.17 to 1.18. The steps below assume that the current version is 1.17 and that we are upgrading to the next minor version, 1.18.
The upgrade procedure should be followed one node at a time on a production cluster. Do not upgrade all nodes at once.
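The one-release-at-a-time rule above can be checked before starting an upgrade. The following is a minimal sketch and not part of kubeadm: is_supported_upgrade is a hypothetical helper name, and it assumes version strings of the form 1.17.4 or v1.17.4.

```shell
# Hypothetical helper (not a kubeadm command).
# Succeeds only when TARGET is in the same minor release as CURRENT or exactly
# one minor release ahead of it. Accepts versions such as 1.17.4 or v1.18.5.
is_supported_upgrade() (
    current=${1#v}                 # strip an optional leading "v"
    target=${2#v}
    cur_major=${current%%.*}       # text before the first dot
    tgt_major=${target%%.*}
    cur_rest=${current#*.}         # text after the first dot
    tgt_rest=${target#*.}
    cur_minor=${cur_rest%%.*}
    tgt_minor=${tgt_rest%%.*}
    # The body runs in a subshell, so "exit" only ends this function call.
    [ "$cur_major" -eq "$tgt_major" ] || exit 1
    step=$((tgt_minor - cur_minor))
    [ "$step" -eq 0 ] || [ "$step" -eq 1 ]
)
```

For example, is_supported_upgrade 1.16.11 1.18.5 fails because 1.17 would be skipped, while is_supported_upgrade 1.17.4 1.18.5 succeeds.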

Currently, all the Kubernetes nodes are running v1.16.11 and the final desired version is v1.18.x, so the procedure is carried out once per minor version (1.16 to 1.17, then 1.17 to 1.18). Before upgrading a node, ensure that no pods are running on it, because the node will be temporarily down during the upgrade. It is therefore recommended to drain the node, which evicts its pods so that they are rescheduled onto other running nodes.

Upgrading Kubeadm Master Nodes

Run all the commands as the root user.

First, find the latest stable patch release of the Kubernetes version we are going to upgrade to.
Update the package list and list the available versions using these commands.

apt update
apt-cache madison kubeadm
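The madison listing can be long, so it may help to filter it down to the newest patch release of the target series. This is only a sketch: latest_patch is a hypothetical helper name, it assumes the usual "package | version | source" layout of apt-cache madison output, and it relies on GNU sort for version ordering (sort -V).

```shell
# Hypothetical helper: reads `apt-cache madison kubeadm` output on stdin and
# prints the newest package version in the given minor series (e.g. 1.18).
latest_patch() {
    awk -F'|' -v series="$1" '
        { v = $2; gsub(/[ \t]/, "", v) }        # second column holds the package version
        index(v, series ".") == 1 { print v }   # keep versions that start with "<series>."
    ' | sort -V | tail -n 1
}

# Usage on a node (needs the Kubernetes apt repository to be configured):
#   apt-cache madison kubeadm | latest_patch 1.18
```

The printed value is the string to use after the equals sign in the apt-get install commands that follow.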

Update kubeadm on each master and worker node.

apt-mark unhold kubeadm && \
apt-get update && apt-get install -y kubeadm=1.18.5-00 && \
apt-mark hold kubeadm

Verify that the download works and has the expected version

kubeadm version

Drain the first Master Node (To Evict the workloads)

# replace <cp-node-name> with the name of your control plane node
kubectl drain <cp-node-name> --ignore-daemonsets

On the first master node run:

kubeadm upgrade plan

This command checks that your cluster can be upgraded, and fetches the versions you can upgrade to.

Choose a version to upgrade to and apply it:

# replace x with the patch version you picked for this upgrade
sudo kubeadm upgrade apply v1.18.x

After running the above command, you should get a message like this.

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.18.0". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.

Uncordon the control plane node:

# replace <cp-node-name> with the name of your control plane node
kubectl uncordon <cp-node-name>

Upgrade additional control plane nodes

Drain the next Master Node

# replace <cp-node-name> with the name of your control plane node
kubectl drain <cp-node-name> --ignore-daemonsets

Instead of running kubeadm upgrade apply, run this command.

kubeadm upgrade node

Uncordon the control plane node:

# replace <cp-node-name> with the name of your control plane node
kubectl uncordon <cp-node-name>

Do this for the other control plane nodes.

Upgrade kubelet and kubectl on all master nodes

# replace x in 1.18.x-00 with the latest patch version
apt-mark unhold kubelet kubectl && \
apt-get update && apt-get install -y kubelet=1.18.x-00 kubectl=1.18.x-00 && \
apt-mark hold kubelet kubectl

Restart the kubelet

sudo systemctl daemon-reload
sudo systemctl restart kubelet

Manually upgrade your CNI provider plugin.

Your CNI provider must be upgraded manually. The link below points to the v1.17 documentation. Change the version in the URL to match the version you are upgrading to (for example, v1-17 becomes v1-18).

https://v1-17.docs.kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Find your CNI provider and apply the related config file on any one of the master nodes. If the CNI is not running as a DaemonSet, you need to do this on all master nodes.

Upgrade worker nodes

Do not upgrade the worker nodes in parallel. Upgrade one node at a time.
Kubeadm was upgraded on all nodes of the cluster in the first step. If it was not, upgrade kubeadm on the remaining worker nodes.

# replace x in 1.18.x-00 with the latest patch version
apt-mark unhold kubeadm && \
apt-get update && apt-get install -y kubeadm=1.18.x-00 && \
apt-mark hold kubeadm

Drain the node

Prepare the node for maintenance by marking it unschedulable and evicting the workloads:

# replace <node-to-drain> with the name of the node you are draining
kubectl drain <node-to-drain> --ignore-daemonsets

You should see output similar to this:

WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-proxy-dj7d7, kube-system/weave-net-z65qx

Upgrade the kubelet configuration

kubeadm upgrade node

Upgrade kubelet on all worker nodes

apt-mark unhold kubelet && \
apt-get update && apt-get install -y kubelet=1.18.x-00 && \
apt-mark hold kubelet

Uncordon the node

Bring the node back online by marking it schedulable

# replace <node-to-drain> with the name of your node
kubectl uncordon <node-to-drain>
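The worker steps above can be strung together per node. The sketch below is one possible way to do that, not the procedure this article was tested with. It assumes kubectl runs on a machine with admin access to the cluster, that each node name is also reachable over SSH as root, and that kubeadm has already been upgraded on the node. The run and upgrade_worker names and the DRY_RUN switch are hypothetical. The kubelet restart mirrors the master-node step; the worker steps above do not list it explicitly.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Upgrade a single worker: drain it, upgrade it over SSH, then uncordon it.
# Stops at the first failing step so a broken node stays cordoned.
upgrade_worker() {
    node=$1
    version=$2   # apt package version, e.g. 1.18.5-00
    run kubectl drain "$node" --ignore-daemonsets || return 1
    run ssh "root@$node" "kubeadm upgrade node" || return 1
    run ssh "root@$node" "apt-mark unhold kubelet && apt-get update && apt-get install -y kubelet=$version && apt-mark hold kubelet" || return 1
    run ssh "root@$node" "systemctl daemon-reload && systemctl restart kubelet" || return 1
    run kubectl uncordon "$node"
}

# Example: preview the commands for one node without executing anything.
#   DRY_RUN=1; upgrade_worker worker-1 1.18.5-00
```

Calling upgrade_worker for one node at a time, and checking the node before moving on, keeps the sequence in line with the advice not to upgrade workers in parallel.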

Verify the status of the cluster

kubectl get nodes
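Reading the node list by eye works for a small cluster. For a larger one, a small filter can flag nodes that still need attention. This is a sketch under assumptions: check_nodes is a hypothetical helper name, and it assumes the default column order of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION).

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin.
# Prints name, status and version of every node that is not plain "Ready"
# (for example still cordoned) or whose version is outside the expected
# series, and returns non-zero if any such node was found.
check_nodes() {
    awk -v want="$1" '
        $2 != "Ready" || index($5, want ".") != 1 { print $1, $2, $5; bad = 1 }
        END { exit bad + 0 }
    '
}

# Usage:
#   kubectl get nodes --no-headers | check_nodes v1.18
```

A node that was drained but never uncordoned shows the status Ready,SchedulingDisabled, so it is reported as well.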