Azure Kubernetes Service — why you should take care of your nodes

Nico Meisenzahl
Nov 12, 2019

Azure Kubernetes Service (AKS) is Azure's managed Kubernetes offering. In principle, this means that you don't have to look after the Kubernetes infrastructure and can concentrate on the apps you deploy to it. Unfortunately, that is not entirely true for your worker nodes, as the documentation mentions:

To protect your clusters, security updates are automatically applied to Linux nodes in AKS. These updates include OS security fixes or kernel updates. Some of these updates require a node reboot to complete the process. AKS doesn’t automatically reboot these Linux nodes to complete the update process.

As the documentation states, Azure installs the required updates and security patches on its own, but you have to decide when to restart your nodes if a restart is needed. This should be automated to make sure that all of your worker nodes are secure and up to date. The guide below does not cover Windows nodes. It is also only needed for the worker nodes, not for the master nodes.

Debian-based distributions such as Ubuntu, which AKS Linux nodes typically run, create a file called /var/run/reboot-required as soon as an installed patch requires a reboot. This means we can use the file as an indicator to find nodes that need a reboot. The easiest way to detect the file and to initiate the restart is an open-source project called kured (KUbernetes REboot Daemon) by Weaveworks.
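To make the indicator concrete, here is a minimal Python sketch of the check. It only illustrates the idea and is not how kured is implemented; the function name `reboot_required` and the constant `REBOOT_SENTINEL` are made up for this example.

```python
from pathlib import Path

# Sentinel file created on Debian/Ubuntu when an installed update needs a reboot.
REBOOT_SENTINEL = Path("/var/run/reboot-required")


def reboot_required(sentinel: Path = REBOOT_SENTINEL) -> bool:
    """Return True if the sentinel file exists, meaning a reboot is pending.

    The path is a parameter (hypothetical design choice) so the check can be
    tried against a temporary file instead of the real node filesystem.
    """
    return sentinel.exists()


if __name__ == "__main__":
    print("reboot pending" if reboot_required() else "no reboot pending")
```

Run on a node, this prints whether a reboot is pending. Kured performs an equivalent check from a Pod on every node.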

Kured runs as a DaemonSet, which schedules a Pod on every worker node to check whether the reboot-required file exists. The DaemonSet ensures that all nodes are checked, including newly created ones. As soon as kured finds a node that needs to be restarted, it schedules the restart based on the rules you define:

  • start and end hours (for example, only between 2 and 5 am)
  • defined days (for example, only on weekends)
  • blocking reboots while pods with certain labels are running
  • skipping reboots while Prometheus alerts are active
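The rules above can be pictured as a single yes/no decision. The following Python sketch is an assumption about how such a decision could look, not kured's actual code; `RebootPolicy`, `may_reboot`, and all parameter names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RebootPolicy:
    """Hypothetical container for the reboot window settings."""
    start_hour: int = 2   # window opens at 02:00
    end_hour: int = 5     # window closes at 05:00 (exclusive)
    # datetime.weekday() counts Monday as 0, so 5 and 6 are Saturday and Sunday.
    allowed_weekdays: frozenset = frozenset({5, 6})


def may_reboot(now: datetime, policy: RebootPolicy,
               blocking_pods_running: bool = False,
               alerts_firing: bool = False) -> bool:
    """Return True only if every rule allows a reboot at this moment."""
    if now.weekday() not in policy.allowed_weekdays:
        return False
    if not (policy.start_hour <= now.hour < policy.end_hour):
        return False
    # Labeled pods or active alerts veto the reboot even inside the window.
    return not (blocking_pods_running or alerts_firing)


if __name__ == "__main__":
    policy = RebootPolicy()
    saturday_night = datetime(2019, 11, 16, 3, 0)  # a Saturday, 03:00
    print(may_reboot(saturday_night, policy))
```

A reboot is allowed only when the day, the hour, the blocking pods, and the alerts all agree, so any single rule is enough to postpone it.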

Kured also exposes metrics that Prometheus can scrape, and it can send Slack notifications via a Slack hook.

How to start?

First of all, you need to create a Service Account and some Rules and Bindings to provide the required privileges (you can skip this part if RBAC is not enabled on your cluster, although running without RBAC is not recommended):

Now you can create the DaemonSet:

Based on the above example, kured automatically restarts my nodes on Saturdays and Sundays between 2 and 5 am whenever a reboot is needed. Kured also ensures that the entire workload is shifted to other nodes within the cluster before a worker node is restarted.
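To picture what shifting the workload amounts to, here is the rough manual equivalent as a short Python sketch: drain the node, reboot it, then make it schedulable again. This is an illustration under my own assumptions, not kured's code; the function `manual_reboot_plan` and the node name are hypothetical, and the sketch only builds the commands without running them.

```python
import shlex


def manual_reboot_plan(node: str) -> list:
    """Return the manual steps for safely rebooting one node, in order."""
    safe_node = shlex.quote(node)  # guard against unusual characters in the name
    return [
        # Evict the pods so the workload moves to other nodes; DaemonSet pods
        # stay in place, which is why the flag is needed.
        f"kubectl drain {safe_node} --ignore-daemonsets",
        # Placeholder: how you reboot depends on your access to the node.
        "# reboot the node here",
        # Allow the scheduler to place pods on the node again.
        f"kubectl uncordon {safe_node}",
    ]


if __name__ == "__main__":
    for step in manual_reboot_plan("my-worker-node"):
        print(step)
```

Doing this by hand for every node after every patch is exactly the chore that kured takes over.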

01001101

Stories related to DevOps topics by Nico Meisenzahl. 01001101? First char of my surname.

Written by Nico Meisenzahl

Senior Cloud & DevOps Consultant at white duck. Docker Community Leader, GitLab Hero, blogger & speaker. 👨‍💻🙋‍♂️ Loves Kubernetes, DevOps & Cloud.
