Automating Software Versioning on Kubernetes

Robbert van der Gugten
Published in bigdatarepublic · Nov 24, 2020

The Kubernetes community is growing rapidly, with many cool new open source projects popping up all the time. The Kubernetes tag on GitHub currently matches 15,431 public repositories, and the Cloud Native Computing Foundation tracked over 1,200 projects in 2019, up from around 100 in 2016.

Open source projects usually have a short development cycle. While this is a good thing, it makes it harder to stay up to date with the latest versions, especially when deploying many projects.
Patching software on a regular basis is important to keep your cluster safe, and is often easier than having to apply a bunch of major changes at the same time. Patching can be tedious, especially without any automation. For many common programming languages, version management tools for library dependencies are available, such as the Maven Release Plugin and Scala Steward.
These tools continuously check all dependencies for new versions and make pull requests with version updates.

For Kubernetes applications, no such out-of-the-box solution currently exists.
In this blog post we will discuss a way of making it easier to keep your Kubernetes applications up to date.

Application Deployment Methods

If you work with Kubernetes for a while, there comes a time when you get lost in all the YAML files. Tools like Kustomize, Helm and Jsonnet help with managing deployment configuration. However, not all open source projects support all of these tools, which may leave you using a combination of them, or writing the configuration yourself.

At my current company we found ourselves in exactly that situation.
We deploy applications using Helm, Kustomize and Kubernetes Manifests.
We use Ansible to combine these tools into a single deployment pipeline.
Whether this is the right tool for the job is debatable, but it was the logical choice since we provision our own hardware and deploy Kubernetes on it.

Recently the GitOps movement has been gaining traction, which we have also experimented with. I could write a separate blog post on the struggles we faced rewriting our applications in a GitOpsified way (maybe I will do that later). In this blog post we will stick with the Ansible deployment, which does provide reproducibility and maintainability across all projects.
One advantage of Ansible is that all our project versions can be bundled together and used as variables in the deployment pipeline.
Such a version file looks like this:
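A minimal sketch, with a few popular projects and illustrative version numbers (the exact entries and versions are examples, not our full stack):

```yaml
# versions.yaml -- single source of truth for all deployed project versions,
# loaded as variables by the Ansible deployment pipeline.
jupyterlab: 2.2.9
traefik: 2.3.2
prometheus-operator: 9.3.2
```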

We will use this version file, which contains a couple of popular projects that we want to keep up to date, in the following examples.

Automatically checking for new versions

Before we automated patching, we followed these steps:

  • For each project, browse the releases page for the latest version.
  • For major releases, skim the release notes and take care of any breaking changes.
  • Modify the versions on the development/test version file and deploy to the development cluster.
  • Fix any issues, commit the changes and repeat for acceptance and/or production.

There are many manual steps involved in this process. Every programmer should get a tingling feeling when doing something similar multiple times: it is time to automate!
Unfortunately I have not found a tool that automates the whole process, which I imagine is very tricky given all the different ways of deploying apps.
One simple but efficient tool that takes care of the first step is Nvchecker.
It can check versions in standard repositories such as PyPI, npm and apt.
More importantly, it can check GitHub and GitLab projects, and it has a cmd option, which allows us to use the Helm CLI to look up the latest Helm chart.

Nvchecker requires a config file and a version file. Let’s create these for our software stack described earlier.

In the config file source.ini we use a couple of different methods:

  • pypi: the latest release of the Python package JupyterLab.
  • github: the latest release or highest tag of a GitHub repository. A regex can be used to skip alpha and beta releases.
  • cmd with helm: use the Helm CLI to find the latest chart release. This requires that the target repository has been added and that its index has been updated.
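A source.ini along these lines would cover the three methods. The section names match the sample version file above; the exact repositories, options and the jq-based helm command are illustrative and follow nvchecker's 1.x ini format:

```ini
[__config__]
oldver = current_versions.txt
newver = new_versions.txt

; latest release on PyPI
[jupyterlab]
pypi = jupyterlab

; latest GitHub release
[traefik]
github = traefik/traefik
use_latest_release = true

; latest chart version via the Helm CLI (requires the repo to be added)
[prometheus-operator]
cmd = helm search repo stable/prometheus-operator --output json | jq -r '.[0].version'
```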

Setting this up is a tedious task, but it will save you time later on.
It is important to keep the config file up to date when adding new projects; they are easily forgotten.

The version file looks very similar to our YAML version file, but unfortunately it is not quite the same. To prevent having to keep two version files up to date, we generate current_versions.txt from versions.yaml.
We also need to add and update our Helm repositories. We created a script that updates the Helm repositories, generates the version file and runs Nvchecker.
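A minimal sketch of such a script. The file names follow the examples above; for self-containment it writes a small sample versions.yaml first, and the helm and nvchecker steps are skipped when those binaries are not present:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sample versions.yaml, written here so the script is self-contained;
# in practice this file already lives in the repository.
cat > versions.yaml <<'EOF'
jupyterlab: 2.2.9
traefik: 2.3.2
prometheus-operator: 9.3.2
EOF

# Refresh the chart indexes so the helm-based checks see current releases
# (skipped when helm is not installed).
if command -v helm >/dev/null 2>&1; then
  helm repo update
fi

# Convert each "name: 1.2.3" line into nvchecker's space-separated
# "name 1.2.3" format by deleting the colon.
sed 's/://' versions.yaml > current_versions.txt

# Run the version check; nvchecker writes its findings to new_versions.txt
# (skipped when nvchecker or its config is absent).
if command -v nvchecker >/dev/null 2>&1 && [ -f source.ini ]; then
  nvchecker source.ini
fi
```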

The sed command takes all lines that denote a version and removes the colon, which yields the name-and-version format Nvchecker expects. The script gives us the following output:

It will only list projects that have a new release. A complete overview is generated in new_versions.txt. We can now update our versions.yaml and deploy these to the development/test cluster.

Further automation

Having a single version file and checking for new versions automatically helps a lot. However, there are still a lot of manual steps left that can be automated. Flux has already made some progress with automating Helm releases by defining a HelmRelease custom resource: the operator watches the release field and automatically deploys new releases when they become available. With the GitOps way of working, we can use version scanning to automatically deploy and test new versions in the cluster. The following is a theoretical automated release process, which I will work on in the future.

  • Set up an application version_manager in the dev environment that continuously checks for new software versions in our stack.
  • The version_manager creates a branch in the git repository with the version(s) updated.
  • The GitOps application will automatically deploy the new version(s).
  • The version_manager then runs some tests, and makes a merge request if all tests passed. It will create an alert and a WIP merge request if tests fail for the new version.
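To make the Flux approach mentioned above concrete, a (Flux v1) HelmRelease for one of our projects could look roughly like this; the chart repository and version number are illustrative, and the version_manager would only need to bump spec.chart.version in git:

```yaml
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: traefik
  namespace: kube-system
spec:
  releaseName: traefik
  chart:
    repository: https://helm.traefik.io/traefik
    name: traefik
    version: 9.11.0  # the field the version_manager updates
```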

Using this setup, most of the patching will be fully automated. For breaking changes, a new story can be created automatically on your favourite project board, so no update is ever forgotten.

Takeaways

  • Patching software is important for all development teams.
  • Application deployment on Kubernetes is not straightforward, which makes automated patching difficult.
  • We can simplify patching by using Nvchecker to automate version checking.
  • The GitOps way of working makes it possible to automate most of the patching, but can be difficult to set up.

About the author

Robbert van der Gugten is a machine learning engineer at BigData Republic, a data science consultancy company in the Netherlands. We hire the best of the best in Data Science and Data Engineering. If you are interested in using machine learning techniques on practical use cases, feel free to contact us at info@bigdatarepublic.nl.

Thanks to Ruurtjan Pul, Steven Reitsma and Annieke Hoekman for proofreading this post.
