3-way merge in werf: deploying to Kubernetes via Helm “on steroids”

Flant staff
Nov 26, 2019

Here is something we have long been waiting for: werf, our Open Source tool for building and deploying applications to Kubernetes, now supports applying changes using 3-way-merge patches! Also, it is now possible to adopt existing K8s resources into Helm releases without the need to recreate those resources.

In short, you can set WERF_THREE_WAY_MERGE_MODE=enabled and get a kubectl apply-like deployment process that is compatible with existing Helm 2-based installations (and even a little more).

But let’s start with some theory. What are 3-way-merge patches? How did they come about, and why are they so essential for CI/CD processes in a Kubernetes-based infrastructure? Later, we will discuss 3-way merge in werf, which modes are used by default, and how you can manage all of this.

What is a 3-way-merge patch?

We’ll start with the task of deploying resources described in the YAML manifests to Kubernetes.

The Kubernetes API provides the following basic commands for working with resources: create, patch, replace, and delete. Our task is to arrange easy-to-use continuous deployment of resources into a cluster using these commands. How can we do that?

Imperative kubectl commands

The basic approach to managing objects in Kubernetes involves so-called imperative kubectl commands to create, modify, and delete those objects. Simply put,

  • you can start a Deployment or a job with a kubectl run command:
kubectl run --generator=deployment/apps.v1 DEPLOYMENT_NAME --image=IMAGE
  • kubectl scale allows you to set the number of replicas:
kubectl scale --replicas=3 deployment/mysql
  • and so on.

This approach may seem convenient at first sight. However, it has a fundamental difficulty: it is hard to automate.

Such an approach isn’t easily compatible with storing configs alongside the application code. Nor is it compatible with the infrastructure-as-code (IaC) approach in general, or even with GitOps, the more modern alternative that is rapidly gaining popularity in the Kubernetes ecosystem. That is why this kubectl command-based approach has been abandoned.

Create, get, replace, and delete

The initial creation is very easy to implement: you just have to provide a manifest to kube-api, and the required resource will be created. You can store the YAML representation of the manifest in Git and use kubectl create -f manifest.yaml to create the resource.

The deletion is also easy: just pass the same manifest.yaml to kubectl delete -f manifest.yaml.

The replace operation allows you to replace the configuration of a resource with a new one without recreating the resource. This means that before making changes to a resource, you can request the current version with get, modify it, and update it with replace. The kube-apiserver supports optimistic locking: if an object has changed since the get operation, the replace operation will fail.

To store the configuration in Git and update it with replace, you have to get the current config, merge the Git config into it, and then execute the replace command. Normally, kubectl only allows you to use kubectl replace -f manifest.yaml, where manifest.yaml is a fully pre-prepared (merged, in our case) manifest that needs to be installed. It turns out that the user has to merge manifests manually (a non-trivial task, we should say).

It is also worth noting that while manifest.yaml is stored in Git, we cannot know in advance whether we have to create or update an object; this is a task for the user’s software.

Conclusion: can we organize continuous deployment with the create, replace, and delete commands while storing the infrastructure configuration in Git alongside the code, as part of a user-friendly CI/CD process?

As a matter of fact, yes, we can. To do this, you will have to implement manifests merging along with a wrapper which:

  • checks for the presence of an object in the cluster,
  • creates the object if it is missing, or updates it (via get, merge, and replace) if it already exists.

During the update, the wrapper has to take into account that the resource could have changed since the last get command, and it has to automatically retry the update in the case of an optimistic locking conflict (a minimal sketch of such a wrapper is given below).
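Here is such a sketch in shell, assuming the resource is described in manifest.yaml; the merge-manifests helper is hypothetical, since merging manifests is exactly the non-trivial part:

#!/usr/bin/env bash
# Minimal create-or-update wrapper sketch; manifest.yaml and the
# merge-manifests helper are hypothetical.
if ! kubectl get -f manifest.yaml >/dev/null 2>&1; then
  # the object doesn't exist yet: initial creation
  kubectl create -f manifest.yaml
  exit 0
fi
for attempt in 1 2 3; do
  # fetch the current state (it includes resourceVersion for optimistic locking)
  kubectl get -f manifest.yaml -o yaml > current.yaml
  # merge the Git config into the current state (the hard, non-trivial step)
  merge-manifests current.yaml manifest.yaml > merged.yaml
  # replace fails if the object has changed since our get; then we retry
  kubectl replace -f merged.yaml && exit 0
done
echo "update failed after 3 attempts due to optimistic locking conflicts" >&2
exit 1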

But why reinvent the wheel if the kube-apiserver already offers another great way of updating resources, the patch command? Patching relieves the user of some of the above difficulties.

Patch

And so, we have come to the long-awaited patches. Patching is the mainstream method of applying changes to Kubernetes objects. The patch command works the following way:

  • the kube-apiserver user has to send a patch in the JSON format and specify an object;
  • the apiserver then applies this patch to the current version of the object on its own.

Optimistic locking isn’t required in this case. This procedure is more declarative than replace (though, at first, it may seem the other way around).
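For example, a single field can be updated with a strategic merge patch, without fetching the object or handling conflicts (the mysql Deployment here is just an example name):

# set the replica count; the apiserver merges the patch into the current object
kubectl patch deployment mysql --patch '{"spec": {"replicas": 3}}'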

Therefore we use:

  • create — to create an object by using its Git manifest;
  • patch — to update an existing object so that it matches the Git manifest;
  • delete — to delete an object that is no longer described in Git.

Crucial point: to make it all a reality, you have to create the right patch!

2-way-merge as the way patches work in Helm 2

During the first release installation, Helm performs a create operation on chart resources.

When updating a release, Helm:

  • creates a patch by comparing versions of the resource in the previous and current charts,
  • applies this patch to the resource via the patch operation of the kube-apiserver.

We will call such a patch 2-way-merge because it is based on two manifests:

  • the manifest of the resource from the previous release,
  • the manifest of the resource from the current release (the one stored in Git).

When deleting, the delete operation in the kube-apiserver is used for resources that have been defined in the previous release but aren’t defined in the current one.

The 2-way-merge-patch approach has a problem: it leads to a desynchronization between the real state of the resource in the cluster and the manifest in Git.

An example of this problem

  • Suppose that we have some manifest in the chart in Git with an image field in the Deployment section set to ubuntu:18.04.
  • Then a user changes this field directly in the cluster (e.g., via kubectl edit) to ubuntu:19.04.
  • During the next deployment, Helm compares the versions of the manifest in the previous and the current charts. Both specify ubuntu:18.04, so no patch for this field is generated, and the cluster keeps running ubuntu:19.04.

In this case, our resource has lost synchronization with the manifest, as well as its declarativity.
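To observe this drift yourself, you can change the image directly in the cluster and then re-deploy the unchanged chart (the mysql Deployment, its mysql container, and the myapp release are hypothetical names):

# change the image directly in the cluster, bypassing Helm
kubectl set image deployment/mysql mysql=ubuntu:19.04
# re-deploy the same chart: the 2-way-merge patch between two identical
# chart versions is empty, so the cluster keeps running ubuntu:19.04
helm upgrade myapp ./chart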

What is a synchronized resource?

It is impossible to achieve absolute sameness between a manifest in the running cluster and a manifest in Git. The real manifest may contain service annotations/labels, supplementary containers, and other data that is dynamically added to (and deleted from) the resource by controllers. We cannot (and don’t want to) store this data in the Git repository. However, during the deployment, the fields that we have explicitly specified in Git have to take the appropriate values.

Here is a general rule for a synchronized resource: during the deployment, you can modify or delete only the fields that are explicitly specified in the manifest from the Git repository (or have been specified in the previous version and removed in the current one).

3-way-merge patch

The core idea of 3-way-merge patching is to generate a patch based on the last applied version of the manifest from the Git repository and the target version of the Git manifest, while taking into account the current version of the manifest from the running cluster. The final patch must obey the rule for a synchronized resource:

  • new fields that have been added to the target version are added (via the patch);
  • fields that exist in both the last applied version and the target version are set to the values from the target version;
  • fields that exist in the last applied version but are missing from the target version are removed.

This is the same principle kubectl apply uses for generating patches:

  • the last applied version of the manifest is stored in the kubectl.kubernetes.io/last-applied-configuration annotation of the object itself,
  • the target version is taken from the manifest file being applied,
  • the current version is requested from the running cluster.
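You can observe this mechanism with plain kubectl (the mysql Deployment is an example name):

# apply the manifest; kubectl saves it into the
# kubectl.kubernetes.io/last-applied-configuration annotation
kubectl apply -f manifest.yaml
# print the saved last applied configuration
kubectl apply view-last-applied deployment/mysql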

Now that we have taken care of the theory, it’s time to discuss the recent developments in werf.

Applying changes in werf

In the past, werf (like Helm 2) used 2-way-merge patches.

Repair patch

To switch to the new style of patches (3-way merge), we first implemented so-called repair patches.

When deploying, werf uses the standard 2-way-merge patch. However, it additionally generates a patch that would synchronize the real state of the resource with the state specified in Git (this patch is created using the synchronized resource rule described above).

In the case of desynchronization, the user receives a WARNING message (before the deployment completes) containing a patch that has to be applied to bring the resource to a synchronized state. This patch is also saved to the werf.io/repair-patch annotation. The user must apply the repair patch manually: werf won’t do it for them as a matter of principle.
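A minimal sketch of how one might inspect and apply such a patch, assuming a hypothetical myapp Deployment and that the annotation holds a strategic merge patch:

# view the resource and copy the patch stored in the werf.io/repair-patch annotation
kubectl get deployment myapp -o yaml
# apply the copied patch manually, e.g. after saving it to repair-patch.json
kubectl patch deployment myapp --patch "$(cat repair-patch.json)"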

The generation of repair patches is a temporary measure that allows you to test the creation of 3-way-merge patches (it does not automatically apply them). Currently, this mode of operation is enabled by default.

3-way-merge patches for new releases only

Starting on December 1, 2019, beta and alpha versions of werf will use full-fledged 3-way-merge patches by default to apply changes to new Helm releases deployed via werf. Existing releases will continue to use the 2-way-merge + repair patch approach.

You can enable this operating mode right away by setting WERF_THREE_WAY_MERGE_MODE=onlyNewReleases.

Note: This feature has evolved over several werf releases. It has been declared ready starting with v1.0.5-alpha.19 in the alpha channel, and with v1.0.4-beta.20 in the beta channel.

3-way-merge patches for all releases

Starting on December 15, 2019, beta and alpha versions of werf will use full-fledged 3-way-merge patches to apply changes to all releases.

You can enable this operating mode right away by setting WERF_THREE_WAY_MERGE_MODE=enabled.

What about the autoscaling of resources?

There are two types of autoscaling in Kubernetes: HPA (horizontal), and VPA (vertical).

HPA scales the number of pod replicas, while VPA allocates more (or less) resources to existing pods. Both the number of replicas and resource requirements are specified in the resource manifest (see spec.replicas or spec.containers[].resources.limits.cpu, spec.containers[].resources.limits.memory, etc).
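For instance, HPA can be enabled for a Deployment with a single imperative command (the mysql name is just an example):

# keep between 2 and 5 replicas, targeting 80% average CPU utilization
kubectl autoscale deployment mysql --min=2 --max=5 --cpu-percent=80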

But here is the problem: if a resource in the chart contains specific values for resources or replicas, and autoscaling is enabled for this resource, then each deployment will revert these values back to the ones specified in the chart’s manifest.

This problem has two solutions. The first is to avoid explicitly defining autoscaled values in the chart’s manifest. If this option isn’t suitable for some reason (e.g., because it is convenient to specify initial resource limits and the number of replicas in the chart), then werf provides the following annotations:

  • werf.io/set-replicas-only-on-creation=true
  • werf.io/set-resources-only-on-creation=true

With these annotations, werf will not reset the corresponding values with every deployment; it will only set them during the initial creation of the resource.
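A minimal sketch of such a manifest in the chart (the myapp names and values are hypothetical):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  annotations:
    "werf.io/set-replicas-only-on-creation": "true"
spec:
  replicas: 2  # used only on initial creation; HPA manages this field afterwards
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: ubuntu:18.04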

For more information, see the project documentation for HPA and VPA.

Disabling 3-way-merge patches

Currently, the user has the option to disable the new type of patches in werf by setting WERF_THREE_WAY_MERGE_MODE=disabled. However, starting on March 1, 2020, this option will be deprecated, and only 3-way-merge patches will be supported.

Adopting resources in werf

While working on applying changes with 3-way-merge patches, we also decided to implement adoption of existing cluster resources into a Helm release.

Helm 2 has a problem: you cannot add a resource that already exists in the cluster to the chart manifests without re-creating this resource from scratch (see issues #6031, #3275). We have taught werf to adopt existing resources into a release. To do so, you need to set an annotation on the current version of the resource in the running cluster (e.g., via kubectl edit):

"werf.io/allow-adoption-by-release": RELEASE_NAME

Now you have to describe the resource in the chart. The next time werf deploys the release with the relevant name, the existing resource will be adopted by the release and will remain under its control. Moreover, while adopting the resource, werf will bring its current state in the running cluster to the state described in the chart, using the same 3-way-merge patches and the synchronized resource rule.
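For illustration, the whole flow might look like this (the mydeploy Deployment and the myapp-production release are hypothetical names, and the werf deploy options are elided):

# 1. allow the existing resource to be adopted by the release
kubectl annotate deployment mydeploy werf.io/allow-adoption-by-release=myapp-production
# 2. describe the same Deployment in the chart templates
# 3. deploy the release as usual: werf adopts the resource and converges
#    its state to the chart via a 3-way-merge patch
werf deploy ...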

Note: The WERF_THREE_WAY_MERGE_MODE environment variable does not affect the adoption of resources. In the case of adoption, 3-way-merge patches are the only choice.

You may find additional details in the documentation.

Conclusions and plans

I hope this article has clarified some aspects of 3-way-merge patches and the reasoning behind their implementation. From a practical standpoint, their integration is another step towards improving the Helm-like deployment process. Now you can forget about the complexities of configuration synchronization that used to plague Helm 2. Meanwhile, we have added a new and useful feature: from now on, Helm releases can adopt already deployed Kubernetes resources.

There are still some problems with the Helm-like deployment process, such as the usage of Go templates. We will keep working on them.

Here you can find additional information on resource update and adoption methods.

Helm 3

Since the new major version of Helm, v3, was released just two weeks ago and also brings 3-way-merge patches, it deserves a special mention here. The new Helm version requires migrating existing installations to convert them into the new format of storing releases.

For its part, werf has already abandoned Tiller altogether, switched to 3-way-merge patches, and implemented many other useful features, all while staying compatible with existing Helm 2 installations (no migration scripts required). Hence, werf users can enjoy all the advantages of Helm 3 over Helm 2, even though werf hasn’t fully shifted to Helm 3 yet.

However, the transition of werf to the Helm 3 codebase is imminent and will happen shortly, most likely in werf 1.1 or werf 1.2 (the current version of werf is 1.0; additional details on the werf versioning scheme are available here). This also gives Helm 3 time to stabilize.

This article was originally written by our system developer Timofey Kirillov. Follow our blog to get new excellent content from Flant!
