Helm 102

Ansu K · Published in Geek Culture · 5 min read · Aug 4, 2022

Here are some steps you can implement in your Helm chart to make it more robust, once you have set up Kubernetes and created a Helm chart for your application deployment.

Wait for the deployment to succeed before marking the workflow as complete

When you run a Helm deployment (a deploy roll), you should make sure the deploy was successful. By default, Helm, just like Kubernetes, doesn’t check for that.

However, Helm offers the --wait flag that waits for the release to successfully deploy all resources.

if set, will wait until all Pods, PVCs, Services, and minimum number of Pods of a Deployment are in a ready state before marking the release as successful. It will wait for as long as --timeout
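
For example, an upgrade that waits up to five minutes for the release to become ready might look like this (the release and chart names are placeholders):

# Hypothetical release and chart; --wait blocks until resources are ready,
# --timeout bounds how long Helm waits before marking the release as failed
helm upgrade release1 ./mychart --wait --timeout 5m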

This flag has had multiple issues, most of which seem to have been resolved, but the two inconveniences below persist with the --wait flag (more detail can be found here):

  1. If the pod is “flapping” (it goes ready for a bit and then fails), Helm might pass the workflow on account of the brief window in which the replica was ready. That is a false positive in our case, and it usually happens when your application has a logical error that lets it run for a while before crashing.
  2. If you only have 1 replica and you’ve specified that 1 replica can be unavailable, then the deployment is technically “ready” if 1 pod is unavailable.

Solution

The solution to the second issue is pretty easy: either run more than one replica or set the max unavailable config to 0. However, setting max unavailable to 0 will hinder rolling updates, and so will having one replica, so the best way to resolve this is to keep a minimum of two replicas.
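
A minimal sketch of that recommendation in a Deployment spec (the value is illustrative, not prescriptive):

kind: Deployment
spec:
  replicas: 2   # with two replicas, one unavailable pod no longer counts as a “ready” deployment
  [...]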

For the first issue you want to make sure that Helm understands exactly when the pod is ready, rather than just relying on the pod status saying ‘Running’. You can achieve this by adding probes to your application. Below is a sample configuration; you can read further about probes in this official document.

kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: app   # container name and image omitted in the original
          livenessProbe:
            httpGet:
              path: /health
              port: 9000
            initialDelaySeconds: 3
            periodSeconds: 3
          readinessProbe:
            httpGet:
              path: /health
              port: 9000
            initialDelaySeconds: 3
            periodSeconds: 3

Roll back automatically if the deployment fails

Sometimes you ship bad code that crashes during deployment and your system fails. Such failed states can easily be rolled back manually to a stable Helm revision by using the helm rollback command below. This will get your pods back to a working state.

helm rollback <RELEASE> [REVISION] [flags]

For example: helm rollback release1 13

You can check the list of releases with the history command, or check the current release version with the status command.

helm status release1 -n default
helm history release1 -n default

However, you might not want to do that manually every time, and should instead integrate an automated system that gracefully rolls back the Helm release when a deployment fails. Helm offers the --atomic flag to do exactly that.

if set, upgrade process rolls back changes made in case of failed upgrade. The --wait flag will be set automatically if --atomic is used
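
In practice that just means adding the flag to your upgrade command, for example (release and chart names are again placeholders):

helm upgrade release1 ./mychart --atomic --timeout 5m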

Trigger deployment with every helm upgrade/roll

When a deployment is triggered in k8s, the latest/specified image is pulled and used for the pod deployment, but if that image tag is the same as the one the currently running pods are using, a new roll is not triggered.

This can be confused with the function of the Always pull policy defined in the deployment, as was raised in this thread.

Kubernetes is not watching for a new version of the image. The image pull policy specifies how to acquire the image to run the container. Always means it will try to pull a new version each time it's starting a container. To see the update you'd need to delete the Pod (not the Deployment) - the newly created Pod will run the new image.

There is no direct way to have Kubernetes automatically update running containers with new images.
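
For reference, the pull policy lives on the container spec; a minimal sketch (container name and image are placeholders):

kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: app
          image: myrepo/app:latest
          imagePullPolicy: Always   # pulls on every container (re)start, not on every helm upgrade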

To force-roll a deployment you can restart the rollout with the kubectl command below.

kubectl rollout restart deployment/<deployment-name> -n <namespace>

However, Helm provides a better way of resolving this: add a random string or timestamp as an annotation to your deployment, so that it always changes and causes the deployment to roll every time you run helm upgrade. You can refer to this thread or the official Helm documentation for further understanding.

kind: Deployment
spec:
  template:
    metadata:
      annotations:
        timestamp: {{ now | quote }}
    [...]

Each invocation of the template function will generate a unique random string. This means that if it’s necessary to sync the random strings used by multiple resources, all relevant resources will need to be in the same template file.
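
The variant from the Helm documentation uses a random string rather than a timestamp; the annotation key is arbitrary:

kind: Deployment
spec:
  template:
    metadata:
      annotations:
        rollme: {{ randAlphaNum 5 | quote }}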

Wait for a deployment to finish before running another upgrade

If you try to run the helm upgrade command on a Helm chart while another deployment is ongoing, Helm will throw the error below.

Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress

You should not be running multiple upgrades on the same chart, in order to maintain the consistency of the pods; but at the same time you do not want to sit around waiting for a deployment to finish before you begin yours. That sounds like a lot of unproductive minutes.

To handle this, you can add a check in your workflow that waits for the current deployment to finish and then automatically starts the new upgrade. Unfortunately, Helm does not provide any command that waits for a deployment to finish, but Kubernetes does. So while you cannot wait for the Helm rollout as a whole, you can wait for the specific Deployments that make up a sub-part of the Helm deploy.

The kubectl rollout status command will report whether a rollout has been successfully deployed, or keep printing status updates until it has. Below are a couple of examples of how the command behaves.

$ kubectl rollout status deploy stage-backend
deployment "stage-backend" successfully rolled out

$ kubectl rollout status deploy stage-backend
Waiting for deployment "stage-backend" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "stage-backend" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "stage-backend" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "stage-backend" rollout to finish: 3 of 4 updated replicas are available...
deployment "stage-backend" successfully rolled out

$ kubectl rollout status deploy stage-backend
Waiting for deployment "stage-backend" rollout to finish: 2 out of 4 new replicas have been updated...
error: deployment "stage-backend" exceeded its progress deadline

This approach can become unreliable at times, for example if you have multiple Deployments under a single Helm chart, or when the code crashes and Helm starts to roll back: technically the deploy roll hasn’t finished, but kubectl rollout status will exit with a failure before the rollback starts. However, you can work around this caveat by creating a script that continuously polls the status of the chart deployment using the helm status sub-command.

$ helm status release1 | grep STATUS | cut -d: -f2
deployed
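
A minimal polling sketch built on that one-liner (the release name and sleep interval are assumptions; adjust to your pipeline):

#!/bin/bash
# Poll the release status until it leaves the transient pending-* states
RELEASE="release1"
while true; do
  STATUS=$(helm status "$RELEASE" | grep STATUS | cut -d: -f2 | tr -d ' ')
  case "$STATUS" in
    pending-install|pending-upgrade|pending-rollback)
      echo "Release $RELEASE is $STATUS; waiting..."
      sleep 10
      ;;
    *)
      echo "Release $RELEASE is $STATUS"
      break
      ;;
  esac
done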

That being said, an easier solution to this problem is to not let multiple instances of the upgrade job run in your CI/CD pipeline. For example, if you use GitHub Actions you can use the concurrency keyword in your workflow to ensure the upgrade workflow only runs when no other run is in progress; if there is one, the new run is sent to a pending state and executed after the previous one finishes. You can also specify concurrency at the job level. For more information, see jobs.<job_id>.concurrency.
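
A minimal sketch of a workflow-level concurrency group (the workflow name, branch, and deploy step are placeholders; the runner is assumed to have helm available):

name: deploy
on:
  push:
    branches: [main]

concurrency:
  group: helm-deploy-${{ github.ref }}
  cancel-in-progress: false   # queue the new run instead of cancelling the in-progress one

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - run: helm upgrade release1 ./mychart --atomic --wait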

You can learn more about this feature from this official GitHub blog.
