Helm’s --atomic Option for Rollback Leaves You in the Dark

Akash Jaiswal
Oct 27, 2023

Introduction:

The world of DevOps is an exciting journey full of exploration and problem-solving. As a DevOps enthusiast, I recently embarked on a new adventure into the realms of Kubernetes deployments. My trusty companion for this journey was Helm, the Kubernetes package manager. It promised seamless application deployments, and my initial experience was smooth sailing.

I harnessed the power of Helm charts to deploy my applications in a Kubernetes cluster, and I had a nifty command at my disposal:

helm upgrade --install -f values.yaml <release-name> . --atomic --timeout <timeout-sec>

This command made my deployments effortless and reliable.

The `--atomic` option acted like a safety net, ensuring that if anything went wrong during deployment, Helm would gracefully roll back to the previous version. It was a reassuring feature, much like a safety rope while climbing a steep mountain.

The Challenge: Unveiling the Darkness in Helm’s --atomic Option

Imagine you have built a safety mechanism into your CI/CD pipeline: with the --atomic option, if a Kubernetes pod stays in a crash loop until the 300-second timeout expires, Helm automatically rolls back to the previous version. This is an essential fail-safe. The problem is that the deployment itself has failed, yet in our pipeline the job could still be marked as successful once the rollback completed. This is confusing for users, who might not immediately realize that Helm executed a rollback because the deployment failed.

In our setup, the pipeline reported success because the Helm command, with its automatic rollback option, had run to completion. Without the --atomic option, the same failed deployment might have failed the pipeline. The pipeline knows nothing about Helm’s rollback feature; it only knows whether the command it ran exited successfully.
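To make the exit-status point concrete, here is a minimal sketch of how a CI runner typically judges a step. The `run_step` helper and the two `deploy_*` functions are hypothetical stand-ins written for this illustration; they are not part of Helm or of any CI system.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a deploy step that rolls back internally
# and then returns 0, so the failure is invisible to the caller.
deploy_with_hidden_rollback() {
  echo "upgrade failed, rolled back to previous revision"
  return 0
}

# Hypothetical stand-in for a deploy step that reports the failure.
deploy_with_visible_failure() {
  echo "upgrade failed, rolled back to previous revision"
  return 1
}

# Mimics how a CI runner judges a step: only the exit status matters,
# the text the command printed is ignored.
run_step() {
  if "$@" >/dev/null; then
    echo "step passed"
  else
    echo "step failed"
  fi
}

run_step deploy_with_hidden_rollback   # prints "step passed"
run_step deploy_with_visible_failure   # prints "step failed"
```

Both stand-ins print the same message, but only the one that returns a non-zero status makes the step fail. That is the behavior the rest of this post sets out to guarantee.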

Our Solution: Handling Rollbacks with Finesse

To tackle this issue, we’ve refined our Helm deployment script to enhance the pipeline’s clarity and user-friendliness. Here’s a breakdown of the improvements:

1. Deployment Command Optimization:

We dropped --atomic from our deployment command and now pass --wait together with --timeout. Helm waits for the deployment to become ready and exits with a non-zero status if that does not happen within the timeout.

helm upgrade --install -f values.yaml {{Release_name}} . --wait --timeout {{deployment_timeout_second}}

2. Tracking Helm Exit Status:

It’s crucial to capture the exit status of the Helm command immediately after it runs. We then use it to determine the success or failure of the deployment.

helm_upgrade_exit_status=$?
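Capturing `$?` is easy to get subtly wrong, so here is a small sketch of one common pitfall and a safer pattern. `fake_helm_upgrade` is a hypothetical stand-in for a failing Helm command, used so the example runs without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a failing `helm upgrade`; returns exit code 1.
fake_helm_upgrade() {
  return 1
}

# Pitfall: inside `if ! cmd; then ...`, $? holds the status of the
# negated condition (0), not the status of the command itself.
wrong_status=unset
if ! fake_helm_upgrade; then
  wrong_status=$?
fi

# Safer: store the status at the point of failure. The `|| status=$?`
# form also keeps working when the script runs under `set -e`, which
# would otherwise abort before the status could be stored.
right_status=0
fake_helm_upgrade || right_status=$?

echo "wrong_status=$wrong_status right_status=$right_status"
```

Running this prints `wrong_status=0 right_status=1`. The plain `helm_upgrade_exit_status=$?` used in this post is fine as long as it is the very next command after `helm upgrade` and the script does not run under `set -e`.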

3. Handling Rollbacks:

In the event of a deployment failure, we’ve implemented a rollback mechanism. Here’s how it works:

  • If the Helm deployment fails (indicated by a non-zero exit status), we initiate a rollback to the previous version.
  • If the rollback is successful, we ask the user to investigate the issue that caused it, which may include restarting the deployment and checking the rollout status. The pipeline step still exits with a non-zero status so the failed deployment stays visible.
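The restart-and-check follow-up mentioned above could look roughly like the sketch below. The `restart_and_verify` function and its deployment name, namespace, and timeout parameters are assumptions made for illustration; they are not part of our pipeline script.

```shell
#!/usr/bin/env bash
# Sketch only: the deployment name, namespace, and timeout are
# hypothetical parameters, not values from the original pipeline.
restart_and_verify() {
  local deployment="$1" namespace="$2" timeout="$3"

  # Trigger a fresh rollout of the (rolled-back) deployment.
  kubectl rollout restart "deployment/${deployment}" -n "${namespace}" || return 1

  # Block until the rollout finishes or the timeout expires;
  # kubectl exits non-zero if the rollout does not complete in time.
  kubectl rollout status "deployment/${deployment}" -n "${namespace}" \
    --timeout="${timeout}"
}

# Example call (requires access to a cluster):
# restart_and_verify my-app default 300s
```

The function returns whatever status `kubectl rollout status` returns, so a caller can decide whether to fail the pipeline on it.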

Here’s the script that became my guiding star:

helm upgrade --install -f values.yaml {{Release_name}} . --wait --timeout {{deployment_timeout_second}}
# Capture Helm's exit status immediately, before any other command overwrites $?
helm_upgrade_exit_status=$?
echo "Helm upgrade/install finished with EXIT_CODE: ${helm_upgrade_exit_status}"

# Verify the Helm upgrade/install status
if [ "${helm_upgrade_exit_status}" -eq 0 ]; then
  echo "Helm upgrade/install successful...."
else
  # When Helm's release fails, it's time to set the sails for a rollback
  echo "Helm upgrade/install failed. Rolling back to the previous version..."

  # Initiate the Helm rollback; revision 0 tells Helm to roll back to the previous release
  if helm rollback {{Release_name}} 0 --wait --timeout {{deployment_timeout_second}}; then
    echo "Helm rollback completed..."
    echo "Action Required: Check your deployment on the K8s cluster to understand why it rolled back."

    # Exit with a non-zero status code so the pipeline is marked as failed
    exit 1
  else
    echo "Helm rollback also failed..."
    exit 1
  fi
fi
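To make the "Action Required" message easier to act on, the pipeline could also print some release details before exiting. The sketch below is one possible way to do that; `show_release_diagnostics` and its parameters are hypothetical names, and it only uses the `helm status` and `helm history` subcommands.

```shell
#!/usr/bin/env bash
# Sketch only: `release` and `max_revisions` are hypothetical parameters.
show_release_diagnostics() {
  local release="$1" max_revisions="${2:-5}"

  echo "--- Current status of ${release} ---"
  helm status "${release}"

  # The history lists each revision with its status and description,
  # which makes a rollback revision easy to spot.
  echo "--- Last ${max_revisions} revisions of ${release} ---"
  helm history "${release}" --max "${max_revisions}"
}

# Example call, e.g. just before the `exit 1` in the rollback branch:
# show_release_diagnostics my-release 5
```

This is purely informational: it does not change the exit status of the script, so the pipeline is still marked as failed by the `exit 1` that follows.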

This script brought a new dawn to my Helm deployments. It ensured that every successful rollback was marked as “incomplete,” with a clear indication that something had gone awry.

Impact: Improved User Experience

These enhancements have had a positive impact on our CI/CD pipeline:

  • Rollbacks are now explicitly marked as failures, avoiding confusion.
  • Users receive custom notifications that keep them informed.
  • Our CI/CD pipeline is more robust and user-friendly.

Conclusion

In the world of DevOps, transparency and accuracy are paramount. This journey with Helm and a custom script helped me bring clarity and precision to my CD pipeline. Rollbacks were no longer silent heroes; they were vividly marked as a part of the story.

As I continue to explore the ever-evolving landscape of Kubernetes deployments, I share this tale in the hope that it will illuminate the path for fellow DevOps explorers. With Helm as your trusty companion and a custom script as your navigator, you can sail confidently through the seas of CD deployments, ensuring that the story of your pipeline is as complete and transparent as the journey itself.

May your Helm deployments always be clear, and your CD pipelines shine brightly.

Happy Helm deploying!
