How Groupon deployed hundreds of services using Kubernetes, Helm Chart, and Krane
At Groupon, our Cloud Migration strategy required us to move our services to a Kubernetes-based cloud platform. This article describes how we moved to Kubernetes, the problems we faced, and how we solved them using different tools and technologies.
We wanted a centralized solution that is modular, reusable, consistent, manageable, and versioned, so that service teams can use it without worrying about the underlying implementation.
We have 500+ microservices of varying tech flavors. If every service maintained its own Kubernetes deployment templates, each team would end up implementing its own solution. That overhead does not scale at the Groupon level.
We did not want to reinvent the wheel. Instead, we wanted solutions implemented and well tested by one team to be reused by other teams with similar use cases.
We chose Krane for our implementation because it was written in Ruby and easy to integrate with our existing internal automation tooling.
What is Krane?
Krane is a command-line tool written in Ruby that helps you ship application changes to a Kubernetes namespace. It also shows exactly what happened during the deployment and reports the status of each resource involved.
Krane provides a “krane render” command, which takes ERB templates (YAML with embedded Ruby) and binding files as input and generates Kubernetes manifest files.
The binding file contains the values you want to pass dynamically to the templates, which may differ for each service.
Krane also provides a “krane deploy” command, which deploys the generated Kubernetes manifest files to the Kubernetes cluster.
How does it work?
There is a centralized team that develops and maintains templates for Kubernetes deployment here at Groupon. This team has created templates for different components like API (Kubernetes Service), Worker (Background Task Processors), Cron (Kubernetes CronJob) for different frameworks like Java, Ruby, etc. based on major Groupon tech stacks.
The team provides and maintains an automation tool that copies templates and default binding files for the specified components from the centralized repository to the service's repository. Service teams run this tool and override service-specific values in the binding files.
When the service gets deployed, Krane takes care of generating the Kubernetes manifests and deploying them to the clusters.
Take a look at the snippet below to understand it better.
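As a rough illustration of that render-then-deploy flow (a sketch, not Groupon's actual tooling), the Python below only assembles the two Krane commands as a shell pipeline string. The template directory, binding file, namespace, and context names are placeholders. The flags follow Krane's documented CLI as we understand it, so check them against your installed version.

```python
import shlex


def krane_render_cmd(template_dir, binding_file):
    # `krane render` expands the ERB templates found in template_dir.
    # The `@` prefix asks Krane to load bindings from a file.
    return ["krane", "render", "-f", template_dir, f"--bindings=@{binding_file}"]


def krane_deploy_cmd(namespace, context):
    # `-f -` tells `krane deploy` to read the rendered manifests from stdin.
    return ["krane", "deploy", namespace, context, "-f", "-"]


def as_pipeline(*cmds):
    # Join argv lists into one shell pipeline, quoting each argument safely.
    return " | ".join(shlex.join(cmd) for cmd in cmds)


# All paths and names below are hypothetical.
pipeline = as_pipeline(
    krane_render_cmd("deploy/templates", "deploy/bindings/production.yml"),
    krane_deploy_cmd("my-service-production", "my-cluster-context"),
)
print(pipeline)
```

Building the command as data first makes it easy to log or review before anything touches a cluster.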
Challenges with Krane
With 500+ services, we started facing the following issues while working with Krane:
- Service teams have to copy templates into their repository rather than consume them as modules.
- Unsupported alterations to the copied template files can cause deployment issues.
- Service teams need to update their copy of the templates every time a new version of the templates is released.
- A template version upgrade overwrites any custom changes that service teams made to their local templates, and putting those changes back is tedious.
Development using Helm Chart
Helm is a tool to develop, manage, and package Kubernetes deployment templates. A Helm chart bundles those templates, together with all their dependencies, into a single package.
Why Helm Chart?
We did not choose Helm in the first place because it was new at that time, with a steep learning curve, a low adoption rate, and a complex client-server architecture.
Helm 3.0 then removed the server-side component (Tiller), which simplified the architecture, and brought many other improvements. So we decided to go with Helm, which also helped solve the issues we were facing with Krane:
- It abstracts away template implementation details from users in the form of a chart.
- It supports inheritance so common charts can be extended to be used for other charts.
- Service teams don’t need to download or save templates in their repository, which gives a consistent developer experience.
- It’s configuration driven so teams just need to configure values specific to their services.
- Service teams just need to specify chart name, version, and service-specific binding files instead of maintaining templates in their own repo.
- Services can upgrade or downgrade to different chart versions as per their requirements without any issue as templates are generated on the fly during deployment.
- Services cannot change the templates, which reduces the failure surface area for deployments. If every service made its own template changes, template issues would be difficult to debug.
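To make the "configuration driven" point concrete: Helm layers the values a service supplies on top of the chart's defaults, merging nested maps key by key and replacing scalars and lists outright. The sketch below mimics that layering in simplified form in plain Python. The keys (`replicas`, `resources`) are made-up examples, not fields from Groupon's charts.

```python
import copy


def merge_values(base, override):
    """Layer override on top of base: dicts merge recursively, anything else is replaced."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical chart defaults and service-specific overrides.
chart_defaults = {"replicas": 2, "resources": {"cpu": "500m", "memory": "512Mi"}}
service_values = {"replicas": 6, "resources": {"memory": "2Gi"}}

effective = merge_values(chart_defaults, service_values)
print(effective)
```

The service only states what differs (`replicas` and `memory` here), and everything else falls through from the chart.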
Helm provides a similar command, “helm template”, which generates Kubernetes manifests from a chart and binding (values) files. It does not need local template files; instead, it fetches the templates on the fly from the chart repository based on the specified chart and version, as shown in the snippet below.
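A minimal sketch of such a “helm template” invocation, built as an argument list in Python. The chart name and version are the ones used later in this article. The release name, repository URL, and values file names are assumptions for illustration.

```python
def helm_template_cmd(release, chart, version, repo_url, values_files):
    """Build argv for `helm template`, which renders a chart to manifests on stdout."""
    cmd = ["helm", "template", release, chart, "--repo", repo_url, "--version", version]
    for path in values_files:
        # Each -f adds a values file; later files take precedence over earlier ones.
        cmd += ["-f", path]
    return cmd


cmd = helm_template_cmd(
    release="my-service",                    # hypothetical release name
    chart="cmf-rails-api",
    version="3.28.1",
    repo_url="https://charts.example.com",   # hypothetical chart repository
    values_files=["common.yml", "production.yml"],
)
print(" ".join(cmd))
```

Because the chart is resolved by name and version at render time, switching chart versions is a one-argument change.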
Deployment using Krane
You think I must be crazy… Again Krane?
You just said that the issues with Krane made you move to Helm charts. Why are you using Krane again?
We use Helm Charts for Development but Krane for Deployment
As mentioned earlier, we can generate Kubernetes manifests using the “helm template” command. There are three ways to deploy the generated manifests to the cluster.
Deploy using “Kubectl apply”
We can deploy resources from Kubernetes manifests using kubectl apply, but the command returns as soon as the API server accepts the objects, without waiting for them to roll out. We would therefore need to poll constantly to verify that all resources were deployed successfully.
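To show what that polling burden looks like, here is a minimal sketch of a wait loop. The status values (`pending`, `ready`, `failed`) and the `get_status` callback are hypothetical. A real check would have to query the cluster for every resource kind being deployed.

```python
import time


def wait_until_rolled_out(get_status, timeout_s=300.0, interval_s=5.0):
    """Poll get_status() until it returns 'ready' or 'failed', or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in ("ready", "failed"):
            return status
        if time.monotonic() >= deadline:
            return "timed_out"
        time.sleep(interval_s)


# Stand-in for a real cluster query; yields two pending polls, then success.
fake_statuses = iter(["pending", "pending", "ready"])
result = wait_until_rolled_out(lambda: next(fake_statuses), interval_s=0)
print(result)
```

Every team would need to write, tune, and maintain a loop like this per resource type, which is the work we wanted the deployment tool to do for us.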
Deploy using “Helm Install”
Helm is used for chart development and can also deploy charts to clusters using helm install. However, it does not synchronously report which resources were deployed and the status of each, which is critical to know for production systems.
Deploy using “Krane Deploy”
When we deploy resources using Krane, it understands the deployment and gives us insight by:
- Discovering the resources in the Kubernetes manifests.
- Checking the status of resources in the cluster against the Kubernetes manifests.
- Pre-deploying resources such as ConfigMaps and Secrets that the deployment resources depend on.
- Reporting success or failure along with the status of each resource being deployed.
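The pre-deploy idea can be illustrated with a small sketch that splits manifests into two phases. This is not Krane's implementation, and the set of kinds is limited to the two named above. Krane's real ordering covers more resource types.

```python
# Kinds deployed first in this sketch; the list above names ConfigMaps and Secrets.
PREDEPLOY_KINDS = ("ConfigMap", "Secret")


def deployment_phases(manifests):
    """Split manifests into a pre-deploy phase and a main phase, keeping input order."""
    predeploy = [m for m in manifests if m["kind"] in PREDEPLOY_KINDS]
    main = [m for m in manifests if m["kind"] not in PREDEPLOY_KINDS]
    return predeploy, main


# Hypothetical resources for one service.
manifests = [
    {"kind": "Deployment", "metadata": {"name": "api"}},
    {"kind": "ConfigMap", "metadata": {"name": "api-config"}},
    {"kind": "Service", "metadata": {"name": "api"}},
    {"kind": "Secret", "metadata": {"name": "api-secrets"}},
]

first, rest = deployment_phases(manifests)
print([m["metadata"]["name"] for m in first])
print([m["metadata"]["name"] for m in rest])
```

Applying the first phase and confirming it succeeded before the second means pods never start against configuration that does not exist yet.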
These are the main reasons we use Krane for deployment. Krane provides many other features, such as:
- Run arbitrary tasks at the beginning of deployment.
- Global timeout for Krane deployment.
- Custom timeout for a specific resource using annotation.
- Deployment of global non-namespaced resources.
- Restart all pods using krane restart.
- Run a job outside of a deployment using krane run.
There is fantastic documentation available for Krane so please go through it to know more about Krane's features.
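As one example from that list, a per-resource timeout is set with an annotation on the resource itself. The sketch below adds Krane's `krane.shopify.io/timeout-override` annotation to a manifest represented as a Python dict. The Job name and the `15m` value are hypothetical, and the helper function is ours, not part of Krane.

```python
import copy

# Annotation key as documented by Krane; verify it against your Krane version.
TIMEOUT_ANNOTATION = "krane.shopify.io/timeout-override"


def with_timeout_override(manifest, duration):
    """Return a copy of manifest annotated with a per-resource Krane timeout."""
    annotated = copy.deepcopy(manifest)
    annotations = annotated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[TIMEOUT_ANNOTATION] = duration
    return annotated


# Hypothetical slow-running resource that needs more than the global timeout.
migration_job = {"apiVersion": "batch/v1", "kind": "Job", "metadata": {"name": "db-migrate"}}
slow_job = with_timeout_override(migration_job, "15m")
print(slow_job["metadata"]["annotations"])
```

In a chart, the same annotation would normally be written directly in the template's metadata. The function form here just makes the effect easy to see.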
Example of deployment using Krane (refer to the snippet below). The snippet:
- Fetches a specific chart component called cmf-rails-api at version 3.28.1.
- Injects common and environment-specific binding files into the chart.
- Renders Kubernetes manifests using Helm.
- Deploys the rendered Kubernetes manifests using Krane.
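The four steps above can be sketched as a two-process pipeline. This is an illustration under assumptions, not the original snippet: only the chart name and version come from this article, while the repository URL, release name, namespace, context, file names, and timeout value are placeholders.

```python
import subprocess

# Steps 1-3: fetch the chart, inject binding files, render manifests to stdout.
HELM_CMD = [
    "helm", "template", "my-service", "cmf-rails-api",
    "--repo", "https://charts.example.com",   # hypothetical chart repository
    "--version", "3.28.1",
    "-f", "common.yml", "-f", "production.yml",
]

# Step 4: deploy the manifests read from stdin (`-f -`) and watch the rollout.
KRANE_CMD = [
    "krane", "deploy", "my-service-production", "my-cluster-context",
    "-f", "-", "--global-timeout=300s",
]


def render_and_deploy(render_cmd, deploy_cmd):
    """Pipe render_cmd's stdout into deploy_cmd; return the first non-zero exit code, else 0."""
    render = subprocess.Popen(render_cmd, stdout=subprocess.PIPE)
    deploy = subprocess.Popen(deploy_cmd, stdin=render.stdout)
    render.stdout.close()  # so the renderer sees a closed pipe if the deployer exits early
    deploy_rc = deploy.wait()
    render_rc = render.wait()
    return render_rc or deploy_rc


print(" ".join(HELM_CMD), "|", " ".join(KRANE_CMD))
# With helm and krane installed: exit_code = render_and_deploy(HELM_CMD, KRANE_CMD)
```

Checking both exit codes matters: a failed render that produces no output should fail the deployment rather than pass silently.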
How did we automate it?
As mentioned earlier, we implemented an automation tool that immensely helped service teams to automate the following tasks.
- Set up environment-specific cloud configuration with default values.
- Set up deployment configuration for their service.
- Add/Remove different chart components based on service requirements.
- Upgrade/Downgrade chart version for their services.
- Render and validate Kubernetes templates based on their cloud configuration.
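As an idea of what one of these tasks involves, here is a minimal sketch of a chart version upgrade or downgrade. The configuration layout (`components`, `chart`, `version`) is a hypothetical model invented for this example, not the format our tool actually uses.

```python
import copy
import re

VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$")


def set_chart_version(config, component, version):
    """Return a copy of config with one component pinned to a new chart version."""
    if not VERSION_RE.match(version):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    if component not in config["components"]:
        raise KeyError(f"unknown component: {component}")
    updated = copy.deepcopy(config)
    updated["components"][component]["version"] = version
    return updated


# Hypothetical deployment configuration for one service.
config = {"components": {"api": {"chart": "cmf-rails-api", "version": "3.28.1"}}}
upgraded = set_chart_version(config, "api", "3.29.0")
print(upgraded["components"]["api"])
```

Because templates are rendered from the chart at deploy time, changing this one field is all a service team has to do to move between chart versions.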
At Groupon, since we operate at significant scale with hundreds of services, we used the best of Helm and Krane to create a modularized and consistent templating engine for Kubernetes deployment.
This solution worked at Groupon because we simplified template development into components based on the different tech stacks we use, which serve as building blocks for creating any new service. Otherwise, it would be difficult to maintain deployment templates for 10 different tech stacks.
Last but not least is automation. We implemented an automation tool that did all the heavy lifting for service teams, reducing their burden and providing a stable, predictable production deployment experience as we migrated all our services to our Kubernetes platforms.
Thanks for reading!