Kubernetes Canary deployments 🐤 for mere mortals
And how to roll out software releases safely and surely
Originally posted on the Dockbit blog
Canary deployment is a pattern for reducing risk involved with releasing new software versions. The history behind the name ain't really pretty. But in software, releasing canaries can be a strategic tactic for teams adopting continuous delivery practices.
The idea is that you'll roll out a new release incrementally to a small subset of servers, side by side with its Stable version. Once you test the waters, you can then roll out the changes to the rest of the infrastructure.
This effectively exposes new versions to a small set of users, which acts as an early indicator for failures that might happen. Canaries are very handy for avoiding problematic deployments and angry users! If one canary deployment fails, the rest of your servers aren't affected and you can simply ditch it and fix the root cause.
Historically, implementing Canary deployments was a painful thing to do, especially for small companies. Teams even had to implement techniques such as Feature Toggles in the core of their apps to achieve similar behavior.
Technologies such as Docker containers 🐳 took us a step forward but we still lacked higher-level tooling for implementing advanced deployment techniques.
Enter Container Orchestrators and Schedulers
Schedulers and Orchestrators are generally used when you start having a need to distribute your containers beyond a single machine, so it's yet another management layer on top, of sorts. But that's not necessarily the only reason why you should adopt them.
Just to name a few, here are some benefits you get "for free" by simply adopting a Scheduler for your containers: self-healing, horizontal scaling, service discovery, load distribution, deployment management and a ton more.
This is a huge win already. If you're shipping software in Docker images today, one of the features that will be immensely helpful in your daily workflow is the management of container deployments. Gone are the days when you have to manually orchestrate pulling new versions of Docker images on different nodes and restarting services. An orchestrator can distill that to a one-liner for you.
While there is a plethora of options on the market (Docker Swarm, HashiCorp Nomad, Marathon), one Scheduler that truly focuses on making Docker container deployments repeatable and streamlined is Kubernetes.
Kubernetes' (k8s for short) awesomesauce lies in the fact that it provides advanced deployment techniques off the shelf. Lots of scenarios are supported with no or minimal hacking. Now yes, you do have to encapsulate your software inside Docker images and set up your Kubernetes objects (Pods, ReplicaSets, Deployments, etc.), but you get so much back for your investment.
Seeing Canary deployments in action is way more exciting than talking about them, so let's get our hands dirty and dig into an example setup on top of k8s.
Overview
So here's the master plan: First of all, we need a k8s cluster to work with. A Kubernetes cluster is composed of a dozen or so components. In order to bootstrap things quickly, we'll focus on automation tools for creating the cluster instead of creating it the hard way.
Minikube shall serve you extremely well for local testing. To get a more realistic setup, we'll be working with Google Container Engine (GKE), which is powered by k8s. It'll take care of setting up instances on GCP, installing all of the k8s components, as well as handling the initial configuration.
Deployment to k8s requires that our apps are packaged in container format. For this walkthrough, we'll be using the Docker runtime. Both versions of the sample app (Stable and Canary) will be packaged up in Docker images with distinct versions as Docker image tags.
We'll start with a simple setup for the rollout of a Canary release on k8s, then upgrade it to use more tricks.
Phase 0: Prepping the Environment
Here's what you need to get started with k8s on GCP:
- Select or create a project. This separates all of the resources used so that it's easier to clean things up later.
- Enable billing for the project.
- Enable APIs for Compute Engine and Container Engine. Those shall allow us to access the resources needed using the gcloud CLI tool.
- If you haven't already, install the latest version of gcloud. Alternatively, you can use Cloud Shell.
- Finally, install kubectl by executing gcloud components install kubectl.
Next, we can proceed by creating a Container Engine Cluster. The cluster consists of at least one master and multiple worker machines, which will be created for us automagically.
- Grab the Project ID and store it in a variable: export PROJECT_ID=<ID>.
- Set the Project to use: gcloud config set project $PROJECT_ID.
- Set the default Compute Engine zone, for example: gcloud config set compute/zone us-east1-d.
- At last, create the Container Cluster, we're calling it canary here: gcloud container clusters create canary.
If things go according to plan, you'll be able to get the k8s component endpoints by executing: kubectl cluster-info.
The code/configuration used in the upcoming material is available in a GitHub repository.
Phase 1: The App
First of all, we need an app to deploy! For the sake of this post, we created a simple Go app. Go programs compile into self-sufficient portable binaries, so they are perfect for baking inside a Docker image.
The same approach will work with any other tech stack that can be packaged in a Docker container, be it a language that needs compilation, such as Elixir, or an interpreted language such as Ruby or Python. In the latter case you just need to include the code itself rather than a compiled artifact.
The app prints out the version deployed as well as some operational requirements:
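(The original listing was embedded from the companion repo. Below is a minimal sketch reconstructed from the description that follows; the handler names and exact layout are assumptions, arranged so the line references below roughly line up.)

```go
package main

import (
	"fmt"
	"net/http"
)

const version = "1.0"

func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello! Version: %s\n", version)
}

func health(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
}

func getVersion(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "%s\n", version)
}

func main() {
	http.HandleFunc("/", handler)
	http.HandleFunc("/health", health)
	http.HandleFunc("/version", getVersion)
	http.ListenAndServe(":8080", nil)
}
```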
Lines 8–10: Stores the version of the application in a constant named version and prints it out via Go's http package. Notice that version is set to 1.0, which shall be the first "stable" version we'll bake inside the Docker image later.
Lines 14–15, 24: Acts as a health check; if a request hits /health, the app will respond with HTTP 200.
Lines 18–20, 25: Exposes an endpoint at /version which returns the version of the application currently deployed. This will make it easier later to confirm which version of the app is running on k8s.
Line 26: Finally, we listen on port 8080 for any incoming HTTP requests.
Now we can proceed to compile the app by building the binary:
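(A sketch of the build step; the netgo tag comes from the note below, while the target platform and output name are assumptions.)

```sh
# Produce a statically linked Linux binary for the 1.0 release
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -tags netgo -o app-1.0
```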
You'll notice that the netgo build tag is used. This helps to produce a static binary by substituting Go's netgo library for the system's libc.
In case you don't have Go's build toolchain set up locally, you can use the pre-built binaries located here. A downloadable version of 1.0 is available via this link.
Moving forward, we'll need to encapsulate the Go binary into a Docker image that we can later deploy to k8s.
Phase 2: The Docker Image 🐳
Wrapping up the Go artifact in a Docker image is straightforward. It's just about copying the binary into the image.
You'll notice that we are using Docker build-time arguments to specify which version of the app to package inside the image.
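(The Dockerfile itself isn't reproduced here; a minimal sketch could look like the following, where the VERSION argument name and the app-${VERSION} binary naming are assumptions.)

```dockerfile
FROM scratch

# Build-time argument selecting which binary version to bake into the image
ARG VERSION

# Copy the statically linked Go binary built earlier to /app
COPY app-${VERSION} /app

EXPOSE 8080
CMD ["/app"]
```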
Since our k8s setup is on top of GCP, we can utilize Google Container Registry (GCR for short) for storing the images. The URLs change based on the project ID we set earlier.
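(Building and pushing could then look something like this; gcloud docker was the era-appropriate way to push to GCR, newer setups use gcloud auth configure-docker followed by a plain docker push.)

```sh
# Bake version 1.0 into an image and push it to the project's registry
docker build --build-arg VERSION=1.0 -t gcr.io/${PROJECT_ID}/app:1.0 .
gcloud docker -- push gcr.io/${PROJECT_ID}/app:1.0
```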
One of the great things about Container Engine is that it'll automatically pre-configure the Docker registry credentials on the k8s worker nodes. So k8s will have access to the images in GCR out of the box.
By this point, we have an app that is properly packaged in a Docker image that k8s can consume and deploy. Time to dive headfirst into some Kubernetes.
Phase 3: Deploying Stable to k8s
There are some k8s terms that we need to be familiar with before we dig into the code. First and foremost: Pods.
A Pod is the basic building block of Kubernetes: the smallest and simplest unit in the Kubernetes object model that you create or deploy. A Pod represents a running process on your cluster.
A pod (as in a pod of whales or pea pod) is a group of one or more containers (such as Docker containers), with shared storage/network, and a specification for how to run the containers. A pod's contents are always co-located and co-scheduled, and run in a shared context.
In practice, you mostly care about defining the "specs" of your Pods. Those can include which container image to use, health checks, what command to run inside the Pod, which port to expose, etc. For most workloads, you'll have multiple instances of the same Pod where the workload will be distributed across them using some form of load balancing, which brings us to the concept of ReplicaSets.
A ReplicaSet (RS for short) ensures that a specific number of Pod replicas are running at any one time. You simply tell k8s that you need x replicas of some Pod and it'll make sure that exactly that number is always alive according to the Pod's lifecycle.
When it comes to writing k8s configuration though, you rarely have to define ReplicaSets but rather use Deployments.
A Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to pods along with a lot of other useful features.
You describe a desired state in a Deployment object, and the Deployment controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
Here's an example of defining a k8s Deployment that uses the Docker image we've built earlier:
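(A sketch of the manifest, saved as app-production.yml and arranged to match the line references below. The Deployment name kubeapp-production and container name kubeapp are assumptions; the apiVersion reflects the k8s releases of the era, newer clusters use apps/v1 and require an explicit selector.)

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubeapp-production
spec:
  replicas: 3
  template:
    metadata:
      name: kubeapp
      labels:
        app: kubeapp
        env: production
    spec:
      containers:
        - name: kubeapp
          image: gcr.io/PROJECT_ID/app:1.0
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
          command: ["/app"]
          ports:
            - name: http
              containerPort: 8080
```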
Lines 1–2: The kind of Kubernetes Object we want to create and which API version to use for creation.
Lines 3–4: Metadata to uniquely identify the Object.
Lines 5–12: Represents the beginning of a Pod Template.
Pod templates are pod specifications which are included in other objects.
Rather than specifying the current desired state of all replicas, pod templates are like cookie cutters. Once a cookie has been cut, the cookie has no relationship to the cutter.
In this example, we are configuring a template to have 3 copies/replicas of the same container, as well as attaching some Labels that can be used to reference the Deployment object. One important label in this setup is app: kubeapp. By setting the application name to be attached to the Pods, we can then point to the Pods later from a load balancer, for instance.
Labels are key/value pairs that are attached to objects, such as pods. Labels can be used to organize and to select subsets of objects.
Lines 13–25: Defines the specs of the containers that k8s will run in Pods as part of the ReplicaSet.
- The containers are based off the Docker image gcr.io/PROJECT_ID/app:1.0, which is stored in GCR, where PROJECT_ID shall be evaluated as demonstrated below.
- imagePullPolicy tells k8s to always pull the image from the Docker registry in case a newer image was pushed with the same tag.
- readinessProbe defines a health check to run against the Pod (calling /health on port 8080 in this case).
- command sets the command to run in the container once it launches. Here we point to the location of the Go binary at /app.
- ports opens port 8080 so that the container can accept and send traffic.
Before we proceed to execute the Deployment, we need to substitute PROJECT_ID; we can use good old sed for that:
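(A sketch of the substitution; the in-place backup suffix is just a convenience.)

```sh
# Replace the PROJECT_ID placeholder in the manifest with the real project ID
sed -i.bak "s#PROJECT_ID#${PROJECT_ID}#g" app-production.yml
```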
It's also a good idea to create a k8s Namespace for the Deployment to reside in. Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces.
One example where Namespaces fit very nicely is deployment environments. So you can end up with a Namespace for production and another one for staging in the exact same k8s cluster. Here's how to create a Namespace:
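(A plain kubectl one-liner; the namespace name matches the one used throughout the post.)

```sh
kubectl create namespace production
```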
Now we can finally proceed by rolling out the Deployment:
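(A sketch, assuming the manifest file name used above.)

```sh
kubectl --namespace=production apply -f app-production.yml
```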
Notice how we need to specify which Namespace to use. All k8s resources created via the Deployment will reside in the production Namespace, so in order to access them, we always need to add the --namespace argument. We can check the progress of the rollout via the rollout status command:
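(Assuming the Deployment name from the sketch above.)

```sh
kubectl --namespace=production rollout status deployment/kubeapp-production
```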
Yay!
The get command can be used to pull information about all k8s Objects. To get the list of Pods created as a result of the Deployment, we can run:
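(The -o wide flag also shows which Node each Pod landed on.)

```sh
kubectl --namespace=production get pods -o wide
```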
Looking closely, you'll find that k8s distributed the Pods across 3 worker Nodes.
Another command that can also be very useful is get events. This extracts a stream of events from k8s controllers, which can be handy for troubleshooting and seeing what happens under the hood.
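For instance:

```sh
kubectl --namespace=production get events
```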
By this point, 3 copies of the application are up and listening on port 8080 for incoming connections. But we are missing the final component to complete the setup which will allow us to access the app: a load balancer.
k8s has support for creating cloud-native load balancers (LB for short). It does so by creating a load balancer (think GCP LB or AWS ELB) as a Service. In practice, Services can mean different things. The main difference between Pods and Services is that Pods are mortal (so when they respawn, network addressing can change), while a Service provides a stable entry point and uses Labels to keep track of its Pods, with no worries about addressing.
Here's how to define a load balancer as a Service that points to the Pods created earlier:
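(A sketch, say service-production.yml, arranged to match the line references below; the Service name kubeapp-service is an assumption.)

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kubeapp-service
spec:
  type: LoadBalancer
  ports:
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
  selector:
    app: kubeapp
```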
Lines 1–4: Specifies that we are creating a k8s Object of type Service, using k8s API version 1, along with metadata.
Line 6: Sets the ServiceType to be a load balancer. This exposes the Service externally using the cloud provider's load balancer. In our case, that will create a GCP Load Balancer and automagically set up all the routing necessary to connect the workloads running inside the Pods to the external world via the LB.
Lines 7–11: Represents the LB configuration. In this case, the LB is listening on port 80, directing the requests to TCP port 8080 on the backend.
Lines 12–13: Introduces us to yet another important k8s concept: Selectors. They work in conjunction with Labels to identify and group k8s Objects. Earlier, we assigned the label app: kubeapp to our Pods. In the LB definition, we can simply use a selector on that very same label so that the LB can auto-discover which Pods it can direct traffic to.
Creating the LB uses the same apply command that we used earlier for creating the Deployment:
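(Assuming the file name used in the sketch above.)

```sh
kubectl --namespace=production apply -f service-production.yml
```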
Cloud LBs are created asynchronously; once that is done, you can grab the public IP address of the LB using the get command:
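(The EXTERNAL-IP column stays pending until the cloud LB is ready.)

```sh
kubectl --namespace=production get services
```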
We can also extract the IP address of the LB using this one-liner:
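(A sketch using kubectl's jsonpath output; the Service name follows the sketch above.)

```sh
export SERVICE_IP=$(kubectl --namespace=production get service kubeapp-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
```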
Awesome! Version 1.0 (stable) of our app has been deployed. Two k8s config files and a handful of commands took our Docker image, deployed 3 distributed containers, created an LB and connected things together. Not bad at all!
Phase 4: Deploying Canary to k8s 🐤
Now that we have the Stable version deployed to the cluster, we can dig into deploying a Canary release. Let's name it 2.0; we'll be following a similar flow. The only difference between 1.0 and 2.0 will be the value of the Go constant.
- Build the Go binary for 2.0 (or download a pre-built binary here).
- Build and push the Docker image for 2.0.
- Roll out the Canary release by applying a slightly different k8s configuration than the one we had for Stable, shown below:
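(A sketch, say app-canary.yml, mirroring the Stable manifest; names follow the same assumptions as before.)

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubeapp-canary
spec:
  replicas: 1
  template:
    metadata:
      name: kubeapp
      labels:
        app: kubeapp
        env: canary
    spec:
      containers:
        - name: kubeapp
          image: gcr.io/PROJECT_ID/app:2.0
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
          command: ["/app"]
          ports:
            - name: http
              containerPort: 8080
```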
Let's highlight the differences:
Lines 4, 12: Unique metadata to distinguish the Canary Deployment from Stable. Notice how the label named env is different, but both Stable and Canary share the same value of kubeapp for the app label.
Line 6: Sets the number of replicas we want for the Canary release. This is important since it controls the ratio/percentage of how many of our users are going to hit the Canary release. In this case, we have a 3:1 ratio for stable:canary, so 25% of requests hitting the load balancer will get the Canary release.
We don't need to make any changes to the LB since it already has a selector applied to the label app: kubeapp, which spans both Deployments (Stable and Canary). So Pods launched via the Canary Deployment will auto-join the LB.
It's time to release the Canary 🐤
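(Assuming the Canary manifest above is saved as app-canary.yml, with PROJECT_ID substituted as before.)

```sh
kubectl --namespace=production apply -f app-canary.yml
```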
We can simulate users hitting the app by executing curl in a for loop over the previously set SERVICE_IP:
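(A rough simulation; with a 3:1 replica ratio, about a quarter of the responses should come back as 2.0.)

```sh
for i in $(seq 1 20); do curl http://${SERVICE_IP}/version; done
```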
Once 2.0 is tested and approved, we can proceed by switching the Docker image used in app-production.yml to 2.0 and then simply re-applying the configuration. Another route would be to set the image directly via the k8s API:
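(A sketch; the Deployment and container names follow the earlier assumptions.)

```sh
kubectl --namespace=production set image deployment/kubeapp-production \
  kubeapp=gcr.io/${PROJECT_ID}/app:2.0
```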
This will instruct k8s to launch new Pods while terminating the old ones. By this point, you don't really need the Canary Deployment. You can go ahead and delete it by executing kubectl --namespace=production delete deployment/kubeapp-canary.
Once you start testing the Canary release, one problem you'll bump into pretty fast is that the LB might start switching your session between both versions, making it pretty hard to test 2.0 separately. Also, implementing operational requirements like SSL termination or routing rules isn't really feasible with the LBs that k8s offers.
Enter Ingress resources.
Phase 5: Using Ingress resources to access Canary Deployments
An Ingress is a collection of rules that allow inbound connections to reach the cluster services.
It can be configured to give services externally-reachable URLs, load balance traffic, terminate SSL, offer name based virtual hosting, and more.
Ingress resources are provided in k8s by an Ingress Controller. Those controllers abstract away the inner workings of the backend software used for implementing advanced load balancing (could even be nginx, GCE LB, HAProxy, etc) and let us focus on the routing rules.
An Ingress Controller is not usually created automatically as part of the initial cluster creation, though the cool thing is that controllers can be deployed as k8s Pods themselves. As for the GKE setup we have in place, one is actually already there, so we can jump right into how to use Ingress resources.
Our goal with Ingress resources is to split the traffic so that the Canary Deployment is accessible on its own subdomain. This will make it way easier to test the Canary release separately from the Stable copy.
Let's assume that the Stable version is accessible at http://foo.bar and the Canary deployment is at http://canary.foo.bar. You'll need to add records for those domains in your local resolver (a.k.a. /etc/hosts), directing traffic to the public IP of the Ingress resource.
Before you execute the commands in the next section, you might need to delete the resources created earlier. Ditching the Namespace and recreating it can be achieved with:
kubectl delete namespace production && kubectl create namespace production
Let's start by exploring what changes we need to do in the Deployment itself:
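(A sketch of the combined manifest; the Deployment section is unchanged from before, and the Service name kubeapp-production-service comes from the diagram further down. The env label in the selector is an assumption so that the production and canary Services don't overlap.)

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubeapp-production
spec:
  replicas: 3
  template:
    metadata:
      name: kubeapp
      labels:
        app: kubeapp
        env: production
    spec:
      containers:
        - name: kubeapp
          image: gcr.io/PROJECT_ID/app:1.0
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
          command: ["/app"]
          ports:
            - name: http
              containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: kubeapp-production-service
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  selector:
    app: kubeapp
    env: production
```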
The Deployment object is as is, nothing new here. The new config lies in the 2nd section (starting at line 27). Here we are basically publishing the Pods created by the Deployment via a k8s Service of type NodePort.
Kubernetes master will allocate a port from a flag-configured range (default: 30000–32767), and each Node will proxy that port (the same port number on every Node) into your Service.
Our Deployment is exposing port 8080, so the Service targets that port and exposes port 80 of its own, while a node port is auto-allocated from the range above on every Node. Ingress resources can't talk to the Deployment object directly, but rather need a Service to publish the Deployment first.
Similarly to what we did previously, here goes the flow for rolling out the Deployment + Service combo:
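(Assuming both objects live in the same app-production.yml file as sketched above.)

```sh
kubectl --namespace=production apply -f app-production.yml
```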
Querying the k8s API shall show us that Pods are collectively exposed as a Service that listens on an internal IP address. This is also done by the means of a Selector that points to the Label used in the Pods template.
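For example:

```sh
kubectl --namespace=production get pods,services
```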
Once the Service is published, we can start creating our Ingress resources. Below we are defining a minimal resource:
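(A sketch; the Ingress name kubeapp-ingress is an assumption, while the backend points at the NodePort Service created above.)

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kubeapp-ingress
spec:
  backend:
    serviceName: kubeapp-production-service
    servicePort: 80
```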
As with all other k8s Objects, the type is set to Ingress along with metadata. Ingress resources need to be aware of which backends to route requests to; we only have one backend here, pointing to the Service defined earlier.
In this situation we are not listing any routes, which means that this will represent the default backend: all requests shall go to this backend. Once the health checks on the backend start reporting HEALTHY, we can hit the Ingress address and start testing our setup.
Take note: we need to wait for the backends to report HEALTHY before we're able to access the Ingress resource's IP address.
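(One way to keep an eye on this on GKE is to describe the Ingress; resource names follow the sketch above, and the file name is an assumption.)

```sh
kubectl --namespace=production apply -f ingress.yml
# On GKE, the backends annotation flips to HEALTHY once the health checks pass,
# and the Address field holds the public IP of the Ingress
kubectl --namespace=production describe ingress kubeapp-ingress
```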
The Canary release is very similar to Stable. The only differences would be the number of replicas, the Docker image to use (2.0 in our case) and the related metadata. Rolling out the Deployment can be done using the usual flow:
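(A sketch; app-canary.yml here would mirror the combined manifest above with replicas: 1, the 2.0 image, env: canary labels, and a NodePort Service named kubeapp-canary-service.)

```sh
kubectl --namespace=production apply -f app-canary.yml
```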
For us to be able to access the Canary release on a separate subdomain, we need to add some Ingress routes that point to the Canary Service. The approach we're using here is name-based virtual hosting, where the Ingress resource will check the Host header and judge which Service it should send the traffic to.
foo.bar --| |-> kubeapp-production-service 80
| Ingress IP |
canary.foo.bar --| |-> kubeapp-canary-service 80
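(The rules manifest isn't reproduced here; below is a sketch arranged to match the line references that follow. The Ingress name is an assumption, while the Service names come from the diagram above.)

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kubeapp-ingress
spec:
  backend:
    serviceName: kubeapp-production-service
    servicePort: 80
  rules:
    - host: canary.foo.bar
      http:
        paths:
          - backend:
              serviceName: kubeapp-canary-service
              servicePort: 80
    - host: foo.bar
      http:
        paths:
          - backend:
              serviceName: kubeapp-production-service
              servicePort: 80
```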
Lines 6–8: Sets a default backend. This will match any request that lands on the IP address of the Ingress with no Host header set, directing it to the production Service that contains the production Pods.
Line 9: Starts a section for the Ingress resource rules.
Lines 10–15: Matches the domain canary.foo.bar and directs the traffic to the k8s Service named kubeapp-canary-service on port 80.
Lines 16–21: Does the same for the main domain we're using for our Service, foo.bar.
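(A quick way to try the routing without touching DNS is to spoof the Host header against the Ingress IP; INGRESS_IP is a placeholder for the address reported by kubectl describe ingress.)

```sh
kubectl --namespace=production apply -f ingress.yml
curl -H "Host: canary.foo.bar" http://${INGRESS_IP}/version
curl -H "Host: foo.bar" http://${INGRESS_IP}/version
```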
This technique is very useful for testing a Canary release in production environments without affecting the Stable version being consumed by your users. The same Ingress routes can be modified to work on different URL paths on the same domain instead of using vhosts (a.k.a fanout).
Promoting the Canary version to production can be executed using the same flow discussed earlier: either use set image or modify the Docker image to use in the production config.
So here you go, putting a Canary deployment setup on top of k8s isn't that hard after all. Things can even be further developed with Multi-stage Canary deployments and more refined control of traffic weight.
Conclusion
Kubernetes is a gigantic project that allows companies of all shapes and sizes to adopt the infrastructure fabric that used to be only available to large corps (k8s is inspired by Borg, after all). Best of all, the effort needed to implement similar setups is relatively small compared to other systems.
This post just scratched the surface of one use case. k8s has something for everyone, dig through the concepts and see where it can fit for you!
Keep 'em flying! 🐤