Canary deployment is a pattern for reducing the risk involved in releasing new software versions. The history behind the name isn't really pretty. But in software, releasing canaries can be a strategic tactic for teams adopting continuous delivery practices.
The idea is that you roll out a new release incrementally to a small subset of servers, side by side with its Stable version. Once you've tested the waters, you can then roll out the changes to the rest of the infrastructure.
This effectively exposes new versions to a small set of users, which acts as an early indicator for failures that might happen. Canaries are very handy for avoiding problematic deployments and angry users! If one canary deployment fails, the rest of your servers aren’t affected and you can simply ditch it and fix the root cause.
Historically, implementing Canary deployments was a painful thing to do, especially for small companies. Teams even had to implement techniques such as Feature Toggles in the core of their apps to achieve similar behavior.
Technologies such as Docker containers 🐳 took us a step forward but we still lacked higher-level tooling for implementing advanced deployment techniques.
Enter Container Orchestrators and Schedulers 🚀
Schedulers and Orchestrators are generally used when you start having a need to distribute your containers beyond a single machine, so they're yet another management layer of sorts. But that's not necessarily the only reason why you should adopt them.
Just to name a few, here are some benefits you get “for free” by simply adopting a Scheduler for your containers: Self-healing, horizontal scaling, service discovery, load distribution, deployment management and a ton more.
This is a huge win already. If you’re shipping software in Docker images today, one of the features that will be immensely helpful in your daily workflow is the management of container deployments. Gone are the days where you have to manually orchestrate pulling new versions of Docker images on different nodes and restarting services. An orchestrator can distill that to a one-liner for you.
While there is a plethora of options on the market (Docker Swarm, HashiCorp Nomad, Marathon), one Scheduler that truly focuses on making Docker container deployments repeatable and streamlined is Kubernetes.
Kubernetes (k8s for short) awesomesauce lies in the fact that it provides advanced deployment techniques off the shelf. Lots of scenarios are supported with no or minimal hacking. Now yes, you do have to encapsulate your software inside Docker images and set up your Kubernetes objects (Pods, ReplicaSets, Deployments, etc.), but you get so much back for your investment.
Seeing Canary deployments in action is way more exciting than talking about them, so let’s get our hands dirty and dig into an example setup on top of k8s.
So here’s the master plan: First of all, we need a k8s cluster to work with. A Kubernetes cluster is composed of a dozen or so components. In order to bootstrap things quickly, we’ll focus on automation tools for creating the cluster instead of creating it the hard way.
Minikube shall serve you extremely well for local testing. To get a more realistic setup, we'll be working with Google Container Engine (GKE), which is powered by k8s. It'll take care of setting up instances on GCP, installing all of the k8s components, as well as handling the initial configuration.
Deployment to k8s requires that our apps are packaged in container format. For this walkthrough, we’ll be using the Docker runtime. Both versions of the sample app (Stable and Canary) will be packaged up in Docker images with distinct versions as Docker image tags.
We’ll start first with a simple setup for the rollout of a Canary release on k8s then upgrade it to use more tricks.
Phase 0: Prepping the Environment
Here’s what you need to get started with k8s on GCP:
- Select or create a project. This separates all of the resources used so that it’s easier to clean things up later.
- Enable billing for the project.
- Enable APIs for Compute Engine and Container Engine. Those shall allow us to access the resources needed using the gcloud CLI tool.
- If you haven’t already, install the latest version of gcloud. Alternatively, you can use Cloud Shell.
- Finally, install kubectl: `gcloud components install kubectl`.
Next, we can proceed by creating a Container Engine Cluster. The cluster consists of at least one master and multiple worker machines, which will be created for us automagically.
- Grab the Project ID and store it in a variable:
- Set the Project to use: `gcloud config set project $PROJECT_ID`.
- Set the default Compute Engine zone, for example: `gcloud config set compute/zone us-east1-d`.
- At last, create the Container Cluster; we're calling it `canary`: `gcloud container clusters create canary`.
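Put together, the bootstrap steps above look roughly like this (the project ID value is a placeholder):

```shell
# Store the Project ID in a variable (placeholder value).
export PROJECT_ID="my-gcp-project"

# Point gcloud at the project and pick a default zone.
gcloud config set project $PROJECT_ID
gcloud config set compute/zone us-east1-d

# Create the Container Engine cluster, named "canary".
gcloud container clusters create canary
```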
If things go according to the plan, you’ll be able to get the k8s components endpoints by executing:
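For instance:

```shell
# Prints the addresses of the k8s master and cluster services.
kubectl cluster-info
```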
The code/configuration used in the upcoming material is available in a GitHub repository:
Phase 1: The App
First of all, we need an app to deploy! For the sake of this post, we created a simple Go app. Go programs compile into self-sufficient portable binaries, so they are perfect for baking inside a Docker image.
The same approach will work with any other tech stack that can be packaged in a Docker container, be it a language that needs compilation such as Elixir, or an interpreted language such as Ruby or Python. In the latter case you'd just include the code itself rather than a compiled artifact.
The app prints out the version deployed as well as some operational requirements:
Lines 8–10: Stores the version of the application in a constant named `version` and prints it out via Go's http package. Notice that `version` is set to `1.0`, which shall be the first "stable" version we'll bake inside the Docker image later.
Lines 14–15, 24: Acts as a health check; if a request hits `/health`, the app will respond with HTTP 200.
Lines 18–20, 25: Exposes an endpoint at `/version` which returns the version of the application currently deployed. This will make it easier later to confirm which version of the app is running on k8s.
Line 26: Finally, we listen on port `8080` for any incoming HTTP requests.
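Based on the description above, a minimal sketch of the app could look like this (the exact code lives in the repo; names here are illustrative):

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// version is baked into the binary; 1.0 is our first "stable" release.
const version = "1.0"

func main() {
	// Root handler prints the deployed version.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Hello from version %s\n", version)
	})
	// Health check endpoint: responds with HTTP 200.
	http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Version endpoint: makes it easy to confirm what's running on k8s.
	http.HandleFunc("/version", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, version)
	})
	// Listen on port 8080 for incoming HTTP requests.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```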
Now we can proceed to compile the app by building the binary:
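A static build along these lines would do the trick (the exact flags used are an assumption):

```shell
# Build a static Linux binary; the netgo tag avoids linking the system libc.
GOOS=linux go build -tags netgo -o app .
```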
You'll notice that the `netgo` build tag is used. This helps to produce a static binary by substituting the system's libc with Go's native networking implementation.
Moving forward, we’ll need to encapsulate the Go binary into a Docker image that we can later deploy to k8s.
Phase 2: The Docker Image 🐳
Wrapping up the Go artifact in a Docker image is straightforward. It’s just about copying the binary into the image.
You’ll notice that we are using Docker build-time arguments to specify which version of the app to package inside the image.
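A sketch of such a Dockerfile, assuming binaries are named after their version:

```dockerfile
FROM alpine:3.6
# Build-time argument selecting which app version to package.
ARG VERSION=1.0
# Copy the binary built for the requested version into the image.
COPY app-${VERSION} /app
EXPOSE 8080
CMD ["/app"]
```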
One of the great things about Container Engine is that it’ll automatically pre-configure the Docker registry credentials on the k8s worker nodes. So k8s will have access to the images in GCR out of the box.
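Building and pushing the image to GCR might then look like this (the build-arg wiring follows the Dockerfile assumption above):

```shell
# Bake version 1.0 into the image and tag it accordingly.
docker build --build-arg VERSION=1.0 -t gcr.io/$PROJECT_ID/app:1.0 .

# Push to Google Container Registry.
gcloud docker -- push gcr.io/$PROJECT_ID/app:1.0
```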
By this point, we have an app that is properly packaged in a Docker image that k8s can consume and deploy. Time to dive headfirst into some Kubernetes.
Phase 3: Deploying Stable to k8s
There are some k8s terms that we need to be familiar with before we dig into the code. First and foremost: Pods.
A Pod is the basic building block of Kubernetes–the smallest and simplest unit in the Kubernetes object model that you create or deploy. A Pod represents a running process on your cluster.
A pod (as in a pod of whales or pea pod) is a group of one or more containers (such as Docker containers), with shared storage/network, and a specification for how to run the containers. A pod’s contents are always co-located and co-scheduled, and run in a shared context
In practice, you mostly care about defining the ‘specs’ of your Pods. Those can include which container image to use, health checks, what command to run inside the Pod, which port to expose, etc. For most workloads, you’ll have multiple instances of the same Pod where the workload will be distributed across them using some form of load balancing, which brings us to the concept of ReplicaSets.
ReplicaSets (RS for short) ensures that a specific number of pod replicas are running at any one time. You simply tell k8s that you need an x number of replicas of some Pod and it’ll make sure that the exact same number is always alive according to the Pod’s lifecycle.
When it comes to writing k8s configuration though, you rarely have to define ReplicaSets but rather use Deployments.
A Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to pods along with a lot of other useful features
You describe a desired state in a Deployment object, and the Deployment controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
Here’s an example of defining a k8s Deployment that uses the Docker image we’ve built earlier:
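A sketch of such a config (the full file lives in the repo, and the line-by-line notes below refer to that file; the Deployment name is an assumption):

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubeapp-production
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: kubeapp
        env: production
    spec:
      containers:
      - name: kubeapp
        image: gcr.io/PROJECT_ID/app:1.0
        imagePullPolicy: Always
        command: ["/app"]
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
        ports:
        - containerPort: 8080
```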
Lines 1–2: The kind of Kubernetes Object we want to create and which API version to use for creation.
Lines 3–4: Metadata to uniquely identify the Object.
Lines 5–12: Represents the beginning of a Pod Template.
Pod templates are pod specifications which are included in other objects.
Rather than specifying the current desired state of all replicas, pod templates are like cookie cutters. Once a cookie has been cut, the cookie has no relationship to the cutter.
In this example, we are configuring a template to have 3 copies/replicas of the same container, as well as attaching some Labels that can be used to reference the Deployment object. One important label in this setup is `app: kubeapp`. By setting the application name as a label attached to the Pods, we can then point to the Pods later from a load balancer, for instance.
Labels are key/value pairs that are attached to objects, such as pods. Labels can be used to organize and to select subsets of objects.
Lines 13–25: Defines the specs of the containers that k8s will run in Pods as part of the ReplicaSet.
- The containers are based off the Docker image `gcr.io/PROJECT_ID/app:1.0`, which is stored in GCR, where `PROJECT_ID` shall be evaluated as demonstrated below.
- `imagePullPolicy` tells k8s to always pull the image from the Docker registry in case a newer image was pushed with the same tag.
- `readinessProbe` defines a health check to run against the Pod (calling `/health` on port 8080 in this case).
- `command` sets the command to run in the container once it launches. Here we point to the location of the Go binary inside the image.
- `ports` opens port 8080 so that the container can accept and send traffic.
Before we proceed to execute the Deployment, we need to substitute `PROJECT_ID` in the config; we can use good old `sed` for that:
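Something like this (the config file name follows the one referenced later in this post):

```shell
# Replace the PROJECT_ID placeholder with the real project ID in place.
sed -i.bak "s/PROJECT_ID/${PROJECT_ID}/g" app-production.yml
```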
It’s also a good idea to create a k8s Namespace for the Deployment to reside in. Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces.
One example where Namespaces fits very nicely is deployment environments. So you can end up with a Namespace for production and another one for staging in the exact same k8s cluster. Here’s how to create a Namespace:
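Creating the `production` Namespace boils down to:

```shell
kubectl create namespace production
```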
Now we can finally proceed by rolling out the Deployment 🚀
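Applying the config into the Namespace does the job (file name as above):

```shell
kubectl --namespace=production apply -f app-production.yml
```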
Notice how we need to specify which Namespace to use. All k8s resources created via the Deployment will reside in the `production` Namespace, so in order to access them, we always need to add the `--namespace` argument. We can check the progress of the rollout via the `rollout status` command:
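Assuming the Deployment is named `kubeapp-production`:

```shell
kubectl --namespace=production rollout status deployment/kubeapp-production
```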
The `get` command can be used to pull information about all k8s Objects. To get the list of Pods created as a result of the Deployment, we can run:
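The `-o wide` flag also shows which Node each Pod landed on:

```shell
kubectl --namespace=production get pods -o wide
```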
Looking closely, you’ll find that k8s distributed the Pods across 3 worker Nodes.
Another command that can also be very useful is `get events`. This extracts a stream of events from k8s controllers, which can be handy for troubleshooting and seeing what happens under the hood.
By this point, 3 copies of the application are up and listening on port `8080` for incoming connections. But we are missing the final component that will allow us to access the app: a load balancer.
k8s has support for creating cloud-native load balancers (LB for short). It does so by creating the load balancer (think GCP LB or AWS ELB) as a Service. In practice, Services can mean different things. The main difference between Pods and Services is that Pods are mortal (when they respawn, their network addresses can change), while Services, on the other hand, can be wired to Pods via Labels for later reference, with no worries about addressing.
Here’s how to define a load balancer as a Service that points to the Pods created earlier:
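A sketch of that config (the Service name is an assumption; the line-by-line notes below refer to the original file):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kubeapp-lb
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
  selector:
    app: kubeapp
```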
Lines 1–4: Specifies that we are creating a k8s Object of type Service, using k8s API version 1 along with metadata.
Line 6: Sets the ServiceType to be a load balancer. This exposes the Service externally using the cloud provider's load balancer. In our case, that will create a GCP Load Balancer and automagically set up all the routing necessary to connect the workloads running inside the Pods to the external world via the LB.
Lines 7–11: Represents the LB configuration. In this case, the LB is listening on port `80`, directing requests to TCP port `8080` on the backend.
Lines 12–13: Introduces us to yet another important k8s concept: Selectors. They work in conjunction with Labels to identify and group k8s Objects. Earlier, we assigned the label `app: kubeapp` to our Pods. In the LB definition, we can simply use a selector on that very same label so that the LB can auto-discover which Pods it can direct traffic to.
Creating the LB uses the same `apply` command that we used earlier for creating the Deployment:
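For example (the config file name is an assumption):

```shell
kubectl --namespace=production apply -f lb.yml
```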
Cloud LBs are created asynchronously. Once that is done, you can grab the public IP address of the LB via `kubectl get services`.
We can also extract the IP address of the LB using this one-liner 👉
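A sketch of such a one-liner, storing the IP in a variable (the Service name matches the assumption above):

```shell
export SERVICE_IP=$(kubectl --namespace=production get service/kubeapp-lb \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $SERVICE_IP
```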
Version `1.0` (Stable) of our app has been deployed 🙌. Two k8s config files and a handful of commands took our Docker image, deployed 3 distributed containers, created an LB and connected things together. Not bad at all!
Phase 4: Deploying Canary to k8s 🐤
Now that we have the Stable version deployed to the cluster, we can dig into deploying a Canary release. Let's name it `2.0`; we'll be following a similar flow. The only difference between `1.0` and `2.0` will be the value of the Go `version` constant.
- Build the Go binary for `2.0` (or download a pre-built binary here).
- Build and push the Docker image for `2.0`.
- Roll out the Canary release by applying a slightly different k8s configuration than the one we had for Stable:
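A sketch of the Canary config (naming and replica count follow the discussion below):

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubeapp-canary
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kubeapp
        env: canary
    spec:
      containers:
      - name: kubeapp
        image: gcr.io/PROJECT_ID/app:2.0
        imagePullPolicy: Always
        command: ["/app"]
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
        ports:
        - containerPort: 8080
```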
Let’s highlight the differences:
Lines 4, 12: Unique metadata to distinguish the Canary Deployment from Stable. Notice how the label named `env` differs, but both Stable and Canary share the same value of `kubeapp` for the `app` label.
Line 6: Sets the number of replicas we want for the Canary release. This is important since it controls the ratio/percentage of users that are going to hit the Canary release. In this case, we have a 3:1 ratio for stable:canary, so 25% of the requests hitting the load balancer will get the Canary release.
We don't need to make any changes to the LB since it already has a selector applied to the label `app: kubeapp`, which spans both Deployments (Stable and Canary). So Pods launched via the Canary Deployment will auto-join the LB.
It’s time to release the Canary 🐤
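The same `apply` flow works here (file name is an assumption):

```shell
kubectl --namespace=production apply -f app-canary.yml
```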
We can simulate users hitting the app by executing curl in a for loop against the previously captured LB IP address:
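A sketch, assuming the LB IP was stored in `SERVICE_IP` earlier:

```shell
# Roughly 1 in 4 responses should come back from the Canary release.
for i in $(seq 1 20); do
  curl "http://${SERVICE_IP}/version"
done
```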
Once `2.0` is tested and approved, we can proceed by switching the Docker image used in app-production.yml to `2.0` and then simply re-applying the configuration. Another route would be to set the image directly via the k8s API:
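Using `set image` (the Deployment and container names are assumptions):

```shell
kubectl --namespace=production set image deployment/kubeapp-production \
  kubeapp=gcr.io/$PROJECT_ID/app:2.0
```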
This will instruct k8s to launch new Pods while terminating the old ones. By this point, you don't really need the Canary Deployment anymore. You can go ahead and delete it by executing `kubectl --namespace=production delete deployment/kubeapp-canary`.
Once you start testing the Canary release, one problem you'll bump into pretty fast is that the LB might start switching your session between both versions, making it pretty hard to test `2.0` separately. Also, implementing operational requirements like SSL termination or routing rules isn't really feasible with the LBs that k8s offers.
Enter Ingress resources.
Phase 5: Using Ingress resources to access Canary Deployments
An Ingress is a collection of rules that allow inbound connections to reach the cluster services.
It can be configured to give services externally-reachable URLs, load balance traffic, terminate SSL, offer name based virtual hosting, and more.
Ingress resources are provided in k8s by an Ingress Controller. Those controllers abstract away the inner workings of the backend software used for implementing advanced load balancing (could even be nginx, GCE LB, HAProxy, etc) and let us focus on the routing rules.
The Ingress Controller is not usually created automatically in the initial cluster creation, though the cool thing is that they can be deployed as k8s Pods. As for the GKE setup we have in place, it is actually already there, so we can jump right into how to use Ingress resources.
Our goal with Ingress resources would be to split the traffic of the Canary deployment to be accessible on a subdomain. This will make it way easier to test the Canary release separately from the Stable copy.
Let's assume that the Stable version is accessible at `http://foo.bar` and the Canary deployment at `http://canary.foo.bar`. You'll need to add records for those domains in your local resolver (a.k.a. `/etc/hosts`), directing traffic to the public IP of the Ingress resource.
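The `/etc/hosts` entry would look something like this (the IP is a placeholder for your actual Ingress IP):

```
203.0.113.10    foo.bar canary.foo.bar
```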
Before you execute the commands in the next section, you might need to delete the resources created earlier. Ditching the Namespace and recreating it can be achieved with: `kubectl delete namespace production && kubectl create namespace production`.
Let’s start by exploring what changes we need to do in the Deployment itself:
The Deployment object is as is; nothing new here. The new config lies in the 2nd section (starting line 27). Here we are basically publishing the Pods created by the Deployment via a k8s Service of type `NodePort`:
Kubernetes master will allocate a port from a flag-configured range (default: 30000–32767), and each Node will proxy that port (the same port number on every Node) into your Service.
Our Deployment is exposing port `8080`, so the Service points to that port and exposes port `80` on the Nodes themselves. Ingress resources can't talk to the Deployment object directly, but rather need a Service to publish the Deployment first.
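The Service half of that config might look like this (the name matches the one referenced in the routing diagram later):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kubeapp-production-service
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: kubeapp
    env: production
```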
Similarly to what we did previously, here goes the flow for rolling out the Deployment + Service combo:
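With both objects in one file, a single `apply` rolls them out together (file name is an assumption):

```shell
kubectl --namespace=production apply -f app-production.yml
```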
Querying the k8s API shall show us that Pods are collectively exposed as a Service that listens on an internal IP address. This is also done by the means of a Selector that points to the Label used in the Pods template.
Once the Service is published, we can start creating our Ingress resources. Below we are defining a minimal resource:
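A sketch of a minimal Ingress resource (the resource name is an assumption):

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kubeapp-ingress
spec:
  backend:
    serviceName: kubeapp-production-service
    servicePort: 80
```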
As with all other k8s Objects, the type is set to `Ingress` along with metadata. Ingress resources need to be aware of which backends to route requests to; we only have one backend here, pointing to the Service defined earlier.
In this situation we are not listing any routes, which means this will act as the default backend: all requests shall go to it. Once the health checks on the backend start reporting `HEALTHY`, we can hit the Ingress address and start testing our setup.
Take note of line 16: we need to wait for the backends to report HEALTHY before we're able to access the Ingress resource IP address.
The Canary release is very similar to Stable. The only differences would be the number of replicas, the Docker image to use (`2.0` in our case) and related metadata. Rolling out the Deployment can be done using the usual flow:
For us to be able to access the Canary release on a separate subdomain, we need to add some Ingress routes that point to the Canary Service. The approach we’re using here is name-based virtual hosting, where the Ingress resource will check the host header and judge which service it should send the traffic to.
```
        foo.bar --|            |-> kubeapp-production-service 80
                  | Ingress IP |
 canary.foo.bar --|            |-> kubeapp-canary-service 80
```
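The routing rules config might look like this (a sketch; the line-by-line notes below refer to the original file in the repo):

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kubeapp-ingress
spec:
  backend:
    serviceName: kubeapp-production-service
    servicePort: 80
  rules:
  - host: canary.foo.bar
    http:
      paths:
      - backend:
          serviceName: kubeapp-canary-service
          servicePort: 80
  - host: foo.bar
    http:
      paths:
      - backend:
          serviceName: kubeapp-production-service
          servicePort: 80
```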
Lines 6–8: Sets a default backend. This will match any request that lands on the IP address of the Ingress with no host header set, directing it to the production Service that contains the production Pods.
Line 9: Starts a section for the Ingress resource rules.
Lines 10–15: Matches the domain `canary.foo.bar` and directs the traffic to the k8s Service named `kubeapp-canary-service` on port `80`.
Lines 16–21: Does the same for the main domain we're using for our Service, `foo.bar`.
This technique is very useful for testing a Canary release in production environments without affecting the Stable version being consumed by your users. The same Ingress routes can be modified to work on different URL paths on the same domain instead of using vhosts (a.k.a fanout).
Promoting the Canary version to production can be executed using the same flow discussed earlier: either use `set image` or modify the Docker image used in the production config.
So here you go, putting a Canary deployment setup on top of k8s isn’t that hard after all. Things can even be further developed with Multi-stage Canary deployments and more refined control of traffic weight.
Kubernetes is a gigantic project that allows companies of all shapes and sizes to adopt the kind of infrastructure fabric that used to be available only to large corps (k8s is inspired by Borg, after all). Best of all, the effort needed to implement setups like this is relatively small compared to other systems.
This post just scratched the surface of one use case. k8s has something for everyone, dig through the concepts and see where it can fit for you!
Keep ’em flying! 🐤 🚀