
In a recent project I was working on, the objective was to set up release pipelines for a canary / phased rollout of an application's microservices. The deployment target for these microservices was a Kubernetes cluster (AKS).

This post assumes familiarity with Kubernetes, Helm, and Istio traffic management.

This post describes the key requirements, the release strategy selected for those requirements, and the details of how each stage is implemented.

In a subsequent post I will detail how the release stages described in this post map to an Azure DevOps Release pipeline.


The Key Requirements

The high-level requirement was to perform a canary release of the application's services into the production environment.

Key Requirements / Constraints:

  • Each microservice should be packaged as a separate Helm chart.
  • Different teams manage the different microservices, and each team should be able to release its microservice independently of the others.
  • The Istio service mesh was installed on the Kubernetes cluster.
  • For the initial phase of the project only Helm and Istio would be available on the cluster; during this phase the use of a tool like Flagger was not an option.
  • It should be possible for the team to roll out the new version of the application using the same Helm chart in the following phases:

10% of traffic routed to the new version

90% of traffic routed to the new version

100% of traffic routed to the new version

  • After each of these stages, manual judgement was needed before moving to the next release stage.
  • Rollback to the previous production version of the application should be possible at each of these stages using the same Helm chart.

The Kubernetes and Istio resources used to release each micro service

The following Kubernetes and Istio resources are used for each microservice:

  • A production deployment for the microservice. 100% of the traffic is routed to pods of this deployment during steady state, which is the state when there is no release of the application in progress.
  • A canary deployment for the microservice. During steady state there will be 0 pods of this deployment, and no traffic will be routed to them. When a release is in progress, 10%, 90%, and then 100% of the traffic will be routed to pods of this deployment during the intermediate stages.
  • A production service corresponding to the microservice.
  • An Istio virtual service for the microservice, used to control the weight of traffic going to the production deployment pods and the canary deployment pods.
  • An Istio destination rule with subsets for the production and canary deployments.
  • An optional Istio gateway, if the service needs to be accessible from outside the cluster.

The use of these resources will become clear when we look at the details of Stage 1 in the section below.


A look at the GitHub Repository

The repository contains the sample code for the microservice, the Dockerfile, the Helm chart, and the Helm commands for each stage of the release. The repository also contains sample Kubernetes resources generated from the Helm chart using the helm template command.

The sample service used is the Istio product page app; the application code and the Dockerfile have been taken from the Istio GitHub repository.

Repository Structure


The key sections of some of the files

The Helm Values file

Let us look at the Helm values file:

service:
  name: product-page-svc
  port: 9080
productionDeployment:
  replicaCount: 2
  weight: 100
  image:
    repository: myacrepo.azurecr.io/bookinfo/productpage
    tag: 83
    pullPolicy: IfNotPresent
canaryDeployment:
  replicaCount: 0
  weight: 0
  image:
    repository: myacrepo.azurecr.io/bookinfo/productpage
    tag: 83
    pullPolicy: IfNotPresent

The service section contains service details such as the service name and port. The productionDeployment section configures the number of pods, the traffic weight used for routing traffic to the production pods, and the repository and tag for the container image. There is a similar canaryDeployment section. The values shown above represent the steady state: 100% of traffic is routed to the production deployment pods, there are 0 replicas of the canary deployment, and the production and canary deployments point to the same container image. During a canary release the canary container image is pointed to the next version of the application.

The container image tagging strategy used in this case is that the tag corresponds to the id of the build which generates the container image and pushes it to the container registry. So if there is a new version of the application, a new build is triggered, and the container image for the next version of the application gets the tag 84 (83 + 1).

We can choose any container image tagging strategy we like.
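As an illustrative sketch of the build-id strategy, the full image reference can be assembled from the registry path and the CI build id. The variable names below are hypothetical; in practice the build id would be injected by the CI system:

```shell
# Illustrative only: REPO and BUILD_ID would come from the CI system
REPO=myacrepo.azurecr.io/bookinfo/productpage
BUILD_ID=84   # the build after build 83 produces tag 84
IMAGE="${REPO}:${BUILD_ID}"
echo "$IMAGE"
```

This computed image reference is what would be passed to the canaryDeployment.image values at release time.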


The Production and Canary Deployment

Production Deployment File

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: productpage
  labels:
    app: productpage
    canary: "false"
spec:
  replicas: 2
  selector:
    matchLabels:
      app: productpage
      canary: "false"
  template:
    metadata:
      labels:
        app: productpage
        canary: "false"
    spec:
      containers:
        - name: productpage
          image: "myacrepo.azurecr.io/bookinfo-canary/productpage:83"
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 9080
              protocol: TCP
....

Canary Deployment File

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: productpagecanary
  labels:
    app: productpage
    canary: "true"
    chart: productpage-0.1.0
spec:
  replicas: 0
  selector:
    matchLabels:
      app: productpage
      canary: "true"
  template:
    metadata:
      labels:
        app: productpage
        canary: "true"
    spec:
      containers:
        - name: productpagecanary
          image: "myacrepo.azurecr.io/bookinfo-canary/productpage:83"
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 9080
              protocol: TCP
.....

The key difference between the production deployment file and the canary deployment file is that pods of the production deployment have the canary label set to "false", whereas the canary deployment pods have it set to "true". Another difference is that during steady state the replicaCount of the canary deployment is 0, so there are no canary pods during steady state.
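These labels and replica counts are driven from the Helm values file. The following is a hypothetical excerpt of the canary deployment template; the field paths mirror the values file shown earlier, but the exact template file in the repository may differ:

```yaml
# Hypothetical excerpt of the canary deployment template
spec:
  replicas: {{ .Values.canaryDeployment.replicaCount }}
  template:
    metadata:
      labels:
        app: productpage
        canary: "true"
    spec:
      containers:
        - name: productpagecanary
          image: "{{ .Values.canaryDeployment.image.repository }}:{{ .Values.canaryDeployment.image.tag }}"
          imagePullPolicy: {{ .Values.canaryDeployment.image.pullPolicy }}
```

Because replicas and the image tag are templated, a single helm upgrade with different --set overrides can scale the canary up or down and switch its image version.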

The Kubernetes Service

apiVersion: v1
kind: Service
metadata:
  name: product-page-svc
spec:
  ports:
    - port: 9080
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: productpage
......

The Istio Destination Rule

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
.........
spec:
  host: product-page-svc.bookinfo-k8s-helm-istio-canary.svc.cluster.local
  .......
  subsets:
    - name: production
      labels:
        canary: "false"
    - name: canary
      labels:
        canary: "true"

The Istio destination rule defines the production and canary subsets. For the production subset, traffic is routed to pods with the canary label set to "false"; for the canary subset, traffic is routed to pods with the label set to "true".

The host in the destination rule is the FQDN of the service, formed from the service name and the Kubernetes namespace.
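This FQDN follows the standard Kubernetes in-cluster DNS pattern for services, `<service-name>.<namespace>.svc.cluster.local`. A quick sketch of how it is formed:

```shell
# Standard Kubernetes service DNS name: <service>.<namespace>.svc.cluster.local
SVC_NAME=product-page-svc
NAMESPACE=bookinfo-k8s-helm-istio-canary
FQDN="${SVC_NAME}.${NAMESPACE}.svc.cluster.local"
echo "$FQDN"
```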

The Istio Virtual Service

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
...
spec:
  hosts:
    - product-page-svc.bookinfo-k8s-helm-istio-canary.svc.cluster.local
  gateways:
    - product-page
  http:
    - route:
        - destination:
            host: product-page-svc.bookinfo-k8s-helm-istio-canary.svc.cluster.local
            subset: production
            port:
              number: 9080
          weight: 100
        - destination:
            host: product-page-svc.bookinfo-k8s-helm-istio-canary.svc.cluster.local
            subset: canary
            port:
              number: 9080
          weight: 0

The virtual service controls the percentage of traffic routed between the production and canary subsets defined in the destination rule.
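The weights themselves come from the values file, which is what lets each helm upgrade shift traffic between the subsets. The following is a hypothetical excerpt of the virtual service template; the field paths mirror the values file shown earlier, but the exact template file may differ:

```yaml
# Hypothetical excerpt of the virtual service template
http:
  - route:
      - destination:
          host: product-page-svc.bookinfo-k8s-helm-istio-canary.svc.cluster.local
          subset: production
          port:
            number: {{ .Values.service.port }}
        weight: {{ .Values.productionDeployment.weight }}
      - destination:
          host: product-page-svc.bookinfo-k8s-helm-istio-canary.svc.cluster.local
          subset: canary
          port:
            number: {{ .Values.service.port }}
        weight: {{ .Values.canaryDeployment.weight }}
```

Setting productionDeployment.weight and canaryDeployment.weight at the helm command line is then all that is needed to move between the release stages described below.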


Details of the release stages

In the stage diagrams, changes in values from the previous stage are highlighted in yellow. In the stages below, the container image tag for the current version of the app is 83 and that for the new version is 84.

Stage 1: Steady State — 100% of traffic routed to the production deployment, which has the current version of the application. No release is in progress; the canary deployment has 0 replicas and no traffic is routed to canary deployment pods.

Helm Command

helm upgrade --install \
  --namespace bookinfo-k8s-helm-istio-canary \
  --values ./productpage/chart/productpage/values.yaml \
  --set productionDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set productionDeployment.image.tag=83 \
  --set productionDeployment.weight=100 \
  --set productionDeployment.replicaCount=2 \
  --set canaryDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set canaryDeployment.image.tag=83 \
  --set canaryDeployment.replicaCount=0 \
  --set canaryDeployment.weight=0 \
  --wait productpage ./productpage/chart/productpage

Stage Overview

Stage 2: Release in progress — 2 canary replicas with the next version of the application (tag 84). 10% of traffic routed to canary deployment pods, 90% of traffic routed to production deployment pods.

Helm Command

helm upgrade --install \
  --namespace bookinfo-k8s-helm-istio-canary \
  --values ./productpage/chart/productpage/values.yaml \
  --set productionDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set productionDeployment.image.tag=83 \
  --set productionDeployment.weight=90 \
  --set productionDeployment.replicaCount=2 \
  --set canaryDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set canaryDeployment.image.tag=84 \
  --set canaryDeployment.replicaCount=2 \
  --set canaryDeployment.weight=10 \
  --wait productpage ./productpage/chart/productpage

Stage Overview

Stage 3: 90% of traffic routed to canary deployment pods, 10% of traffic routed to production deployment pods.

Helm Command

helm upgrade --install \
  --namespace bookinfo-k8s-helm-istio-canary \
  --values ./productpage/chart/productpage/values.yaml \
  --set productionDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set productionDeployment.image.tag=83 \
  --set productionDeployment.weight=10 \
  --set productionDeployment.replicaCount=2 \
  --set canaryDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set canaryDeployment.image.tag=84 \
  --set canaryDeployment.replicaCount=2 \
  --set canaryDeployment.weight=90 \
  --wait productpage ./productpage/chart/productpage

Stage Overview

Stage 4: 100% of traffic routed to canary deployment pods.

Helm Command

helm upgrade --install \
  --namespace bookinfo-k8s-helm-istio-canary \
  --values ./productpage/chart/productpage/values.yaml \
  --set productionDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set productionDeployment.image.tag=83 \
  --set productionDeployment.weight=0 \
  --set productionDeployment.replicaCount=2 \
  --set canaryDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set canaryDeployment.image.tag=84 \
  --set canaryDeployment.replicaCount=2 \
  --set canaryDeployment.weight=100 \
  --wait productpage ./productpage/chart/productpage

Stage Overview

Stage 5: Perform a rolling update of the production deployment pods to the new version of the app (tag 84), while 100% of traffic is routed to canary deployment pods. This step is needed before traffic can be routed back to the production deployment pods.

Helm Command

helm upgrade --install \
  --namespace bookinfo-k8s-helm-istio-canary \
  --values ./productpage/chart/productpage/values.yaml \
  --set productionDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set productionDeployment.image.tag=84 \
  --set productionDeployment.weight=0 \
  --set productionDeployment.replicaCount=2 \
  --set canaryDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set canaryDeployment.image.tag=84 \
  --set canaryDeployment.replicaCount=2 \
  --set canaryDeployment.weight=100 \
  --wait productpage ./productpage/chart/productpage

Stage Overview

Stage 6: Switch 100% of traffic back to the production deployment pods, which now have the latest version of the application.

Helm Command

helm upgrade --install \
  --namespace bookinfo-k8s-helm-istio-canary \
  --values ./productpage/chart/productpage/values.yaml \
  --set productionDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set productionDeployment.image.tag=84 \
  --set productionDeployment.weight=100 \
  --set productionDeployment.replicaCount=2 \
  --set canaryDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set canaryDeployment.image.tag=84 \
  --set canaryDeployment.replicaCount=2 \
  --set canaryDeployment.weight=0 \
  --wait productpage ./productpage/chart/productpage

Stage Overview

Stage 7: New steady state — the canary deployment replica count is set back to 0, and 100% of traffic is handled by the production deployment pods, which now have the latest app version (container image tag 84).

Helm Command

helm upgrade --install \
  --namespace bookinfo-k8s-helm-istio-canary \
  --values ./productpage/chart/productpage/values.yaml \
  --set productionDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set productionDeployment.image.tag=84 \
  --set productionDeployment.weight=100 \
  --set productionDeployment.replicaCount=2 \
  --set canaryDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set canaryDeployment.image.tag=84 \
  --set canaryDeployment.replicaCount=0 \
  --set canaryDeployment.weight=0 \
  --wait productpage ./productpage/chart/productpage

Stage Overview

Rollback during any of the stages

We can roll back to the previous version of the app during any of the stages using the following Helm command:

helm upgrade --install \
  --namespace bookinfo-k8s-helm-istio-canary \
  --values ./productpage/chart/productpage/values.yaml \
  --set productionDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set productionDeployment.image.tag=83 \
  --set productionDeployment.weight=100 \
  --set productionDeployment.replicaCount=2 \
  --set canaryDeployment.image.repository=myacrepo.azurecr.io/bookinfo-canary/productpage \
  --set canaryDeployment.image.tag=83 \
  --set canaryDeployment.replicaCount=0 \
  --set canaryDeployment.weight=0 \
  --wait productpage ./productpage/chart/productpage

Testing the application during any of the stages

In this case the product page needs to be accessible from outside the cluster, hence the Istio gateway. To access it from your machine, we first get the external IP of the Istio ingress gateway:

kubectl get svc istio-ingressgateway  -n istio-system

Once we have this external IP, a quick way to test is to add an entry to your hosts file mapping the service FQDN to this external IP:

23.XX.YY.ZZ product-page-svc.bookinfo-k8s-helm-istio-canary.svc.cluster.local

Now the service can be accessed using the URL http://product-page-svc.bookinfo-k8s-helm-istio-canary.svc.cluster.local/productpage?u=normal

In a subsequent post we will see how we can easily create this pipeline in Azure DevOps, with multiple stages, manual judgement, and rollback provision.

Microsoft Azure

Any language. Any platform. Our team is focused on making the world more amazing for developers and IT operations communities with the best that Microsoft Azure can provide. If you want to contribute in this journey with us, contact us at medium@microsoft.com

Maninderjit (Mani) Bindra

Written by

Cloud, Containers, K8s, DevOps | CKA | Senior Software Engineer @ Microsoft
