Canary deployment using Argo Rollouts and Nginx

Luigi Toziani
8 min read · Oct 4, 2022


What is Argo Rollouts

Argo Rollouts is a Kubernetes controller that helps you adopt state-of-the-art deployment practices, such as canary or blue-green, adding an extra layer of control over the entire process.

For example, if we were to implement a canary deployment by ourselves, we would have to take care of a lot of manual work that needs to be automated somehow. This is a list of things that first come to mind:

  • Edit the ingress resource to manage the traffic percentage when you start, and again each time you want to increase it until you reach 100%
  • Add custom annotations to correctly distinguish between canary and production resources
  • Check canary service metrics, such as latency or error rate, to decide whether to proceed or roll back
  • Eventually handle the rollback itself, manually changing annotations on the involved resources
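To give an idea of the manual work involved, with the Nginx ingress controller the canary split is driven by annotations on a second ingress resource. A hand-managed version might look like the sketch below (the `myapp` names are illustrative, not part of this demo):

```yaml
# Hypothetical hand-managed canary ingress (names are illustrative).
# The nginx.ingress.kubernetes.io annotations are what you would have
# to create and then update by hand at every weight increment.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "5"   # bump to 20, 50, ... by hand
spec:
  rules:
  - host: myapp.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp-canary
            port:
              number: 80
```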

Well, Argo Rollouts lets us define the deployment with a declarative approach: all the points above, and many others I've omitted, go into a dedicated manifest that can even be tracked in a repository to follow GitOps practices.

Furthermore, I report below an answer taken from the official FAQ that makes clear there is no dependency on Argo CD in case we only want to introduce rollouts.

Argo Rollouts is a standalone project. Even though it works great with Argo CD and other Argo projects, it can be used on its own for Progressive Delivery scenarios. More specifically, Argo Rollouts does NOT require that you also have installed Argo CD on the same cluster.

Demo Requirements

Below are the software requirements to run this demo:

  1. minikube installed (see here for the official minikube documentation)
  2. Nginx ingress controller enabled on minikube
  3. Argo Rollouts controller installed in the cluster
  4. Argo Rollouts kubectl plugin to control the deployment process

1. Minikube

First of all, we have to start a new cluster using minikube. We will use Kubernetes version 1.21 in order to be compatible with the Nginx ingress manifests used for this demo.

minikube start --kubernetes-version=v1.21.0

2. Nginx ingress controller

We need to enable the ingress controller on our minikube cluster by simply launching this command:

minikube addons enable ingress

Now we can configure the machine's hosts file with a custom hostname mapped to the minikube cluster IP (x.x.x.x) to simulate service name resolution:

# sudo vi /etc/hosts and add the line below
X.X.X.X  rollouts-demo.local

To find the cluster IP you can simply run the following (or use `minikube ip`):

$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.49.2:8443

3. Argo rollouts controller

Once the cluster is up, we can set up the Argo Rollouts controller as described in the official documentation here. The installation creates the custom Argo Rollouts resource definitions that will be used later to describe the deployment process.

kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

The resources created by the above commands can be inspected using the following command:

kubectl get all -n argo-rollouts

4. Argo CLI plugin

The plugin is considered optional by the documentation (here), but we will use it in this demo because it provides two ways to control the deployment process: via CLI or by exposing a web GUI.

This is the script to download the plugin for Linux, but you can explore the official documentation for the full instructions:

curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64

Once downloaded, let’s make it executable and move it onto the PATH:

chmod +x ./kubectl-argo-rollouts-linux-amd64
sudo mv ./kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts

Canary demo

OK, we’re now ready to see Argo Rollouts in action! We need to create the proper Kubernetes resources to configure our canary deployment.

Define the deployment declaratively

We’re going to define our deployment process as a Kubernetes manifest custom resource:

kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/rollout.yaml

I report below the content of the rollout manifest we’ve just applied, to better describe the deployment we want to implement: temporarily splitting 5% of the traffic to the canary service.

apiVersion: argoproj.io/v1alpha1
kind: Rollout                              # 1. Custom Argo Rollout resource
metadata:
  name: rollouts-demo
spec:
  replicas: 1
  strategy:
    canary:                                # 2. Type of deployment
      canaryService: rollouts-demo-canary
      stableService: rollouts-demo-stable
      trafficRouting:
        nginx:                             # 3. Ingress definition
          stableIngress: rollouts-demo-stable
      steps:
      - setWeight: 5                       # 4. 5% of the traffic to canary service
      - pause: {}
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollouts-demo
  template:
    metadata:
      labels:
        app: rollouts-demo
    spec:
      containers:
      - name: rollouts-demo
        image: argoproj/rollouts-demo:blue # 5. stable image
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: 32Mi
            cpu: 5m
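Note that the steps list is not limited to a single weight. As a sketch (not part of this demo's manifest), we could extend it with intermediate stages and timed pauses:

```yaml
# Hypothetical multi-stage steps section (not the one used in this demo):
steps:
- setWeight: 5
- pause: {}                # wait indefinitely for a manual promotion
- setWeight: 50
- pause: {duration: 10m}   # wait 10 minutes, then continue automatically
- setWeight: 100
```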

Let’s continue by adding the two services and the ingress resource:

kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/services.yaml
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/ingress.yaml

The result of the commands should look like this (except for the IPs):

$ kubectl get rollout
NAME            DESIRED   CURRENT   UP-TO-DATE   AVAILABLE
rollouts-demo   1         1         1            1

$ kubectl get service
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
rollouts-demo-canary   ClusterIP   10.96.6.241     <none>        80/TCP    33s
rollouts-demo-stable   ClusterIP   10.102.229.83   <none>        80/TCP    33s

$ kubectl get ingress
NAME                                        CLASS    HOSTS                 ADDRESS        PORTS   AGE
rollouts-demo-stable                        <none>   rollouts-demo.local   192.168.64.2   80      36s
rollouts-demo-rollouts-demo-stable-canary   <none>   rollouts-demo.local   192.168.64.2   80      35s

Please note that Nginx needs a second ingress resource to implement canary traffic splitting. This is automatically managed by the Rollouts controller, which creates a cloned ingress resource named with the pattern <ROLLOUT-NAME>-<INGRESS-NAME>-canary. The generated configuration differs for other load balancers (for example, AWS ALB), so it’s important to keep this in mind and check the documentation for other specific cases.
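The generated name follows that pattern mechanically; a quick sanity check against the `kubectl get ingress` output above:

```shell
# Derive the name of the canary ingress generated by the Rollouts controller
ROLLOUT_NAME="rollouts-demo"
INGRESS_NAME="rollouts-demo-stable"
echo "${ROLLOUT_NAME}-${INGRESS_NAME}-canary"
# -> rollouts-demo-rollouts-demo-stable-canary
```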

Open your browser and connect through our ingress to the application we defined in the rollout manifest. Remember that the hostname below is the one specified in the Nginx ingress resource:

http://rollouts-demo.local

Once connected, the Argo Rollouts demo application simulates live traffic (100% of the traffic is flowing through the rollouts-demo-stable service).

Start the rollout

Now everything is ready to execute a canary rollout! Argo Rollouts provides two ways to control it, both via the Argo Rollouts kubectl plugin:

  • CLI
  • Web GUI

For this demo we will use the web GUI, by simply running the following command, which outputs the location of the Argo Rollouts dashboard:

$ kubectl-argo-rollouts dashboard
INFO[0000] Argo Rollouts Dashboard is now available at http://localhost:3100/rollouts

Once connected, the application shows a box for each rollout defined in the selected namespace. For this demo we only see one box, in the default namespace (you can filter namespaces using the text field on the top right).

We can click on the rollout box to see the details we’ve defined inside the rollout manifest, and then update the image in the containers section from argoproj/rollouts-demo:blue to the new version of the application, argoproj/rollouts-demo:yellow.

Once we confirm, the rollout starts, and after a few seconds it pauses, as specified in the rollout steps inside the manifest (see the states box highlighted in red in the picture). At this moment 5% of the traffic is flowing through the canary service and the rest through the original stable one.

If we go back to the application in the browser, we notice the 5% of traffic visible as a green area in the bar.
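To make the 5% concrete, a back-of-the-envelope calculation (just arithmetic, not something the demo runs):

```shell
# With setWeight: 5, roughly 5 requests out of every 100 hit the canary.
TOTAL=1000
WEIGHT=5
CANARY=$(( TOTAL * WEIGHT / 100 ))
echo "~${CANARY} of ${TOTAL} requests go to the canary"
# -> ~50 of 1000 requests go to the canary
```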

To better understand what is going on under the hood, we can run the following Argo Rollouts plugin command and see that a separate ReplicaSet and pod serve the canary service receiving 5% of the traffic:

kubectl-argo-rollouts get rollout rollouts-demo

Finish the deployment

At this point, we have two options:

  • proceed and move 100% of the traffic to the new service
  • roll back the deployment to the previous (blue) service

To proceed, we can click the promote or promote-full button to complete the deployment, moving 100% of the traffic to the new yellow service. In this case the result is the same for both buttons, but the first one is useful when the deployment has multiple stages, such as an intermediate canary stage with a 50% weight. promote-full instead skips every intermediate stage to reach 100% on the new service directly.
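As a toy model of the difference (this is a simplified simulation of the step index only; it does not talk to the cluster, and promote in reality continues executing steps until the next pause):

```shell
# Toy simulation: promote moves past the current pause to the next step,
# promote-full jumps straight to the end of the steps list.
steps=("setWeight:5" "pause" "setWeight:50" "pause" "setWeight:100")
current=1                        # rollout is stuck at the first pause
next=$(( current + 1 ))          # promote
last=$(( ${#steps[@]} - 1 ))     # promote-full
echo "promote      -> ${steps[$next]}"
echo "promote-full -> ${steps[$last]}"
# -> promote      -> setWeight:50
# -> promote-full -> setWeight:100
```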

Going back to the application in the browser, we can see a visual confirmation that 100% of the traffic is now on the yellow service.

We can run the CLI plugin command again to see the resource state and observe that the old pods are automatically destroyed and the new ReplicaSet is marked as stable.

Conclusion

In this demo we have seen how Argo Rollouts can be adopted to introduce well-defined procedures for a deployment process. A manual canary deployment turns into a declarative resource that lives in a repository and can be configured exactly the way we expect it to work, with a clear set of steps that reduce human error and improve the overall reliability of the process.

Furthermore, another great feature of Argo Rollouts, which we omitted for the sake of simplicity, is analysis-based progressive delivery. Simply put, it allows you to introduce fully automatic checks based on specific application metrics (e.g. latency thresholds, error-rate thresholds and so on) in order to automatically promote the deployment or roll it back to the previous state.

Clearly, this is just a demo to get an idea; introducing this kind of automation in an established production environment can become challenging, so it is worth reading the migration section of the official documentation for the related caveats.
