Using CircleCI and Kubernetes to achieve seamless deployments to Google Container Engine

In a previous article, I described in detail the process of doing Blue/Green Deployments to Google Container Engine (GKE). In that article, I alluded to how we can switch between the Blue and Green deployments either manually or with a CI/CD tool such as CircleCI or Jenkins (Figure 1 below).

CircleCI is a CI/CD platform that integrates easily with GitHub and has scaled nicely to our build/test/deployment needs. It can also inspect the code base and run unit tests automatically. Configuring it to deploy to the Google Cloud Platform (GCP) was extremely smooth.

CircleCI details: This example uses version 1.0 of CircleCI. It is configured through a circle.yml file located at the root of our application. This file sets up the environment, authenticates to GCP, builds a Docker image of our Rails application and tags it. When code is merged into the master branch, “deploy.sh” is invoked to deploy the application to GKE. We’ll see more of what deploy.sh does towards the end of this article.

A sample circle.yml
Note: CircleCI 2.0 has been released and has some exciting new features, including faster builds and better configuration. This article will be updated once we’ve fully integrated and tested with 2.0 on GCP.
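To make the build-side steps a little more concrete, here is a minimal shell sketch of what a circle.yml might call to authenticate to GCP and to build and tag the image. It is not the original circle.yml. The variable names GCLOUD_SERVICE_KEY, GCP_PROJECT and APP_NAME are hypothetical, and the sketch assumes the service-account key is stored base64-encoded in a CI environment variable. CIRCLE_SHA1 is the commit hash that CircleCI exposes to the build.

```shell
#!/bin/sh
# Sketch of build-side helpers a circle.yml might call.
# Hypothetical names: GCLOUD_SERVICE_KEY, GCP_PROJECT, APP_NAME.
set -eu

# Compose the fully qualified image name: gcr.io/<project>/<app>:<commit sha>.
image_name() {
  project="$1"
  app="$2"
  sha="$3"
  echo "gcr.io/${project}/${app}:${sha}"
}

# Authenticate to GCP with a service-account key that is assumed to be
# stored base64-encoded in the GCLOUD_SERVICE_KEY environment variable.
gcp_login() {
  echo "${GCLOUD_SERVICE_KEY}" | base64 --decode > "${HOME}/gcloud-key.json"
  gcloud auth activate-service-account --key-file="${HOME}/gcloud-key.json"
  gcloud config set project "${GCP_PROJECT}"
}

# Build the Docker image from the Dockerfile at the repository root
# and tag it with the commit being built.
build_image() {
  docker build -t "$(image_name "${GCP_PROJECT}" "${APP_NAME}" "${CIRCLE_SHA1}")" .
}
```

Tagging with the commit hash rather than a fixed tag such as "latest" means every merge produces a distinct image, which makes it easy to tell which version each deployment is running.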

This article talks about seamless deployments to GKE. Let’s look into how that can be achieved.

As a refresher, here’s a sample Blue/Green Deployment in Production (from the previous article).

Blue/Green deployments (Figure 1 below) are a classic way of reducing downtime for our users in Production while updating to a newer version of our applications. This is achieved by having multiple versions of our application running in parallel and switching between them.
Figure 1: Blue/Green Deployments in Production

QA environment: We did not use a Blue/Green deployment model in our QA environment when we started our CI/CD process. Any Pull Request (PR) to GitHub would kick off a CircleCI build-and-test cycle. Once the tests passed, the PR would be merged to our master branch. This in turn would kick off a deployment to our QA server in GKE. Rinse and repeat. All was well (Figure 2 below).

Figure 2: Sample Deployment in QA with CircleCI
Problem: The above approach worked great as long as deployments were not too frequent. However, each deployment to our QA server was treated as a full deployment, and there would be a lag of up to 5 minutes while the Kubernetes pods were updated with the new image.

Soon, we had a lot of features being built in parallel, which resulted in a large number of PR merges to our master branch. We started seeing longer downtimes in QA. The various teams that depended on our QA server (UX, Product and others) would have to wait for the server “to come back up”. Our automation was working a little too well.

Solution: We decided to use an approach similar to the Blue/Green Deployments that we had in Production. Figure 3 below shows the new approach.

Figure 3

The approach:

Step 1: CircleCI would make a new build, check the Kubernetes service for the current “color” of our deployment and then deploy to the other “color” deployment. For example, if the current deployment was Blue, the new deployment would be Green and vice versa.

Step 2: CircleCI would then update the Kubernetes service to point to this new color (deployment) by patching the service with the contents of a service file for that color.

Step 3: Users would now be able to access the new version.

Details

1. We created 2 Kubernetes Deployments (Blue, Green) representing the current and new versions of our application.

2. We had 1 Kubernetes Service (BlueGreen Service) that would switch between these two deployments.

web-service-blue-green.yml pointing to a blue deployment
Note: Unlike Production (Figure 1 above), we did not create a Smoke Test Service and Smoke Test Ingress, as we wanted our applications to be deployed quickly to our QA server, where the new features would be tested.

3. We had 1 Ingress (QA Ingress) that was used to access our application from the internet.
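As an illustration of item 2, the sketch below writes a Service manifest whose selector pins traffic to one color. The service name web-service-blue-green comes from the file name above. The label app: web, the NodePort type and the port numbers are assumptions made for the example, not values from our actual configuration.

```shell
#!/bin/sh
# Sketch: print a Service manifest whose selector routes to one color.
# Assumed for illustration: label app=web, type NodePort, ports 80 -> 3000.
set -eu

write_service_manifest() {
  color="$1"   # "blue" or "green"
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: web-service-blue-green
spec:
  type: NodePort
  selector:
    app: web
    color: ${color}
  ports:
    - port: 80
      targetPort: 3000
EOF
}
```

Running `write_service_manifest blue > web-service-blue-green.yml` followed by `kubectl apply -f web-service-blue-green.yml` would create a service that only selects pods labelled color: blue. The Blue and Green deployments would then need matching color labels on their pod templates.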

We had to figure out a way to let CircleCI know “what” our current deployment version was and how to switch to the “other” version.

To solve this problem, we created 2 service files (in addition to the main one), identical except for their color selector, representing the Blue and Green deployments (color: blue and color: green in the figures below). These would be used for “patching”.

blue-service-patch.yml
green-service-patch.yml
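A patch file only needs to carry the fields that change. The sketch below generates a minimal version of the two files named above, containing just the color selector. This is a simplification for illustration: as described, our own patch files were fuller service definitions. The helper name write_patch and its optional output-directory argument are hypothetical.

```shell
#!/bin/sh
# Sketch: generate a minimal <color>-service-patch.yml.
# write_patch and its second argument are hypothetical helpers.
set -eu

write_patch() {
  color="$1"
  dir="${2:-.}"   # output directory, defaults to the current one
  case "$color" in
    blue|green) ;;
    *) echo "unknown color: $color" >&2; return 1 ;;
  esac
  # Only the selector's color changes; other selector keys are left alone
  # when the patch is merged into the existing Service.
  cat > "${dir}/${color}-service-patch.yml" <<EOF
spec:
  selector:
    color: ${color}
EOF
}
```

Calling `write_patch blue` and `write_patch green` once produces both files, which can then be committed next to the main service file.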

In our CircleCI deployment shell script (deploy.sh), we added logic to read the current blue/green version from the Kubernetes service itself, deploy our latest version to the “other” deployment, and update the service via the “kubectl patch” command, as shown in the figure below.

Discerning between Blue/Green deployments and switching automatically with CircleCI
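The switching logic can be sketched in a few shell functions. This is an approximation under stated assumptions, not the original deploy.sh: the deployment names web-blue and web-green, the container name web and the SERVICE default are hypothetical, and waiting on `kubectl rollout status` before switching is an extra safety step added for the example.

```shell
#!/bin/sh
# Sketch of the color-switching logic in a deploy script.
# Hypothetical names: deployments web-blue/web-green, container "web".
set -eu

SERVICE="${SERVICE:-web-service-blue-green}"

# Read the color the Service currently routes to from its selector.
current_color() {
  kubectl get service "$SERVICE" -o jsonpath='{.spec.selector.color}'
}

# Map a color to its counterpart; fail on anything unexpected.
other_color() {
  case "$1" in
    blue)  echo green ;;
    green) echo blue ;;
    *) echo "unexpected color: '$1'" >&2; return 1 ;;
  esac
}

# Roll the new image out to the idle deployment, wait for it to be ready,
# then point the Service at it using the matching patch file.
deploy() {
  image="$1"
  live="$(current_color)"
  idle="$(other_color "$live")"
  kubectl set image "deployment/web-${idle}" "web=${image}"
  kubectl rollout status "deployment/web-${idle}"
  kubectl patch service "$SERVICE" -p "$(cat "${idle}-service-patch.yml")"
  echo "switched ${SERVICE} from ${live} to ${idle}"
}
```

A CI step would call something like `deploy gcr.io/<project>/<app>:$CIRCLE_SHA1` from the directory holding the patch files. Because the live deployment keeps serving traffic until the final patch, users only see the brief moment in which the service selector changes.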

In conclusion: By using the above approach, we were able to reduce our application version switch time to less than 40 seconds in QA while ensuring that the whole build/test/deployment process was fully automated.

Resources:

  1. Martin Fowler’s article on Blue/Green Deployments: https://martinfowler.com/bliki/BlueGreenDeployment.html
  2. Rolling updates: https://tachingchen.com/blog/Kubernetes-Rolling-Update-with-Deployment/
  3. A great CloudNative article on this topic: https://cloudnative.io/docs/blue-green-deployment/
  4. Blue-Green Deployments: https://medium.com/@nithinmallya4/blue-green-deployments-for-a-rails-app-in-google-container-engine-gke-49ddcc1b002