Releasing backward-incompatible changes: Kubernetes, Jenkins, Prometheus Operator, Helm and Traefik

First I wrote about the different types of services you can run in a Kubernetes cluster.

Then I showed how to build a workflow driven by Continuous Integration and Continuous Delivery, using Consumer Driven Contracts to ensure we don’t introduce breaking changes between consumer and provider services.

This article focuses on how to introduce backward-incompatible changes to an interface in a safe manner. We’ll leverage the Jenkins Kubernetes Plugin, Prometheus Operator, Helm Repositories and Traefik to deploy, test, monitor and control different versions of our services in parallel.

Monitoring

We need a way to monitor the different versions of our services. Prometheus is an open-source framework for monitoring and alerting backed by the Cloud Native Computing Foundation.

Prometheus Operator

There is a stable chart in the Kubernetes Charts Repository that lets you get Prometheus up and running quickly, and it’s very useful for trying it out, but for this demo I wanted to try Prometheus Operator. By creating new Kubernetes ThirdPartyResources, it provides easy, high-level monitoring definitions and management of Prometheus instances.

At the time of writing, the Prometheus Operator Docker image does not include the “alertmanager.monitoring.coreos.com” resource, so I compiled the source code from https://github.com/coreos/prometheus-operator, then built and deployed my own Docker image to operate Prometheus on my Kubernetes cluster. After that:

Prometheus TPR

Cool! Now my services can declare that they want to be monitored by including something as simple as this in their definition:

apiVersion: "monitoring.coreos.com/v1alpha1"
kind: "ServiceMonitor"
metadata:
  name: "example-app"
  labels:
    app: example-app
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: "web"
    interval: 30s

Prometheus Operated Helm Chart

I wanted to reuse this power across all my services, so I created a “Prometheus-Operated” Helm Chart. This allows me to create a full monitoring and alerting solution for each service just by including this chart as a dependency when deploying any service with Helm:

dependencies:
- name: prometheus-operated-chart
  version: 0.0.1
  repository: http://localhost:8879

and specifying the right values for each service at run time in the values.yaml file:

prometheus-operated-chart:
  componentMonitored: catalog
  prometheus:
    rules:
      test.rules: |
        ALERT httpRequestsSuspicious
          IF http_requests_suspicious > 2
        ALERT remoteCallSlow
          IF remote_calls_summary > 1

Instrumenting

Before you can monitor your services, you need to add instrumentation to their code via one of the Prometheus client libraries. As my services are written in Ruby, I used https://github.com/prometheus/client_ruby to expose the different kinds of metrics.

You can use the Rack middleware it provides: one exposes a metrics HTTP endpoint to be scraped by a Prometheus server (Exporter) and the other traces all HTTP requests (Collector).

require 'prometheus/client/rack/collector'
require 'prometheus/client/rack/exporter'
use Prometheus::Client::Rack::Collector
use Prometheus::Client::Rack::Exporter

Here you can also see some custom metrics I included for my service to count suspicious requests and to track slow remote calls.
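
To make the idea of a custom metric concrete, here is a minimal, dependency-free sketch of a Rack-style middleware that counts suspicious requests and serves the count in the Prometheus text format. It deliberately does not use client_ruby, so treat it as a stand-in for the concept rather than the code running in my services. The class name SuspiciousRequestCounter and the "missing User-Agent" heuristic are assumptions made purely for illustration; only the metric name http_requests_suspicious comes from the alert rules shown earlier.

```ruby
# Minimal Rack-style middleware. No gems are required: it only follows the
# Rack calling convention (call(env) returns [status, headers, body]).
class SuspiciousRequestCounter
  METRIC = 'http_requests_suspicious'

  attr_reader :count

  def initialize(app)
    @app = app
    @count = 0
    @lock = Mutex.new
  end

  def call(env)
    # Serve the current value in the Prometheus text exposition format.
    return metrics_response if env['PATH_INFO'] == '/metrics'

    # Hypothetical heuristic: a request without a User-Agent is "suspicious".
    @lock.synchronize { @count += 1 } if env['HTTP_USER_AGENT'].to_s.empty?
    @app.call(env)
  end

  private

  def metrics_response
    body = "# TYPE #{METRIC} counter\n#{METRIC} #{@count}\n"
    [200, { 'Content-Type' => 'text/plain; version=0.0.4' }, [body]]
  end
end

# Usage with a stub downstream application.
app = SuspiciousRequestCounter.new(->(_env) { [200, {}, ['OK']] })
app.call({ 'PATH_INFO' => '/items' })                                # counted
app.call({ 'PATH_INFO' => '/items', 'HTTP_USER_AGENT' => 'curl' })   # not counted
_status, _headers, body = app.call({ 'PATH_INFO' => '/metrics' })
puts body.first
```

In a real service the client library takes care of the registry, labels and exposition format, so a hand-rolled middleware like this is only useful for understanding what ends up behind the scraped endpoint.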

Visibility

With these three components in place, your services immediately get visibility and alerting support right after being deployed.

Prometheus Service Target
Prometheus Service Alerts
Prometheus Service Graphs
Prometheus AlertManager Slack Notification

Releasing

We’ll use Helm Charts, Repositories and Chart dependencies to ensure that the provider services introducing backward-incompatible changes are only consumed by those which support the new version.

Helm Dependencies

The Chart for the consumer service will drive the deployment of the new versions.

Assuming the services we have already released are running and communicating on “version 1”, we’ll now create a new Helm Release in a different Namespace, with both providers and consumers talking to each other using only “version 2”. We specify the dependencies in the Chart:

dependencies:
- name: prometheus-operated-chart
  version: 0.0.1
  repository: http://localhost:8879
- name: stock
  version: 2.0.0
  repository: http://localhost:8878

The steps to manage and deploy the Charts will look something like this:

helm init
helm repo add stock-service-chart http://localhost:8878
helm repo add prometheus-operated-chart http://localhost:8879
helm dependency update charts/catalog
helm upgrade "${RELEASE_NAME}" charts/catalog \
  --set=IngressDomain="${INGRESS_DOMAIN}",prometheus-operated-chart.IngressDomain="${INGRESS_DOMAIN}",prometheus-operated-chart.prometheus.alertmanagers.namespace="${NAMESPACE}" \
  --install --namespace="${NAMESPACE}"
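
If you drive the deployment from a script rather than typing the command by hand, the long --set value is easy to get wrong. The following is a small sketch, assuming a Ruby wrapper, that assembles the same helm upgrade invocation as an argument array so that no shell quoting is involved. The helper name helm_upgrade_args and the sample release, namespace and domain values are hypothetical; the flags and value keys are the ones used in the command above.

```ruby
# Build the `helm upgrade --install` invocation as an argument array, so it
# can be handed to Kernel#system without going through a shell.
def helm_upgrade_args(release:, namespace:, ingress_domain:, chart: 'charts/catalog')
  values = {
    'IngressDomain' => ingress_domain,
    'prometheus-operated-chart.IngressDomain' => ingress_domain,
    'prometheus-operated-chart.prometheus.alertmanagers.namespace' => namespace
  }
  set_value = values.map { |key, value| "#{key}=#{value}" }.join(',')

  ['helm', 'upgrade', release, chart,
   "--set=#{set_value}",
   '--install', "--namespace=#{namespace}"]
end

# Hypothetical values for a "version 2" release.
args = helm_upgrade_args(release: 'catalog-v2', namespace: 'v2',
                         ingress_domain: 'v2.188.166.173.218.nip.io')
puts args.join(' ')
# To actually run it (requires helm on the PATH): system(*args)
```

Keeping the values in a hash makes it straightforward to add or override a key per release without editing one long string.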

Helm self-contained Repositories

To make different versions of the Chart dependencies available on demand, I leveraged Docker to create self-contained Helm Repositories, e.g.:

FROM alpine:3.4
RUN apk --update add ca-certificates
RUN mkdir /usr/prometheus-operated-chart
COPY . /usr/prometheus-operated-chart
RUN /usr/prometheus-operated-chart/helm_installer.sh
WORKDIR /usr/prometheus-operated-chart 
RUN helm lint
RUN helm package --save=false .
CMD helm serve --address 0.0.0.0:8879 --repo-path .

This means you can have any specific version of your dependency available locally at any time just by running:

docker run enxebre/prometheus-operated-chart:VERSION

Or using a Kubernetes Pod Template for a Jenkins slave:

containerTemplate(name: 'stock-service-chart', image: 'enxebre/stock-service-chart:v2', ttyEnabled: true)

So far we have deployed two parallel releases, controlling which clients consume from which providers, with independently operated Prometheus and Alertmanager instances for each version and each service.

Managing access

Traefik

We rely on Traefik to expose the externally facing services by including Kubernetes Ingress rules as part of the Charts.

Exposing versions in Parallel

In this example “catalog.188.166.173.218.nip.io” will give you access to the services running on “v1” whereas “catalog.v2.188.166.173.218.nip.io” will access services supporting “v2”.

Backend for v1
Backend for v2

The diagram would look like:

Parallel Releases

Coexisting on the same entry-point:

We could now use Traefik to do a Canary Release by telling the ingress rule to pick up both services “v1” and “v2” for the same entry point:

kubectl apply -f merge-ingress.yaml
Backends for v1 and v2 on the same entry-point
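
Conceptually, once both versions sit behind the same entry point, each incoming request is handed to one of the available backends in turn. The sketch below simulates that with a plain weighted round-robin picker. It is not Traefik's implementation, and the backend names and the 3:1 weights are hypothetical; it is only meant to show how to reason about the share of traffic that reaches “v2” during a canary release.

```ruby
# Round-robin over a weighted list of backends. A weight of n simply means
# the backend appears n times in the rotation.
class RoundRobin
  def initialize(weights)
    @rotation = weights.flat_map { |backend, weight| [backend] * weight }
    @index = 0
  end

  def next_backend
    backend = @rotation[@index % @rotation.size]
    @index += 1
    backend
  end
end

# Hypothetical split: three parts "v1" to one part "v2".
balancer = RoundRobin.new({ 'catalog-v1' => 3, 'catalog-v2' => 1 })
hits = Hash.new(0)
100.times { hits[balancer.next_backend] += 1 }
hits.each { |backend, count| puts "#{backend}: #{count}" }
```

With these weights, 100 requests split into 75 for “v1” and 25 for “v2”; with equal weights the split would be even.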

The diagram would look something like:

Versions coexist

Decommissioning v1

Eventually we could get rid of all the services running on “v1”:

helm delete qv1
Only backend for v2 on the initial entry-point

The diagram now would be simply:

Decommission “v1”

Automating

Kubernetes Plugin for Jenkins

Of course, we want to wrap up this whole process in a continuously integrated and automated fashion. I’m using Jenkins Pipelines and the Kubernetes Plugin.

This allows you to run Jobs on dynamic slaves, provisioned as Kubernetes Pods tailored to your needs.

You can define your Pipeline in a Jenkinsfile for every service and get a good visualisation of the different stages.

In this example I use the Pod template for making a specific version of my self-contained Helm Chart Repositories available at build time and deploy the right versions of my services.

Jenkinsfile
Jenkins Pipeline Visualisation

Video Demo

This is a quick demo showing the whole process.

Video Demo Deploying Backward-incompatible Changes

Summary

I hope this article gives you some ideas on how to use these tools to address the main points described above (Monitoring, Releasing, Managing external Access and Automation), how to build your own solution for deploying backward-incompatible changes and how to solve similar problems.

Source Code

To create this demo I deployed a Kubernetes Cluster on DigitalOcean using https://github.com/capgemini/kubeform. Then I deployed Jenkins with Helm using the stable Chart.

Consumer service: https://github.com/enxebre/catalog-service/

Provider service: https://github.com/enxebre/stock-service

Prometheus Operated Chart: https://github.com/enxebre/prometheus-operated-chart