Continuous Delivery at Casumo

Vaidotas
7 min read · Jun 6, 2019

Photo by Malcolm Lightbody on Unsplash

At Casumo we want to enable our fellow developers to deliver value easily. Or as easily as possible. To achieve this, we rely on tools that help us automate most aspects of the delivery cycle and help ensure the ongoing quality of the deliverables. Developers can then dedicate more focus time on the function of the code and not on the details of how it is packaged, released and deployed.

I will briefly describe the workflow that we have adopted and the tooling we use to support it, and then dig a bit into the details of each aspect or step in the process.

High-level workflow

We use GitHub as our version control platform. Our code base is split into a repository per service and the basic rule is that we create branches for each feature or change we are going to make to a component. Once we are ready with the changes, we create a pull request. Once the pull request has passed all the required quality gates, the change gets merged to the master branch. Master builds trigger releases. Releases go out to production.
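
In day-to-day terms, and with made-up branch and commit names, the flow from a developer's point of view looks roughly like this:

```bash
# Illustrative only - branch, commit message and repository names are made up.
git checkout -b add-bonus-endpoint        # one branch per feature or change
git commit -am "Add bonus endpoint"
git push -u origin add-bonus-endpoint     # pushes trigger the feature-branch pipeline

# Open a pull request on GitHub, pass the quality gates, get it reviewed.
# Merging to master triggers the master build, which releases and deploys.
```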

One question that is frequently asked is “what if I am not ready to deploy to production?” and the answer to that is “then you do not merge”. If a change is ready to be merged, then it has to be production ready. If a change is not ready for production, then it is not ready for a merge.

The time for a change to reach production varies from service to service, depending on its size and complexity, but usually changes get deployed within 8 to 15 minutes of the merge.

Let’s have a look at some of the details.

Continuous delivery

Both the feature branches and the master branch changes trigger a delivery pipeline. For the most part, it is the same pipeline except that only changes merged into master will trigger a release and will end up in production.

an example pipeline for one of our services

Delivery pipelines contain numerous distinct steps; however, most of the time they can be summarized into three main stages:

  1. Build — get the source code and package it into some artifact
  2. Test — quality gates to ensure the artifact does what it is expected to do
  3. Release — make the artifact available for use

Artifacts

Most of the time, the artifacts we build are services packaged in a docker image. Occasionally it might be a library that is packaged in a platform-specific assembly (a jar file for the JVM).
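
For a typical JVM service, the image definition can be as thin as the sketch below; the base image and jar path are illustrative, not a prescribed setup:

```dockerfile
# Illustrative Dockerfile for a JVM service artifact (image and paths are examples).
FROM openjdk:11-jre-slim
COPY build/libs/example-service.jar /app/example-service.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/app/example-service.jar"]
```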

Quality gates

Before an artifact is published, released and made usable anywhere, it needs to give some degree of assurance that it does its job. It has to pass certain quality gates. If any of the quality gates fail, the pipeline stops and the build is considered broken.

The most basic quality gate, and the one run during the build step, is the unit test suite.

If a service exposes or consumes some API, we also run contract tests as part of the build step. For that we use Pact. API consumers run consumer tests and produce pact files and API providers fetch pact files from our Pact Broker and verify contracts against them.
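
On the consumer side, a contract test is just a unit test that records the expected interaction and produces the pact file. A rough sketch using Pact JVM's JUnit 5 support follows; the service names and endpoint are hypothetical:

```java
import au.com.dius.pact.consumer.MockServer;
import au.com.dius.pact.consumer.dsl.PactDslJsonBody;
import au.com.dius.pact.consumer.dsl.PactDslWithProvider;
import au.com.dius.pact.consumer.junit5.PactConsumerTestExt;
import au.com.dius.pact.consumer.junit5.PactTestFor;
import au.com.dius.pact.core.model.RequestResponsePact;
import au.com.dius.pact.core.model.annotations.Pact;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import static org.junit.jupiter.api.Assertions.assertEquals;

// Consumer-side contract test: running it writes a pact file that the
// provider later verifies (in our setup, via the Pact Broker).
@ExtendWith(PactConsumerTestExt.class)
@PactTestFor(providerName = "player-service")   // hypothetical provider name
class PlayerClientPactTest {

    @Pact(consumer = "wallet-service")          // hypothetical consumer name
    RequestResponsePact playerById(PactDslWithProvider builder) {
        return builder
                .given("player 42 exists")
                .uponReceiving("a request for player 42")
                    .path("/players/42")
                    .method("GET")
                .willRespondWith()
                    .status(200)
                    .body(new PactDslJsonBody().stringType("id", "42"))
                .toPact();
    }

    @Test
    @PactTestFor(pactMethod = "playerById")
    void fetchesPlayer(MockServer mockServer) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(mockServer.getUrl() + "/players/42")).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        assertEquals(200, response.statusCode());
    }
}
```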

After the artifact is built, it is spun up with its dependencies stubbed, and a number of black box tests are executed against it. More on how we approach testing in general, and black box and contract testing in particular, will come in a separate series of posts, so stay tuned.

Another very important quality gate we rely on is our code review process. Before any feature branch can be merged into master, which would start its automated journey to production, it has to be code reviewed and approved by at least one other person.

And lastly, those are just the basic checks that all the services we build should have — developers are free to add more: performance tests, acceptance tests, code style checks, and so on.

Artifact versioning

Versioning helps us keep track of what changes were made between each release.

The scheme that we adopted has two distinct paths depending on whether it is a feature snapshot or a release:

  • Feature snapshots are given their branch name as the version number. Artifacts produced by a feature branch are overwritten on each build of that branch. This is because they are temporary — they may be deployed in test environments for some manual testing, but they cannot make it to production. Once features are merged, they are released as part of the master pipeline.
  • Releases (master builds) are assigned a fixed version number that cannot be overwritten: versions start at 1 and are incremented with each subsequent release (see the sketch after this list). The release process also generates release notes from the pull request description, creates a release tag in GitHub with those notes and registers the release with our internal component registry.
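
The actual logic lives in our build tooling, but conceptually the version resolution boils down to something like the following plain shell sketch (not the real implementation):

```bash
# Conceptual sketch of the versioning scheme; the real logic is in build tooling.
branch=$(git rev-parse --abbrev-ref HEAD)
if [ "$branch" = "master" ]; then
    # latest numeric release tag, incremented by one for this release
    last=$(git tag --list '[0-9]*' --sort=-v:refname | head -n 1)
    version=$(( ${last:-0} + 1 ))
else
    # feature snapshots reuse the branch name and get overwritten on each build
    version=$branch
fi
echo "Building version $version"
```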

Publishing

Deployable artifacts have to be packaged in a docker image, so they have to provide a Dockerfile, which is built and pushed to our docker registry. They also have to provide a deployment configuration that is later used by the deployment tooling: a yaml file that specifies deployment and environment specific configuration values. What is important here, from the perspective of produced artifacts, is that the configuration directory is zipped and uploaded to a dedicated place in our artifact repository under the service name and its version. With a deployable artifact and its deployment configuration published, the deployment tooling only needs to know the service name and the version to deploy.
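
In effect, publishing the configuration amounts to zipping a directory and uploading it under the service's name and version; the repository URL and layout below are placeholders:

```bash
# Hypothetical sketch - the artifact repository URL and layout are made up.
zip -r deployment-config.zip deployment/
curl --fail -u "$REPO_USER:$REPO_PASSWORD" \
     -T deployment-config.zip \
     "https://artifacts.example.com/deployment-configs/example-service/42/deployment-config.zip"
```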

If a service consumes some API, it may also produce pact file artifacts that are uploaded to the Pact Broker.

Library artifacts are packaged in a platform-specific assembly format and uploaded to our artifact repository.

Delivery tooling

We use Jenkins as our CI/CD orchestrator. Build jobs are created and builds are triggered by events from GitHub via webhooks integration. We define jobs for each project as code in Jenkinsfiles and we have encapsulated the internals of each build step behind a builder object injected from an internally developed Jenkins pipeline library. A typical Jenkinsfile for one of our projects might look something like:
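
The sketch below gives the flavour; 'casumo-pipeline' and 'javaServiceBuilder' are made-up names standing in for the internal shared library and its builder object:

```groovy
// Illustrative sketch only - the shared library name and builder API are hypothetical.
@Library('casumo-pipeline') _

def builder = javaServiceBuilder(service: 'example-service')

node {
    checkout scm

    stage('Build') { builder.build() }      // compile, package, build the docker image
    stage('Test')  { builder.test() }       // unit, contract and black box tests

    if (env.BRANCH_NAME == 'master') {
        stage('Release') { builder.release() }  // version, tag, publish, deploy
    }
}
```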

The details of how a deployment is made or how deployments are triggered are encapsulated within the library.

Because we are primarily a Java house, some of the common pipeline steps, like making a release, were originally integrated into an internal Gradle plugin — time permitting we should be able to extract them out to an external tool and allow more flexibility in the choice of build tools for internal projects.

Deployments

As mentioned previously, each deployable service defines a minimal deployment configuration in a yaml file. The main components are:

  • the type of deployment — a value from our supported deployment targets like Kubernetes or ECS
  • docker image name
  • environment specific environment variables to inject into the deployed container.
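
A hypothetical deployment.yml with those components could look like this (field names and values are illustrative, not the exact schema our tooling uses):

```yaml
# Illustrative sketch - field names and values are made up.
type: kubernetes                               # or ecs, depending on the supported target
image: registry.example.com/example-service
environments:
  test:
    env:
      PLAYER_SERVICE_URL: https://player.test.internal
  production:
    env:
      PLAYER_SERVICE_URL: https://player.prod.internal
```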

To orchestrate actual deployment, we use a set of Ansible playbooks. The input for a deployment command is the service name, the version to deploy and the target environment. By using the service name and version, the scripts can find and download the published deployment configuration of a service from our artifact repository and work from there.
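
A deployment then reduces to a single playbook run with those three inputs; the playbook and variable names below are made up for illustration:

```bash
# Hypothetical invocation - playbook and variable names are illustrative.
ansible-playbook deploy.yml \
    -e service_name=example-service \
    -e version=42 \
    -e target_env=production
```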

Monitoring and Observability

Shipping continuously and frequently requires us to have insight into how new code behaves compared to the previous version, and this is where the need for observability comes in.

Although we do not have this strictly enforced by the tooling (yet!), the services we write expose both operational and business metrics to either Graphite or Prometheus. The deployment configuration gives a choice: either create a default generic Grafana dashboard based on one of the common JVM metrics libraries, like Micrometer or Dropwizard Metrics, or provide a custom one using a json template that lives with the service's code and is packaged with the deployment configuration.
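
As a rough example of a business metric, a service using Micrometer might register something like the following (class and metric names are illustrative):

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

// Illustrative business metrics; in a real service the registry would be
// backed by Graphite or Prometheus rather than SimpleMeterRegistry.
class DepositMetrics {

    private final Counter completedDeposits;
    private final Timer processingTime;

    DepositMetrics(MeterRegistry registry) {
        this.completedDeposits = Counter.builder("deposits.completed")
                .description("Number of successfully completed deposits")
                .register(registry);
        this.processingTime = Timer.builder("deposits.processing.time")
                .register(registry);
    }

    void recordDeposit(Runnable deposit) {
        processingTime.record(deposit);   // time the deposit flow
        completedDeposits.increment();    // count successful deposits
    }

    public static void main(String[] args) {
        new DepositMetrics(new SimpleMeterRegistry()).recordDeposit(() -> { /* ... */ });
    }
}
```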

Deployments also create an annotation in Grafana with the service name and version deployed so that there is a clear line drawn at the time of deployment that can help attribute any anomalies in metrics to a particular release.
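
One way to create such an annotation is a single call to Grafana's annotations HTTP API as part of the deployment; the host, token and tag values below are placeholders:

```bash
# Hypothetical sketch - host, token and tag values are made up.
curl -s -X POST "https://grafana.example.com/api/annotations" \
     -H "Authorization: Bearer $GRAFANA_API_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"tags": ["deployment", "example-service"], "text": "Deployed example-service 42"}'
```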

All the services write logs in json format, and those logs are picked up and aggregated by our ELK stack. We reserve the WARN and ERROR log levels for abnormal situations that should not happen during normal operation, so we create watchers in Kibana that trigger alerts and send a message to our #operations channel on Slack.

For alerting on service healthcheck failures, alerts on Centreon are created during deployments. Those will both spit the alert messages to the #operations channel and send wake-up alerts to the person on-call.

Last but not least, we instrument our services with an Open Zipkin compatible distributed tracing library that sends trace samples to our Zipkin server and injects correlation ids in our logs. This provides us with a component dependency graph based on real traffic and helps us investigate problematic or otherwise interesting request flows.

Wrap up

What’s really cool about the setup is that the current workflow with the supporting tooling has enabled developers to hop onto the delivery train easily and painlessly — there are very few basic steps that a developer needs to consciously set up and the rest is taken care of by the tooling:

  • GitHub repo with correct permissions for CI
  • Release configuration in build.gradle
  • Tests
  • Metrics and ping/healthcheck endpoints
  • Dockerfile
  • deployment.yml
  • Jenkinsfile

From there, one can focus on the functionality they are tasked to deliver. Make changes, iterate fast. Internals are open to dig into and improve if there is interest or need, but for the most part, they can be taken for granted.
