Beamery’s journey to CI/CD

Deploy Quickly Everywhere

Tom Galligan
Beamery Hacking Talent
6 min read · Apr 1, 2021


Building enterprise-grade SaaS software is difficult. Customers have high expectations when it comes to quality, and bugs need to be squashed quickly. Maintaining a robust and clean deployment process is paramount in this mission, but is often difficult to achieve in a system with many moving parts. What follows is an explanation of how we handle these issues at Beamery, and what we’ve learnt along the way.

Enigma is the Integrations team at Beamery. We design, build, deploy and maintain the company’s strategic integrations with our partners, which together form the Horizon integrations platform. At the time of writing, our platform comprises 64 serverless functions across 61 repositories, and 10 storage buckets, all replicated over 7 environments. To maintain control over the state of our system, with multiple developers working in parallel, we need a representation of this state that allows for both quick visual validation and easy modification, both during normal development and in hotfix scenarios. It’s also important that there is a single, clear source of truth; a complex branching strategy in a repository holding this information can create ambiguity, lead to confusion, and ultimately cause mistakes.

Each time we modify a serverless function and merge our changes into master, a CI pipeline is triggered. This pipeline bundles the code together into a versioned artifact (a zip file), which is ready to be deployed to our cloud infrastructure. To manage the state of our platform, we need a way to represent all of these versions in a central location, so that any changes can be made, tracked and reverted with ease. This representation is done through manifest files in Terraform.

The branching process followed in each of our serverless function repos. Each merge into master and each hotfix commit generates a new release tag. This tag versions an artifact to be referenced in our Terraform repository.

Each environment has its own manifest detailing what version of each function is deployed. These manifests are stored in a git repository, allowing for easy modification when necessary, embracing a GitOps approach to deployment: simply branch out of master, change the versions of any functions you’re working on and raise a merge request for the team. When the code is merged, CI pipelines are triggered and any updates to the state of our infrastructure are made, in whatever environments you’ve specified. Using trunk-based development to control our Terraform repository means multiple developers can work independently, which minimises cognitive overhead. This reduces the risk of unintended deployments, and keeps the git flow clean and simple to allow for easy examination of its history. The deployment pipelines also include automated testing stages, which help ensure the changes made are safe to go to Production.
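To make the idea of a manifest concrete, here is a minimal sketch in Python rather than Terraform. It treats a manifest as a mapping from function name to artifact version and compares the desired manifest against what is currently deployed. The function names, versions and the `plan_changes` helper are hypothetical, chosen only for illustration; the real manifests live in Terraform and are applied by CI pipelines.

```python
from typing import Dict, List, NamedTuple, Optional

# A manifest maps a function name to the artifact version that should be deployed.
Manifest = Dict[str, str]


class Change(NamedTuple):
    function: str
    current: Optional[str]  # None when the function is not deployed yet
    desired: str


def plan_changes(deployed: Manifest, desired: Manifest) -> List[Change]:
    """List the functions whose desired version differs from the deployed one."""
    return [
        Change(name, deployed.get(name), version)
        for name, version in sorted(desired.items())
        if deployed.get(name) != version
    ]


# Hypothetical function names and versions, for illustration only.
staging_deployed = {"sync-contacts": "1.4.0", "export-vacancies": "2.1.3"}
staging_desired = {"sync-contacts": "1.5.0", "export-vacancies": "2.1.3"}

for change in plan_changes(staging_deployed, staging_desired):
    print(f"{change.function}: {change.current} -> {change.desired}")
```

Only `sync-contacts` differs between the two manifests here, so it is the only function reported. The sketch deliberately ignores functions removed from the manifest; a real tool would need to decide how to treat those.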

The original version of our Terraform repository. Each environment has a manifest containing the version of each function.

One of the core principles that runs through all the work we do is to deploy quickly everywhere. Each of these three words was chosen carefully: by deploy, we mean we can update our infrastructure and code autonomously, without having to hand off to another team. By quickly, we mean that wherever possible, we minimise the time between writing the first line of code and deploying the change. Finally, by everywhere we mean that every change made should be intended to reach Production. This principle shapes our development philosophy, and our investment in it has paid for itself many times over. With this in mind, we looked again at our set of per-environment manifest files, and felt there was a better way to do things.

Enter our new base-overlay model. This solution was built on the idea that our environments should be as close to each other as possible, and only differ when necessary. To accomplish this, we decided to move to one central manifest for all environments. This means when we merge our manifest changes into master, they are deployed through all of our environments, all the way to Production. With our environments in line, the risk of code changes being lost in a complex deployment is reduced to near-zero. Changes can be quickly propagated with minimal effort from engineers, and rollback is easy.
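One way to picture a single manifest flowing through every environment is the sketch below. It is an assumption-laden illustration, not the actual pipeline: the environment names, the `promote` helper and the stubbed `fake_deploy` and `fake_verify` callables are all hypothetical, and the choice to stop at the first failed check is one possible design rather than a description of the real CI stages.

```python
from typing import Callable, Dict, List

Manifest = Dict[str, str]  # function name -> artifact version


def promote(
    manifest: Manifest,
    environments: List[str],
    deploy: Callable[[str, Manifest], None],
    verify: Callable[[str], bool],
) -> List[str]:
    """Deploy the same manifest to each environment in order.

    Stops at the first environment whose checks fail and returns the
    environments that were deployed and verified successfully.
    """
    completed: List[str] = []
    for env in environments:
        deploy(env, manifest)
        if not verify(env):
            break
        completed.append(env)
    return completed


# Stubs standing in for real deployment and test stages (hypothetical).
def fake_deploy(env: str, manifest: Manifest) -> None:
    print(f"deploying {len(manifest)} function versions to {env}")


def fake_verify(env: str) -> bool:
    return True


base_manifest = {"sync-contacts": "1.5.0", "export-vacancies": "2.1.3"}
reached = promote(base_manifest, ["dev", "staging", "production"], fake_deploy, fake_verify)
print(reached)
```

Because every environment receives the same `base_manifest`, there is no per-environment version list to drift out of sync.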

This solution comes with an obvious drawback: for the lower environments to be useful, they must sometimes contain features that aren’t live in Production. How can we allow for that with only one base manifest? The answer to this question comes in two parts.

Feature Flags

Feature flags allow for quick and easy toggling of functionality outside of deployment cycles. By injecting the state of feature flags into your services, you can control their behaviour with simple if-else logic. While this approach is frequently employed for Product benefit (e.g. testing a new feature on a small set of customers to see how they respond), it is also an extremely effective way to roll out changes in a controlled and reversible manner. In essence, it allows you to deploy quickly everywhere. Enigma’s investment in feature flags deserves an article of its own, so I’ll only briefly describe our approach here. We use an in-house API to manage our feature flags, and can quickly add and remove flags where necessary during the rollout of new features. This means we separate deployment from releases, and can roll out new code to all environments with virtually no risk and with instant roll-back available when necessary.
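The if-else pattern described above can be sketched as follows. This is a generic illustration, not the in-house API: the flag name `dedupe-contacts`, the `is_enabled` helper and the `sync_contacts` function are hypothetical, and the `flags` dictionary simply stands in for whatever flag state is injected into the service.

```python
from typing import Dict, List


def is_enabled(flags: Dict[str, bool], name: str) -> bool:
    """Treat unknown flags as off, so new code paths stay dark by default."""
    return flags.get(name, False)


def sync_contacts(contacts: List[str], flags: Dict[str, bool]) -> List[str]:
    if is_enabled(flags, "dedupe-contacts"):
        # New behaviour: deployed everywhere, released only where the flag is on.
        return sorted(set(contacts))
    # Existing behaviour: unchanged while the flag is off.
    return list(contacts)


contacts = ["bob@example.com", "alice@example.com", "bob@example.com"]

print(sync_contacts(contacts, {}))                         # flag off
print(sync_contacts(contacts, {"dedupe-contacts": True}))  # flag on
```

The same artifact runs in both cases; only the injected flag state differs, which is what lets deployment and release happen at different times.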

Overlay Manifests

Some changes aren’t suited to feature flags. For example, a hotfix to a Production environment should be in Production as fast as possible without passing through lower environments. Crafting the necessary logic and configuration to do this with feature flags adds cognitive strain and costs time that we can’t afford during an emergency. For this scenario, we need to be able to place hotfix artifacts into Production as simply and quickly as possible. For that, we turn to overlay manifests.

Overlay manifests allow us to temporarily override parts of our base configuration. They contain only versions of the functions we wish to alter, and are most useful for quick checks in our Staging environment and hotfix scenarios. They are designed to be used temporarily, so that changes can be deployed to all environments as fast as possible. By using this approach, we have sufficient flexibility to deal with emergencies, while keeping our system as homogeneous as possible.
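A minimal sketch of how an overlay might be resolved against the base manifest, written in Python rather than Terraform. The `resolve_manifest` helper, the function names and the version strings are hypothetical, and rejecting overlay entries that are missing from the base is an assumed safeguard rather than something the real setup is known to do.

```python
from typing import Dict

Manifest = Dict[str, str]  # function name -> artifact version


def resolve_manifest(base: Manifest, overlay: Manifest) -> Manifest:
    """Return the effective manifest: base versions, overridden by the overlay."""
    unknown = set(overlay) - set(base)
    if unknown:
        # Assumed safeguard: an overlay may only override functions in the base.
        raise KeyError(f"overlay references unknown functions: {sorted(unknown)}")
    return {**base, **overlay}


# Hypothetical base manifest shared by every environment.
base = {"sync-contacts": "1.5.0", "export-vacancies": "2.1.3"}

# Hypothetical overlays: only Production carries a temporary hotfix override.
overlays = {"production": {"sync-contacts": "1.5.1"}}

for env in ["staging", "production"]:
    effective = resolve_manifest(base, overlays.get(env, {}))
    print(env, effective)
```

An environment with no overlay resolves to the base manifest unchanged, so deleting the overlay once the hotfix has been merged into the base brings Production back in line with everything else.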

The structure of our Terraform repository after we switched to overlay manifests. Moving to overlays encourages homogeneity between environments, reduces the risk of human error and adopts a more declarative approach to configuration.

All of this means that we deploy fast (less than 20 minutes from Staging to Production), frequently (the average engineer deploys their code to Production every three days), and without fear.

Where next?

We’re continuing to refine and develop our deployment strategy. Our main area of focus is to ensure we run comprehensive and robust tests as part of our deployment pipelines. These tests should block deployments if they fail, and will provide us with further safety and protection against bugs.

Deploy quickly everywhere is a central principle here at Beamery Engineering. Aligning your deployment practices with this mantra will create better quality services with fewer bugs, and a more agile response mechanism for problems that do arise. Our whole-hearted investment in this philosophy has enabled us to scale and will guide us as we continue to grow our product and customer base in the years to come.

Want to work alongside some great humans and inspiring leaders to solve big problems? We’re hiring! Click here to join the #BeamTeam and change the future of work.
