Continuous Integration at FreshBooks
Before our new pipeline we were running on an older version of Jenkins with an old, unintuitive UI, executing ~1500 E2E (full browser) tests in 1 to 1.25 hours (the variability was due to unstable tests). After updating Jenkins, making some stability improvements, and adjusting our instance sizes, we were able to bring the pipeline down to 13–20 minutes while adding 400 more E2E tests. Gotta go fast!
At FreshBooks we build highly interactive applications, with synchronous requests, asynchronous requests, and requests that depend on 3rd parties, all backed by a micro-(ish)-service architecture.
We try to break up applications based on business domains so we can limit the size of changes when shipping new versions of our applications. At present we have a little over a dozen services explicitly impacting the core functioning of our product, with most of them shipping multiple times a day.
On the front end we have several consumers: our iOS app, our Android app, our Chrome Extension, and our FreshBooks Classic app, but our main front end is our Single Page Application for New FreshBooks. The vast majority of our feature work goes into developing this platform, but since we're developing at such a frequent pace, it is hard to maintain stability with several contributors merging to a single repository (all the web developers at FreshBooks are Full Stack Devs).
To solve this problem, we have a Continuous Integration (CI) pipeline that runs on all of our applications. There are some minor variations depending on the service’s scope of concerns, but essentially they have the following stages:
The CI Pipeline
1. Lint rules review
2. Containerization & Packaging (Docker Image)
3. Dependencies Security Audit
4. Unit/API Integration Tests
5. E2E Testing (full browser interaction tests using Chrome headless)
6. Documentation & Deployment
The critical path of the pipeline is:
Packaging (0.2–3 mins) ➡ E2E Tests (10–12 mins) ➡ Deployment (2–5 mins)
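To make the shape of the pipeline concrete, here is a minimal declarative Jenkinsfile sketch with stages along these lines. The stage names and shell commands are illustrative assumptions, not our actual pipeline; the real sample Jenkinsfile is linked in the appendix.

```groovy
// Illustrative sketch only; stage names and commands are assumptions.
pipeline {
    agent any
    stages {
        stage('Lint')             { steps { sh 'make lint' } }
        stage('Build Image')      { steps { sh 'docker build -t myapp:${GIT_COMMIT} .' } }
        stage('Dependency Audit') { steps { sh 'make audit' } }
        stage('Unit/API Tests')   { steps { sh 'make test' } }
        stage('E2E Tests')        { steps { sh 'make e2e' } }
        stage('Deploy') {
            when { branch 'master' }
            steps { sh 'make deploy' }
        }
    }
}
```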
Lint Review
The lint bot is actually super helpful because sometimes my IDE of choice silently fails loading the lint checker, and I end up committing some ugly stuff. The bot very politely and succinctly notifies me of all my failings. This allows us to maintain a very clean and consistent looking codebase while freeing reviewers to focus on actually understanding the introduced changes.
We were previously using a bespoke application to do this for us, but we are moving towards Stickler CI, or Black for our Python apps.
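As a rough illustration, the lint stage for one of our Python services can boil down to commands like these (the exact tools and flags here are assumptions, not our actual configuration):

```sh
# Illustrative lint commands for a Python service; tools and flags are assumptions.
black --check .   # fail the build if formatting differs from Black's output
flake8 .          # report remaining style and lint violations
```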
Containerization & Packaging
Since we are building our image from scratch with the latest dependencies, our packaging can be quite time-consuming. Our backend applications are built on either Ubuntu or Alpine Linux, and have some extra libraries installed to enable native extensions of our app dependencies (like the Python mysqlclient library). After setting up the OS environment, we install our application dependencies (i.e. pip install) as a separate Docker build step. We do this before copying in the application code because Docker caches build steps for more efficient builds. This is a Docker best practice: the dependencies are less likely to change than the source code, so installing the requirements in a cached layer saves some bandwidth and, more importantly, time! Once the Docker image is built, it is tagged with the git SHA and uploaded to our container repository to be consumed by later stages of the build.
Since we use dynamic languages there isn't any need to compile code; the application installation stage takes seconds. It's actually the installation of dependencies that takes the lion's share of our build time. We try to limit this impact by caching Docker images on our build nodes, so when Docker builds a new container and there have been no changes to the dependencies, it simply reuses the cached layer to build the new image. That is why this stage can vary between 10 seconds and 3 minutes.
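A stripped-down Dockerfile sketch of that layering looks something like this. The base image, packages, and file names are illustrative assumptions rather than our actual Dockerfiles:

```dockerfile
# Illustrative only: install dependencies before copying the app code,
# so unchanged requirements hit Docker's layer cache.
FROM python:3.7-alpine

# OS packages needed to build native extensions (e.g. mysqlclient) -- assumed list
RUN apk add --no-cache build-base mariadb-dev

# Dependency layer: cached until requirements.txt changes
COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt

# Application code changes most often, so it is copied last
COPY . /app
WORKDIR /app
```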
Unit/API Integration Tests
Unit Testing is great because it runs quickly and tells you specifically why it failed (usually). We use a combination of unit tests and API integration tests, with a bias towards integration. Having comprehensive coverage of our API surface allows us to be more confident in splitting services away from our legacy monolithic application, since the primary mode of consumption of these services is via RESTful HTTP endpoints. These tests are also useful for documenting the behaviours expected of the service being updated, allowing the PR reviewers to verify and comment on the scenarios being captured.
We almost exclusively set up our testing environment using docker-compose to bring up the application and some dependent services, like a database server. We use test runners like pytest and minitest to run our test suites; the services themselves have some leeway here. What is important is that they generate JUnit-style reports, so they can be consumed by Jenkins for reporting purposes. The test runs can be parallelized or not depending on the running time; if unit tests become the critical path we increase parallelization. We try to keep individual test cases in the millisecond range, with the full unit test suite taking an average of 5 minutes.
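A sketch of what this can look like, with hypothetical service names, images, and file names (this is an assumption about the mechanics, not our actual compose files):

```yaml
# docker-compose.test.yml -- illustrative only
version: "3"
services:
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: test
  app:
    image: myapp:${GIT_SHA}
    depends_on:
      - db
    # Emit a JUnit-style XML report for Jenkins to pick up
    command: pytest --junitxml=reports/results.xml
```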
Dependency Audit
The Dependency Audit stage is pretty straightforward. We have services written in Python, Ruby, and JavaScript, and each of those services has several dependencies on external packages and libraries. To make sure we are running safe and current versions, we run an audit against each dependency. For Python we run the “safety” tool, for JavaScript we use the “retire.js” tool, and for Ruby we use the “audit” tool. In this check, even a single vulnerable dependency fails the build, so to get their change deployed the developer is required to update the vulnerable dependency. This stage runs in less than 15 seconds.
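For illustration, the per-language checks amount to commands roughly like these (exact invocations and file names are assumptions about how the tools are commonly run):

```sh
# Illustrative audit commands; any reported vulnerability fails the build.
safety check -r requirements.txt     # Python dependencies
retire --path .                      # JavaScript dependencies
bundle-audit check --update          # Ruby dependencies (bundler-audit)
```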
End-to-End Testing
The E2E Testing stage is our most demanding test suite, because it essentially requires a miniature recreation of our entire service network as a Docker Compose stack. You can read more about Docker Compose on their website. We use the .env file to manage which version of a given service to bring up. All the services are set to pull their current production versions by default, except for the service we want to test, whose version is set to match the container we built in the Packaging stage. Once the service is set up and talking to the rest of the network, we can start running our tests.
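A sketch of the mechanism, with made-up service names and tags (an assumption for illustration, not our actual compose files):

```yaml
# docker-compose.yml fragment -- illustrative only
services:
  invoices:
    image: registry.example.com/invoices:${INVOICES_VERSION}
  payments:
    image: registry.example.com/payments:${PAYMENTS_VERSION}
```

The .env file then pins every service to its production tag, except the service under test, which gets the git SHA produced in the Packaging stage.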
Our tests are full browser tests, which means we load up a browser and run full interaction flows using Cucumber/Capybara with ChromeDriver. We use Chrome Headless for our browser, and we manage the parallelization using a custom Groovy script to push tests into 65 separate browser instances. We determined the correct number of instances by experimenting with the number of threads our test node could handle; there was an inflection point where more parallel threads caused a slowdown in the execution of the scenarios themselves.
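The fan-out itself is conceptually simple; a scripted-pipeline sketch of the idea might look like this (the node label, Cucumber profile, and feature paths are hypothetical, not our actual script):

```groovy
// Illustrative only: run Cucumber batches as parallel Jenkins branches.
def batches = [:]
for (int i = 0; i < 65; i++) {
    def batchId = i
    batches["e2e-batch-${batchId}"] = {
        node('e2e') {  // hypothetical label for our E2E build nodes
            sh "bundle exec cucumber --profile ci features/batch_${batchId}"
        }
    }
}
parallel batches
```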
In the end we are able to get through ~2000 E2E scenarios in about 15 minutes. We then further split this stage into two so that running the full scenario set takes less than 10 minutes. The stage can drag on longer if there are failing tests, since we retry failed scenarios in case of instability, so we can say the stage runs in 10–12 mins.
This is a very comprehensive way to do service integration testing, so we can have high confidence that our services interact correctly with each other when deploying any service. Another approach would be to use fixtures instead of running services, and it is something we are looking into implementing. We started with the current approach because the behaviour we wanted was only being defined as we split services out of our legacy monolithic application. Now that we have stabilized the new architecture, we are always looking at opportunities to make this better. If you have any ideas, drop a comment below… Or tell us in person by applying to FreshBooks!
Documentation & Deployment
Finally, after all the stages pass, we have different behaviours for PR and master branches.
For a PR branch, this is the end of the journey; we notify the author via Slack that their PR has passed/failed the tests, along with direct links to the PR, the Jenkins CI run, and the Cucumber report for the E2E tests.
For the master branch we create a Wiki page with all the tickets that the release includes, and the application version number for that release. We use CalVer so the version bumping can also be completely automated.
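Since CalVer versions are derived from the date, the bump needs no human input; the exact scheme below (date plus the Jenkins build number) is an assumption for illustration:

```sh
# Illustrative CalVer bump, e.g. 2019.03.27-142; the scheme is an assumption
VERSION="$(date +%Y.%m.%d)-${BUILD_NUMBER}"
echo "Releasing version ${VERSION}"
```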
Then we deploy the new version of the application to our staging environment running on GKE (Kubernetes). We manage our deployments using Rundeck so there aren't any race conditions between builds trying to deploy at the same time. Our current workflow starts with triggering Rundeck to execute an Ansible script that deploys the Kubernetes resources and rolls out the new version of the application to our cluster. The deployment script also has pre and post steps for things like database migrations and spinning up asynchronous workers.
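Conceptually, the rollout that the Ansible playbook drives boils down to something like the following (deployment and image names are hypothetical, and in practice this runs through Rundeck and Ansible rather than by hand):

```sh
# Illustrative only: point the deployment at the new image and wait for the rollout
kubectl set image deployment/invoices invoices=registry.example.com/invoices:${GIT_SHA}
kubectl rollout status deployment/invoices --timeout=300s
```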
Once the application has been fully deployed, we notify all the authors that have changes in the new release. There can be multiple authors because we might have had a broken build, in which case deploying a new version with a passing build will include changes from two (or more) authors.
Closing
So that’s the gist of it! We’re currently in the process of exploring Continuous Deployment options, but this is generally how things are done at FreshBooks today. In future articles we will share with you some of the pieces we use to make this work smoothly. We will also be sharing our journey to Continuous Deployment once we have some more pieces in play.
Also, shoutouts to our Ops and DevOps teams (present and past)!
If you enjoyed reading this, and are generally fascinated by all things DevOps or Web Technologies in general, have a look at our openings! We are hiring talented folks who are keen to work in a high performance dev organization, and want to make an impact.
Appendix
- The Pipeline UI screenshots are from the Jenkins plugin called Blue Ocean
- We store our database migrations in the repo of the app that owns it. On app startup all the database migrations are run & verified using Liquibase.
- Link to a sample Jenkinsfile that shows all the stages: https://gist.github.com/kamikaz1k/7fe25b604c0edac552e6a71215d41ba9