A journey towards Trunk-Based Development

Felipe Martins
Published in iFood Engineering
7 min read · Apr 14, 2021

Pull request (PR) workflows built around code review are now the de facto standard for collaborative development in the software industry, period. There are endless talks about how to make better pull requests and how to do effective asynchronous code reviews, and most if not all tools seem to have been designed to work primarily with feature branches merged into a stable main branch after approval.

This is so taken for granted now that it can even feel a bit contentious to bring up alternatives to this way of developing software in some circles as illustrated by the thread below:

It's not surprising, then, that the team I'm working with had defaulted to using this approach at first: new features are developed in their own individual branches, which are submitted as pull requests and merged to the main, stable branch after having been approved by someone else via an asynchronous code review. It intends, after all, to bring seemingly good qualities to the development process such as isolation of unfinished work, peer-reviewing quality control and knowledge sharing through PR feedback.

I couldn't help but notice, however, some downsides to this approach in practice, such as long lead times¹, a heavy cognitive toll on the team due to frequent context switching (since there is usually more than one task in progress and under review at any moment), fear of deploying to production and error-prone conflict resolution during merges. (*)

I knew, from having worked before on two non-trivial projects using Trunk-Based Development (TBD) and backed up by references such as DORA's State of DevOps report, that we could do better than that given the team's conditions whilst keeping the upsides of a PR-based workflow. By "better" I mean being able to quickly develop new features, adapt to new scenarios and keep the quality of our deliverables: in short, a high-performance team according to the key metrics identified in DORA's empirical findings. Since our team comprised four developers building microservices in a tech stack that we were already quite familiar with (Java, Quarkus, PostgreSQL, AWS, GitLab…), we agreed that there was enough room to experiment with adopting TBD for new features, see how it fared and find out what adaptations we'd need to make.

*Nonetheless, these are all commonly known consequences of adopting PRs, and there are plenty of suggestions in the literature about how to mitigate them. My point is that, by the time you've created the conditions to apply them, you might not need to be using PRs anymore.

So let's dive into how it went:

What's this Trunk-Based Development thing?

TBD is a high-maturity development workflow in which new commits are frequently integrated into a main branch that must always be ready for deployment to production. Let's dive a bit deeper into each of these highlighted aspects:

  • High-maturity: just like with microservices, adopting trunk-based development is not a one-size-fits-all approach for every context and team. It demands the presence of highly mature development practices such as continuous integration, build pipelines, easy deployments and error recovery, trustable quality control, etc — in short, TBD is far from being an excuse to get sloppy;
  • Frequent integrations: code changes are frequently submitted and made available to the rest of the team; they don't need to go through manual approval processes or be asynchronously reviewed beforehand. Note that a team member may still work temporarily in a separate, short-lived branch in some cases, as long as the changes in that branch are frequently synchronised with the rest of the team's work (at least once a day, for example);
  • Main branch: the team should ideally work on a single branch that always represents the current state of the implementation. In general², there shouldn't be a need for the creation of feature branches, hotfix branches, etc;
  • Always ready for deployment to production: commits submitted to the main branch must always be in a "buildable" state, approved by automatic test suites and quality gates, and ready to be put in production. It's not required that every commit must be deployed to production, but it should be possible to do so if needed³.

Keeping up with the PRs

A set of bike repairing tools.
New methodologies, new tools (Source: https://www.flickr.com/photos/bre/552152780)

Each aspect described above may help address the shortcomings associated with PRs: lead times tend to decrease since there's no asynchronous code review bottleneck anymore; the team's cognitive load is reduced given that team members can now work on a single task until its completion; merge conflicts become less frequent and simpler due to the frequent integration of smaller units of change; and finally, there's usually less fear and anxiety around deployments to production as a side effect of cultivating a codebase that's continuously validated and ready to be deployed (or even more frequently deployed to production).

The challenges arise, however, around finding ways to preserve the benefits associated with PRs:

Isolation of unfinished work

We had to re-educate ourselves more intentionally around decomposing new behaviour into smaller, self-contained chunks that interfere as little as possible with existing code while they're being developed. The fact that the team had already been developing the habit of using TDD was quite helpful here.

To handle complex scenarios where it can be difficult to keep different work streams isolated, we've raised awareness of techniques such as Branch by Abstraction and Expand-Contract/Parallel Change, and encouraged a broader use of Feature Flags.
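To make these techniques a bit more concrete, here is a minimal sketch in plain Java of Branch by Abstraction combined with a feature flag. It is not code from our project: the fee domain, the class names (`DeliveryFeeCalculator`, `FlatFeeCalculator`, `DistanceFeeCalculator`, `FlaggedFeeCalculator`) and the fee values are all hypothetical, and the flag is just a `BooleanSupplier` standing in for whatever flag mechanism a team actually uses.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical example: every name and number below is illustrative only.
final class BranchByAbstractionSketch {

    /** The abstraction callers depend on while both implementations coexist on main. */
    interface DeliveryFeeCalculator {
        long feeInCents(int distanceInKm);
    }

    /** Existing behaviour, still the default. */
    static final class FlatFeeCalculator implements DeliveryFeeCalculator {
        @Override
        public long feeInCents(int distanceInKm) {
            return 500;
        }
    }

    /** New behaviour, integrated into main in small steps but inactive until the flag is on. */
    static final class DistanceFeeCalculator implements DeliveryFeeCalculator {
        @Override
        public long feeInCents(int distanceInKm) {
            return 200 + 100L * distanceInKm;
        }
    }

    /** Chooses an implementation on every call, based on the current flag value. */
    static final class FlaggedFeeCalculator implements DeliveryFeeCalculator {
        private final BooleanSupplier newFeeEnabled;
        private final DeliveryFeeCalculator current = new FlatFeeCalculator();
        private final DeliveryFeeCalculator candidate = new DistanceFeeCalculator();

        FlaggedFeeCalculator(BooleanSupplier newFeeEnabled) {
            this.newFeeEnabled = newFeeEnabled;
        }

        @Override
        public long feeInCents(int distanceInKm) {
            DeliveryFeeCalculator chosen = newFeeEnabled.getAsBoolean() ? candidate : current;
            return chosen.feeInCents(distanceInKm);
        }
    }

    public static void main(String[] args) {
        // An in-memory flag; a real project would read it from configuration or a flag service.
        AtomicBoolean distanceBasedFee = new AtomicBoolean(false);
        DeliveryFeeCalculator calculator = new FlaggedFeeCalculator(distanceBasedFee::get);

        System.out.println("flag off: " + calculator.feeInCents(4));
        distanceBasedFee.set(true);
        System.out.println("flag on:  " + calculator.feeInCents(4));
    }
}
```

The point of the sketch is that the unfinished `DistanceFeeCalculator` can live on the main branch from its first commit, since nothing reaches it until the flag is switched on. Once the new path is trusted, the flag and the old implementation can be deleted in a later small commit.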

Quality control

After some time, there was a clear need to adopt formal, Gherkin-like acceptance criteria to describe the team's tasks and, when possible, to automate their verification via acceptance tests using libraries and frameworks like Cucumber and RestEasy. For cases where automation was not feasible, we've added a requirement to the task's Definition of Done: explicitly showing the rest of the team how the acceptance criteria are being met through other means.
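As an illustration of what a Gherkin-like criterion and its automated check can look like, here is a small sketch. The scenario, the `Cart` class and the minimum order rule are hypothetical and not taken from our codebase. To keep the example self-contained it uses plain Java with Given/When/Then comments rather than actual Cucumber step definitions, which is how the mapping would usually be done in practice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scenario, written the way it might appear in a task description:
//   Given a cart whose items total 1500 cents
//   And a minimum order value of 2000 cents
//   When the customer tries to check out
//   Then the checkout is rejected
final class AcceptanceCriterionSketch {

    /** Minimal stand-in for the system under test. */
    static final class Cart {
        private final List<Long> itemPricesInCents = new ArrayList<>();

        void add(long priceInCents) {
            itemPricesInCents.add(priceInCents);
        }

        long totalInCents() {
            return itemPricesInCents.stream().mapToLong(Long::longValue).sum();
        }
    }

    /** The business rule the acceptance criterion describes. */
    static boolean canCheckOut(Cart cart, long minimumOrderInCents) {
        return cart.totalInCents() >= minimumOrderInCents;
    }

    public static void main(String[] args) {
        // Given a cart whose items total 1500 cents and a minimum order value of 2000 cents
        Cart cart = new Cart();
        cart.add(1000);
        cart.add(500);
        long minimumOrderInCents = 2000;

        // When the customer tries to check out
        boolean accepted = canCheckOut(cart, minimumOrderInCents);

        // Then the checkout is rejected
        if (accepted) {
            throw new AssertionError("checkout should be rejected below the minimum order value");
        }
        System.out.println("Scenario passed");
    }
}
```

Keeping the check's structure aligned with the wording of the criterion makes it easy for the rest of the team to see which part of the task a failing check refers to.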

Unit and integration tests (the way these terms are described here) are adopted as part of the development business-as-usual workflow, having their coverage measured as part of the project's build pipeline. Code style patterns, code smells and code design constraints are automatically checked as soon as new code is committed (these checks can also be run locally by the developers) using tools like Checkstyle, SonarQube and ArchUnit. Failures in these code quality checks were not used as gates for the build pipeline at first, but they're enforced now as we've come to tune these checks to an agreed-upon set of criteria among the team. Cultivating a fast, reliable test suite and having an initial set of automated code quality validations allow us to practice continuous refactoring in order to keep the code complexity under control (a habit that's still evolving within the team, we must admit).

A screenshot showing a code example of how to use ArchUnit.
An example of automated code design constraint using ArchUnit (source in the image's link)

Last, but not least, the development/staging environment is always up-to-date with the main branch (each new commit to it triggers a new deployment to that environment, as well as the application of any pending database migration), which gives us more confidence in any validations we perform against that environment before deploying to production.

Knowledge sharing

We've established that pairing would be done more frequently, even if just in the form of checkpoints throughout a task. Please note that this by no means implies pairing 8h/day, nor does it mean doing it in an undisciplined fashion. For more details on how to apply proper pairing techniques, I strongly recommend this text.

In order to give room for even more knowledge sharing, the team has been considering holding regular brown bag sessions so that we can share recommendations on good coding practices, draw attention to interesting code design choices in the project, etc.

In conclusion…

A standard "under construction" sign.
Photo by Mark König on Unsplash

We don't have any formal metrics yet to state that we're unarguably better off than we were under the previous PR-based workflow (we do observe a tendency towards shorter lead times, though). To be honest, even if we did, that probably wouldn't prove that adopting TBD was the superior choice for our context, since it would hardly be possible to isolate all the factors involved in this change. Software engineering is not a science in the strict Popperian sense, and we'd probably still be programming in Assembly if we had always insisted that it be one.

In a qualitative sense, however, it's quite perceptible, and frequently reported by the team, how much more satisfied they have become with the new, leaner development workflow. The strong investment in automating quality controls, as well as a more disciplined approach to formal acceptance criteria, gave a lot more autonomy to the team's developers while keeping the tech and product leadership assured enough that their requirements would be met.

I must stress, finally, that having enough control over our own build pipelines and processes was of paramount importance to make this transition feasible, as well as being the only team that held ownership of this codebase — other teams are still required to submit PRs for code review when they need to change anything in the codebases we're responsible for since they may not be familiar with our team's practices.

Notes

  1. "Lead time" is the time it takes from the moment someone starts working on a task until it gets put into production (there are diverging definitions for this term in the literature, but that's the definition we're adopting here);
  2. It's usually acceptable to create short-lived branches in more complex teams and projects that are looking towards the benefits of adopting TBD;
  3. That's, in fact, the usual precise definition of Continuous Delivery (CD).
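
For reference, the definition of lead time in note 1 can be turned into a simple calculation. The sketch below only assumes that a task has two recorded timestamps, one for when work started and one for when the change reached production. The class name, method name and sample timestamps are made up for illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: lead time as defined in note 1.
final class LeadTimeSketch {

    /** Time elapsed from the start of work on a task until its deployment to production. */
    static Duration leadTime(Instant workStarted, Instant deployedToProduction) {
        if (deployedToProduction.isBefore(workStarted)) {
            throw new IllegalArgumentException("deployment cannot precede the start of work");
        }
        return Duration.between(workStarted, deployedToProduction);
    }

    public static void main(String[] args) {
        // Made-up sample timestamps for a single task.
        Instant started = Instant.parse("2021-04-12T13:00:00Z");
        Instant deployed = Instant.parse("2021-04-13T16:30:00Z");

        System.out.println("Lead time in minutes: " + leadTime(started, deployed).toMinutes());
    }
}
```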

