Enabling Fearless Coding in the Trunk

Daniel Yokoyama
10 min read · Jul 18, 2023


A typical branching maze from a GitFlow repository

For Brazilian readers, and anyone else who prefers Portuguese, there is a translated version of this article here.

No, that’s not bait. I mean it.

So, maybe you already do this, and if that’s the case, I hope this is still helpful. But I’m talking to those still not enjoying the cozy blanket of Continuous Integration. My team is not (it is a specific scenario, which I will explore later, but I still think they should). And if by any chance you think the right answer here would be “It depends!”, I’ll take the risk and say: “Yeah, but it probably doesn’t!”.

Let’s face it: why are you not making your changes directly in the trunk (or the mainline branch, whatever you call it)? I’ll assume most of you would answer one of the following:

  • That’s the way I was taught! I didn’t even think about that ever since.
  • Are you crazy? People would be mad if I broke something.
  • I’m aware of where this is going, and I read that it would be the case to work in branches when the code is maintained by a big team, which is my scenario.

If your reasons are different from those, I would like to know. Consider spending a minute to leave a comment.

The third case is probably the best fit for the “It depends!” scenario, although the definition of a “big team” is relative and very questionable. But I don’t mean to pull anyone’s leg, so let’s get going. This is a hot topic surrounded by misconceptions and misguidance, even though it has been around for decades.

(Re)Discovering Continuous Integration (CI)

Oh, come on, D! You can’t be serious. I’ve been using _____ for years.

- Maybe you (filling in the blank with Jenkins, GitHub Actions, TravisCI, CircleCI, or any other tool like those).

If you’re already using some build-automation tool to run validations as new code is pushed to source control, that’s great. Trust me, I still hear from teams that aren’t. But even with your build pipeline in place, that’s not enough to say you’re doing Continuous Integration.

Continuous Integration refers to the practice of frequently integrating code changes into a shared repository. It involves automating the process of merging, validating, and building code changes, ensuring that the integration is seamless and reliable. CI empowers development teams to work concurrently, collaborate effectively, and catch integration issues early in the development cycle.

If you’re not confident that your pipeline does a good job of asserting that everything is fine, so that you can just go ahead and put the build into production, you’re not benefiting from CI yet. And if it’s safe to assume it does, it should allow you to code in the trunk without fear.

“And why does that matter?” you might ask. Do you remember the last post? One of the Four Key Metrics from Accelerate is Lead Time: the time it takes for a code change to be implemented and deployed to production. The time it takes to implement the change is up to the software engineers, but once the task is done, the longer the change sits in a queue awaiting deployment, the bigger the Lead Time gets. This indicator gives us a clue of how much time a commit spends accumulating digital dust before being deployed: hours? Days? Maybe weeks… (I’ve seen scenarios where a single deployment could take up to six months, with hundreds of commits.)

And that’s a central concern of DevOps: making code get to production smoothly and quickly. Hence the importance of the Lead Time indicator: the implementation time is up to the engineers, so we must make the time it takes to deploy smaller and smaller, as much as we can.

But if making it fast is one of the biggest concerns, there’s a second one, just as important: making it reliable.

So, how can we make our build pipeline reliable enough to give us the benefit of CI?

Use Version Control

You probably use Git. I know there are others (in my career I have used CVS, Visual SourceSafe, Subversion, and even Perforce for a short time), but for quite some time Git has been well established as the standard for programmers in general. Still, using a Source Control Management tool is just a means to an end, and practicing Version Control adds a new layer to how you use it.

For instance, most developers use SCM only to commit their application code, ignoring that they should also version the code for system configuration, application configuration, and the scripts that automate building and configuring the whole application. If your application depends on a database, you should implement some automation to keep the database aligned with your code version (handling schema migrations, data required to be available when the application runs, etc.). The application configuration should be equally versioned. Staying with the database example, it should be easy to configure the database credentials and inject them into the application at runtime, so the same version can be deployed to both production and non-production environments (for testing, QA, or staging).
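To make that concrete, here is a tiny sketch (the variable names are my own, not any standard) of injecting database credentials through the environment, so the exact same versioned build can run anywhere:

```python
import os

def database_url() -> str:
    # Read connection details from the environment so the same build
    # artifact can be deployed to production and non-production alike.
    host = os.environ.get("APP_DB_HOST", "localhost")
    name = os.environ.get("APP_DB_NAME", "app")
    user = os.environ.get("APP_DB_USER", "app")
    # No default for the secret: fail fast if it was not injected.
    password = os.environ["APP_DB_PASSWORD"]
    return f"postgresql://{user}:{password}@{host}/{name}"

os.environ.setdefault("APP_DB_PASSWORD", "s3cret")  # simulate the injected secret
print(database_url())
```

The point is not the helper itself but the discipline: the credentials live outside the artifact, while the code that reads them is versioned with everything else.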

Accelerate describes a survey performed for Puppet’s “The State of DevOps” report, which found that:

What was most interesting was that keeping system and application configuration in version control was more highly correlated with software delivery performance than keeping application code in version control. Configuration is normally considered a secondary concern to application code in configuration management, but our research shows that this is a misconception.

So using Version Control, and managing system and application configuration as part of your codebase, along with the automation scripts, is not only a necessary step toward Continuous Integration but also improves the team’s performance and makes the pipeline more reliable.

Test Automation

By now it may not be a surprise that I’m talking about things that have been discussed for a long time: CI, Version Control, and now Test Automation. You have probably heard of them, and you probably know what TDD is… maybe you have even read the eXtreme Programming book, attended a talk about it at a conference, or heard about it in a blog post or YouTube video. And maybe you have also seen other material rejecting it (sometimes, completely).

I am a big fan of TDD, but regardless of what you might think about it as a design tool, the fact is that to increase the build pipeline reliability you have to make use of test automation. And, for the sake of my sanity, I have to take this chance to give some advice: consider using it by doing TDD (I promise I won’t talk about it again).

Yes, Test Automation is a must. If you considered quitting when I mentioned versioning system configuration because it sounded hard, you’re probably regretting reading on now. Testing is hard, especially when done late in the development workflow (I’ll keep my promise). Even more so because we’re not only talking about tests that assure us our code works well; we’re also talking about making the build pipeline reliable, and now we have configuration code versioned with the application code, and automation scripts put together as well. And everything should be well integrated (Continuous Integration, remember?).

Storytime:

In 2013, I was working on a .NET project, and we chose Entity Framework’s Migrations feature to handle the versioning of the database schema. One day, when we tried to build a new version of the application and the pipeline attempted to generate the script for the database changes, we got a migration error and had to postpone the deployment to investigate the issue.

We found out that Entity Framework’s Migrations feature relied on a timestamp to keep all migrations in order. People working on different tasks (in different branches) had created different changes to the database, each with its own timestamp, but the branches were merged in a different order, so the migration entries ended up out of chronological order and Entity Framework panicked. From that moment on, we decided to always check the migration order when merging code into the trunk (we used GitFlow; I know, you don’t have to remind me).

But that was not enough. The whole point is to make the build pipeline reliable, and we can’t rely on people remembering to check the migration timeline before pushing their code. To be reliable, the build pipeline has to double-check everything. It is the ultimate guard preventing human error from causing issues in the delivery stream.

So we started creating a new kind of automated test. People call them Infrastructure Tests (to distinguish them from unit tests or integration tests). The first test case created a brand-new database from scratch using the codebase (making it compatible with the latest version of the code), then ran every single migration downward to the very first database version (testing the downgrade steps), and finally ran all of them again, upward to the last one (testing the upgrade steps), checking that everything worked.

We had to spend some time figuring out how to hack Entity Framework into working nicely with an SQLite in-memory database, so running all the tests wouldn’t take too long and slow down the build.
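The round-trip idea translates to any stack. Here is a minimal sketch in Python with an SQLite in-memory database (the migrations are made-up stand-ins; the real project drove Entity Framework, but the test logic is the same):

```python
import sqlite3

# Stand-ins for real schema migrations: each entry is (upgrade, downgrade).
MIGRATIONS = [
    ("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
     "DROP TABLE users"),
    ("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)",
     "DROP TABLE orders"),
]

def tables(db):
    # List the user tables currently present in the schema.
    rows = db.execute("SELECT name FROM sqlite_master WHERE type='table'")
    return sorted(r[0] for r in rows)

def test_migrations_round_trip():
    db = sqlite3.connect(":memory:")  # in-memory keeps the build fast
    for up, _ in MIGRATIONS:          # build the schema from scratch
        db.execute(up)
    latest = tables(db)
    for _, down in reversed(MIGRATIONS):  # walk every downgrade step
        db.execute(down)
    assert tables(db) == []
    for up, _ in MIGRATIONS:              # and every upgrade step again
        db.execute(up)
    assert tables(db) == latest

test_migrations_round_trip()
print("migration round trip ok")
```

Any out-of-order or broken step fails the test, and therefore the build, long before a deployment is attempted.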

That’s how valuable a reliable build pipeline is. Every time it ran, the pipeline assured us that any issue in the migration timeline would be detected, so we could just do our work and rely on it.

Make sure your pipeline tests everything that could compromise the integration of your application. Don’t neglect the value it brings to your delivery stream. Consider not only creating unit tests with nice code coverage, but also integration tests, infrastructure tests, and any other tests that increase the pipeline’s reliability.

Build Quality in, and Shift Left on Security

So, the application is being compiled, the configuration and automation scripts are being committed together with the application code, everything is being well tested and you have nice code coverage… look at you! Good job. What now?

Now we have to avoid other bottlenecks that could affect our Lead Time and thus sabotage the reliability of the pipeline. First, we have to check whether the code complies with any quality standards we must follow. Maybe you and your team like to run a linter to check code standards, or to extract code-metric reports for cyclomatic complexity and other indicators. Maybe this is not a “no go” issue that should break the pipeline, but it anticipates feedback from later inspections and reviews that could otherwise make changes wait before being deployed.

The same applies to security. Maybe the company’s Infosec team has established security compliance rules that everyone has to follow. You should make your pipeline check those too. Maybe you need to run vulnerability scans over your code, the application’s dependencies, and its configuration… the more you bring into your pipeline, the more reliable it will be.
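As a sketch, such a gate stage boils down to running a list of checks in order and failing fast (the commands here are placeholders; substitute whatever linter and scanner your team or Infosec mandates):

```python
import subprocess
import sys

def run_gate(checks):
    # Run each check command in order and stop at the first failure,
    # mirroring a fail-fast quality/security stage in the pipeline.
    for cmd in checks:
        if subprocess.run(cmd).returncode != 0:
            print("gate failed at:", " ".join(cmd))
            return False
    return True

# In a real pipeline this list would hold your lint and scan commands,
# e.g. a linter or a dependency audit; here, a portable stand-in:
demo = [[sys.executable, "-c", "print('lint ok')"]]
print("gate passed:", run_gate(demo))
```

In a real pipeline, a failed check breaks the build immediately, surfacing the feedback before any review queue does.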

Work in small batches and push code regularly

Now your pipeline is doing everything it needs to make sure your whole application is well integrated, and you can rely on it.

The best thing you can do now to truly benefit from Continuous Integration is to avoid letting your code sit too long before being pushed to the trunk. And the best way to do that is to just code in the trunk. To avoid long-lived branches, stop basing your work on big batches of code pushed all at once. Work in smaller, valuable, deliverable steps, pushing small batches of code regularly into the pipeline, and watch your delivery stream perform the integration iteratively and incrementally.

Also, never forget that things evolve, and there’s no such thing as “being done”: no Ultimate Pipeline, no One Pipeline to Rule Them All. You’ll probably face situations where the reliability of your build pipeline is challenged and find room for improvement. Take care of it. Don’t neglect it! It’ll help you sleep better.

Additional Advice

Some people (me included) like to keep a smaller version of the pipeline that runs while pushing code to the origin repository, as a last-moment check that everything is fine before the code gets there and breaks the trunk on GitHub (or similar). That’s useful, although you’ll have to be careful about how much shorter it is than the original.
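One way to wire that up, assuming a Git repository (the check command below is a placeholder), is a `.git/hooks/pre-push` script that runs only the quick subset of the full pipeline:

```python
#!/usr/bin/env python3
# A trimmed pre-push gate. Save it as .git/hooks/pre-push and make it
# executable; Git aborts the push when the hook exits nonzero.
import subprocess
import sys

def fast_checks_pass(checks):
    # Run only the quick subset of the full pipeline,
    # stopping at the first failing command.
    return all(subprocess.run(cmd).returncode == 0 for cmd in checks)

if __name__ == "__main__":
    # Placeholder for, e.g., the unit tests only, skipping slow suites.
    checks = [[sys.executable, "-c", "print('fast checks ok')"]]
    ok = fast_checks_pass(checks)
    print("push allowed:", ok)
    # In the real hook, exit nonzero on failure to abort the push:
    # sys.exit(0 if ok else 1)
```

Hooks can be any executable, so the same idea works as a shell script; the key is keeping this subset fast enough that nobody is tempted to skip it.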

Another option is to run the whole thing locally, in your development environment (not always possible, but also great when it is).

Many people get annoyed when the build makes them wait, especially on a before-push basis; they think they should be spending that time coding more stuff. And it’s funny how relative “it takes too long” is: it makes sense if the whole thing takes more than ten minutes, but you’ll find people who complain even if it takes only two.

Conclusion

I know… that was quite long… a lot.

But we have covered a lot of ground on one of the best practices for reducing Lead Time, one of the Four Key Metrics: Continuous Integration. We talked about how coding in the trunk is not a nonsensical way of working, and how pipeline reliability lets us do it without hesitation.

In the next post, I will talk about Continuous Delivery, which is another great practice to use for that matter.

I hope you find this post helpful, and I’m eager to see what y’all think about it.

Thank you.
