Continuous delivery at Officevibe

Published in Workleap, Nov 29, 2018

Background

For a long time, here at Officevibe, we used to ship code to production every two weeks, through a release train. Each development team (five in total) would decide into which train they would ship their code. This was done by merging their feature branch into the develop branch — the classic GitFlow.

The code was then merged into a release branch by the release manager, who would then organize a QA session with part of the team. If the QA was successful, the release manager would deploy the code to production and the cycle would repeat two weeks later — rinse, and repeat.

Why Change?

The process described above worked well. Over time, we turned it into a well-oiled machine and released at a constant pace. So why change it? Well, several issues made us realize we needed a change. Let’s have a look.

Quality

First, as the development team grew, the releases became bigger and bigger, always including more changes. With such a high volume of changes, we started to see more regression bugs and riskier releases, QA became more demanding, and the whole process became heavier and more time-consuming. This was slowing us down, and we started to notice that our release pace had decreased. That was not good.

Scalability

Our release process was already stretched to its maximum. Two weeks was the shortest release cycle we could achieve using this process, and this became problematic for us. A few months ago, we changed the way we develop the product: teams started to switch to a hypothesis-driven development approach. Basically, the idea is to test the success of an idea (a hypothesis) with your users while investing the least possible amount of time. If you can prove the hypothesis to be true, you can then invest more time in it. The reality is that most hypotheses turn out to be false. So, the trick is to be able to test as many hypotheses as possible, and to do that, you must be able to ship code faster and more often.

Teams were (almost) completely isolated from production

Another problem we had with our previous release process is that once their code was merged into develop, the only thing that the teams were aware of was that this was going to be deployed into production eventually. Given that the whole release process was handled by a dedicated team, it shielded the rest of the teams from production itself. This also became problematic, since it is difficult to ask developers to design fast, scalable, resilient, and quality code if they don’t even see how it behaves in production, or how the production environment even works.

Production support

Bugs found in production were always handled first by support engineers, who would investigate the bug, and then either fix it themselves or dispatch it to the appropriate team to fix it. This is time-consuming and doesn’t scale well with the development teams growing and shipping more code. We needed a more effective, more scalable way to handle production issues.

Long-lived branches

When starting the development of a feature, teams would create a feature branch from the develop branch. They would keep this branch alive until development had been completed. The problem with those branches is that teams often spent a lot of time dealing with merge conflicts instead of delivering value. Even worse, solving merge conflicts is error-prone, and we sometimes had to deal with bugs caused by branch merges.

Solution

We decided that teams would ship code to production themselves, whenever they want and as often as they want. No more release trains, no more release managers, no more QA sessions. We decided to empower our teams to scale our release process. They are now fully autonomous, from ideation to delivery. Autonomous, yes, but also accountable. Any team is able to ship to production, but it’s also the team’s responsibility to make sure everything works as expected. Teams are now responsible for proactively monitoring production and for detecting and fixing bugs related to the code they ship.

By doing that, what did we solve? First, we solved the scalability issue: the approach scales with the number of development teams, since the effort is shared among them. We also enabled teams to ship code more often, which is perfect for hypothesis-driven development. We now aim for smaller releases, which means more focused changes that are less risky and easier to test. Because we ship to production much more often, we no longer have to keep long-lived branches alive. We’ve also knocked down the wall separating the teams from our production environment, first by having teams ship their code and second by having them support what they shipped. In turn, this solved our production support scalability problem, since the load is now spread across the teams. We still have support engineers, but they are called on less often now that the development teams proactively support the code they ship.

So, how did we do it? There were two sets of changes required.

The first one was technical: we needed a fully automated release pipeline, so that teams could easily ship to production and ensure the quality of what they ship. That’s the easiest part.

Second, a cultural change was needed. Continuous delivery is a mindset that a development team must embrace to function well. Even the coolest automated release pipeline doesn’t mean anything if your teams don’t change the way they develop and ship code. They have to see the gains and be part of the change; otherwise you’ll get the same results as before, only with a fancy release pipeline.

Technical changes

Automation. Our previous release pipeline was partially automated. It required a lot of manual intervention, and the people operating it needed broad access to the production environment. That was fine when releases were handled by a team of release managers, but if we were to enable all our teams to release by themselves, we needed a fully automated and easy-to-use pipeline. Moreover, we didn’t want to give production access to everyone, so we made sure only machines (agents) would have access to production. The pipeline has been implemented with Microsoft Azure DevOps. I won’t go into the details just yet, but stay tuned as I will cover this topic in a subsequent blog post!

Quality gates. If we were to release often, we needed to ensure that the teams were releasing quality code. We already had a bunch of unit and integration tests, so of course we run these automatically before each release. We also added static code analysis, load tests and smoke tests, and we keep adding more. We call those quality gates.
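As a rough illustration (not the actual Officevibe pipeline, which runs on Azure DevOps), the idea of a chain of quality gates can be sketched in a few lines of Python. The `Gate` class, the `run_gates` function and the gate names are all hypothetical, and the checks are stubbed with fixed results so the example runs on its own. In a real pipeline each check would invoke a test runner or an analysis tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Gate:
    """A named check that must pass before a release can proceed."""
    name: str
    check: Callable[[], bool]


def run_gates(gates: List[Gate]) -> Tuple[bool, List[str]]:
    """Run gates in order and stop at the first failure.

    Returns whether every gate passed, plus the names of the gates
    that passed before the run stopped.
    """
    passed: List[str] = []
    for gate in gates:
        if not gate.check():
            return False, passed
        passed.append(gate.name)
    return True, passed


# Stubbed checks (assumptions for the example): the load test fails.
gates = [
    Gate("unit and integration tests", lambda: True),
    Gate("static code analysis", lambda: True),
    Gate("load tests", lambda: False),
    Gate("smoke tests", lambda: True),
]

ok, passed = run_gates(gates)
if ok:
    print("All gates passed, release can proceed")
else:
    print(f"Release blocked after passing: {passed}")
```

Stopping at the first failing gate keeps feedback fast, and adding a new gate is just a matter of appending to the list.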

Trunk-based development. We felt that keeping the develop branch around would only complicate things in the context of having multiple teams delivering often. In such a context, the develop branch becomes obsolete, as there is no longer a need to stage multiple changes in one branch before releasing. Instead, teams branch off master directly, and whenever they deploy, they merge back into master.

Feature flags. To allow teams to ship code to production as often as possible, we use feature flags, so that they can ship incomplete features. We now differentiate the concepts of shipping code and releasing features. Shipping code no longer means releasing: a feature is released by switching on its feature flag.
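To make the distinction between shipping and releasing concrete, here is a minimal sketch in Python. The `FeatureFlags` class, the `new-dashboard` flag name and the `render_dashboard` function are made up for the example; a real setup would typically read flags from a configuration service or a database rather than from memory.

```python
class FeatureFlags:
    """A tiny in-memory flag store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._enabled = set()

    def enable(self, name: str) -> None:
        self._enabled.add(name)

    def disable(self, name: str) -> None:
        self._enabled.discard(name)

    def is_enabled(self, name: str) -> bool:
        return name in self._enabled


def render_dashboard(flags: FeatureFlags) -> str:
    # The new code path is already shipped, but stays dark
    # until the flag is switched on.
    if flags.is_enabled("new-dashboard"):
        return "new dashboard"
    return "current dashboard"


flags = FeatureFlags()
print(render_dashboard(flags))   # flag is off: users see the current dashboard

flags.enable("new-dashboard")    # "releasing" is just flipping the flag
print(render_dashboard(flags))
```

No deployment happens between the two calls: the same shipped code serves both experiences, and switching the flag off again restores the previous behavior.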

Testing in production. Feature flags also bring another huge benefit: testing in production. We can now easily perform canary releases, allowing teams to test their code directly in production. We always start by enabling a feature for ourselves, internally (dogfooding), so we can catch problems early, without affecting any client. We can also test a feature with a subset of our users, rolling it out to various cohorts as we gain confidence in the code we ship. If anything goes wrong, at any time, we can simply switch off the flag. Lastly, feature flags, combined with continuous delivery, enable us to perform many experiments in production and allow us to better understand our users.
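The progressive rollout described above can be sketched as follows. This is only one possible approach, under stated assumptions: the `RolloutFlag` class and its fields are hypothetical, and bucketing users by a stable hash of their id is a common technique swapped in for the example, not something confirmed about the Officevibe setup.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Set


@dataclass
class RolloutFlag:
    """A flag with a kill switch, an internal cohort and a percentage rollout."""
    name: str
    enabled: bool = False           # master kill switch
    percentage: int = 0             # share of external users, 0 to 100
    internal_users: Set[str] = field(default_factory=set)

    def bucket(self, user_id: str) -> int:
        # Stable hash: a given user always lands in the same bucket (0-99),
        # so their experience does not flip between requests.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100

    def is_enabled_for(self, user_id: str) -> bool:
        if not self.enabled:
            return False            # flag switched off: nobody sees the feature
        if user_id in self.internal_users:
            return True             # dogfooding: internal users come first
        return self.bucket(user_id) < self.percentage


flag = RolloutFlag("new-survey", enabled=True, internal_users={"alice@internal"})
print(flag.is_enabled_for("alice@internal"))   # internal user sees the feature
print(flag.is_enabled_for("customer-42"))      # 0% rollout: external users do not

flag.percentage = 25   # widen the cohort as confidence grows
flag.enabled = False   # or switch everything off if something goes wrong
```

Including the flag name in the hash means different features get different cohorts, so the same users are not always the first to receive every change.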

Cultural changes

As I previously mentioned, this is the most difficult part: making continuous delivery part of the development culture. It’s one thing to give the tools to the teams; it’s another for them to actually use them in the context of continuous delivery. The development mindset has to change drastically: shipping code to production that often is not something most developers are used to.

Don’t expect your teams to start shipping code to production multiple times a day right away. They will slowly get there, but first, it’s important that they gain confidence and then progressively adopt the changes involved with continuous delivery.

Tests. The more often you ship, the more confident you need to be with the quality of the code you ship. One way to increase quality and prevent regression bugs is to have a reliable test suite. If your teams are not already writing unit tests, this is the first change they should make. Start writing unit tests, a lot of them. The more you increase the quality and the reliability of your tests, the more confidence you’ll have in the code you ship. Make sure they are fast so you can get feedback quickly. Continuous delivery is a good way to increase quality, because you cannot ship that often without good quality. Tests must become your best allies.

QA. In a continuous delivery context, there is no classic QA phase prior to shipping, since the idea is to ship as often as possible. QA is a responsibility shared among the team. Developers must test their code, both manually and by writing unit tests. With feature flags, teams and QA analysts can safely test a feature in production once it’s been deployed. Furthermore, the team can progressively roll out a feature to a subset of users and then observe how it behaves without the risk of impacting all users.

Monitoring. Development teams should take an interest in proactively monitoring the production environment so they can act on issues caused by their code as fast as possible.

Ship smaller increments more often. Walk towards continuous delivery by shrinking your releases and shipping more and more often. A team should start with a pace they are comfortable with and then gradually increase this pace as they get more comfortable, until they reach their ideal pace. Some teams ship daily, some even on every commit! It’s up to them to decide how far they want to go.

Experiments. One of the huge benefits of continuous delivery is how it facilitates experimentation (hypothesis-driven development). With continuous delivery in place, you can start testing your ideas quickly, with your users, in production. Don’t risk investing weeks of development to release a feature only to find out it’s not used at all. Test your hypotheses first, very quickly and then invest more time in those that turned out to be true and discard the others. Once your teams are comfortable with continuous delivery, you may consider looking into hypothesis-driven development.

Takeaways

Automation is key to making continuous delivery work, but it takes more than a fancy pipeline. Continuous delivery also requires a cultural change, so make sure your teams embrace it!

You cannot experiment efficiently without continuous delivery.

Feature flags are a must-have for continuous delivery.

Trust your teams. Let them own their code, from the moment they write it to the moment they ship it and support it in production. Make them accountable.

Let the development teams own the release pipeline, so that it always fits their needs.

With continuous delivery, you’ll more easily achieve…

A faster time to market: by delivering value to your customers more often.
Increased quality: you cannot ship often if you do not ship quality.
Experiments: test hypotheses, learn quickly and make better product decisions, all that through a continuous delivery cycle.
Happier teams: empowered and more autonomous teams are generally happier, in my experience.