How to bring off a gentleman’s release

Dan Kheyfets
Published in hh.ru
Jul 5, 2022

Releases don’t come out on time? Lots of bugs found during regression? Does regression itself take several days? Lots of bug complaints? Is everyone handling the release at once while product development is completely halted? Can you relate? I think a lot of people have faced this, including us.

Hey, everybody! My name is Daniel, and I’m the QA lead of the mobile department at hh.ru. Today I will share how we moved away from unstable and irregular releases to a cool and fast mobile app release process. Our releases now come out once a week, and the number of complaints has dropped severalfold. I’ll also tell you what problems we encountered and how we solved them.

This is a text transcript of our vlog, so if you prefer watching to reading, welcome to our YouTube channel.

Houston, we have a problem

Let’s start with what problems we’ve had with releases:

  • Huge and unpredictable feature set
  • No exact release date known in advance
  • Long and completely manual regression
  • Unpredictable release date of new features
  • Interface changed too much from release to release

Between long releases, the app changed a lot, both visually and internally, which made things really hard for our users. If three months earlier they had one interface and a certain feature set, the new release could hand them a completely new application, with new features and a new interface, but without any of the old and familiar things. Such a drastic change caused nothing but negativity from users.

The solution we wanted

Based on the problems we identified, we decided what we wanted to end up with: stable, regular and small releases. The more frequent the releases, the less functionality goes into each one, and the easier each update is for the user. With releases once a week, we can ship new functionality often and in small amounts. Users will not feel like they have a new app every time we update it. Because the number of changes is small, the release itself will be as stable and well tested as possible. It will be much easier for the testing team to run regression on such a release, and the fewer changes there are, the smaller the chance of a bug reaching production.

It will also give us a fast time-to-market and let us plan when a particular piece of functionality will be delivered to users.

Prerequisites

Three years ago, we had two small platform teams, iOS and Android, plus a few QA engineers, and we were planning to hire more developers and testers. If we had not started solving the problems listed above, things could have gotten much worse. As long as there are few of you and you all sit together, you can just ask each other what is going on with the release, what its status is, and what got into it. But when the number of teams grows and they become distributed, living with chaos in releases gets much more troublesome.

Time to solve problems

The problems are clear and the goals are set, so now we can try to solve them.

To begin with, we identified the main technical problem in our development process: an unstable develop branch. In our understanding, “unstable” means that the branch contains untested or even non-working functionality, and that new functionality can break what already exists.

Duty

The first thing we decided to do was just fix it with processes! We would allocate people responsible for the stability of develop. They would fix bugs and production crashes, create the release and make sure nothing crashed. So we introduced a duty: for a week or two, a few developers and a tester were assigned to do nothing but plug holes in our endless stream of bugs and crashes. It seemed like an OK idea, but not really. This approach pulled the team away from product development almost completely, focus on current tasks was lost, and people were demotivated because all their work was concentrated on fixing other people’s bugs.

Github-flow

Then we decided to figure out why we were always getting something unfinished or broken in the develop branch. We figured it out. We worked with so-called “trunk-based development,” but in our case it was dirty trunk-based. We never tested individual changes as they got into develop, but rather tested some intermediate state, for example once a feature was completely built or several bugs had been fixed.

We didn’t like this approach, so we decided to change the workflow. We switched to github-flow: each new feature is now developed in its own branch. That means all the small tasks the feature is broken into are merged into this branch. When the whole feature is done, a tester takes this branch, builds the application and tests it. If all goes well, the whole thing can be merged into develop. This got rid of the situation where the develop branch contained something untested or unfinished.

Of course, this approach has its downsides. While a feature is being developed in a separate branch, new functionality can appear in develop, so the developer periodically synchronizes the feature branch with develop. Because of this, the tester has to recheck some functionality whenever something affecting it arrives from develop. But we’ve put up with it, and to be honest, we don’t struggle with it much.

This approach certainly helped us a lot in stabilizing develop, but it also put a lot of stress on the testers. The whole process was locked in on the tester: until they tested and approved the merge of a new feature into the develop branch, nothing could be merged. And in addition to testing the feature itself, they had to make sure that nothing already built had fallen apart. We realized it was time to save our QAs.

Autotests

Perhaps the easiest way to save the tester from the routine of endless regression checks is autotests. No problem, let’s start writing them.

Thoughts Out Loud

I would like to make a small digression here: you may get the impression that all these process changes and the introduction of autotests happened instantly, quickly and painlessly. Of course not. The transition to github-flow was a long process that we kept improving, and the scope of duty responsibilities is still being reviewed and changed. And writing autotests in general is a big, complex and long task. Everything was gradual and steady.

We started exploring approaches and frameworks for writing tests. We decided on XCUITest for iOS and Kaspresso for Android. We talked about how to choose a framework for autotests in our series of videos on questions about automation.

Along with writing autotests, we immediately started integrating them into our CI. We use the Bamboo build server, a tool from the Atlassian stack. We use the Marathon utility to run Android tests in parallel, and native Xcode solutions for iOS.

Actually, autotests that are only run locally and are not launched regularly on CI are a very bad practice. Tests will be useful only when any interested person can easily run them and see the results of the run.

When the first autotests were written and the launch was set up, we immediately integrated them into our development process. We set up a test run plan on Bamboo (a plan is a pool of tasks to be done) that ran the tests every night on all feature branches and on develop. Now, at the beginning of each work day, developers can see how things are going.
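To make the nightly plan a bit more concrete, here is a minimal sketch of how the branches for such a run could be picked. This is not our Bamboo configuration: the function name and the "feature/" prefix are assumptions made up for the illustration.

```python
from typing import Iterable, List


def branches_for_nightly_run(all_branches: Iterable[str],
                             feature_prefix: str = "feature/") -> List[str]:
    """Pick develop and every feature branch for the nightly test run.

    The "feature/" naming convention is an assumption for this sketch;
    the article does not say how feature branches are named.
    """
    selected = [
        branch for branch in all_branches
        if branch == "develop" or branch.startswith(feature_prefix)
    ]
    # Sort so that the nightly report always lists branches in the same order.
    return sorted(selected)


if __name__ == "__main__":
    repo_branches = ["release/1.2", "feature/search", "develop", "feature/chat"]
    for branch in branches_for_nightly_run(repo_branches):
        print(f"schedule autotests on {branch}")
```

Anything that is neither develop nor a feature branch, such as a release branch in this example, is simply left out of the nightly run.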

Besides, green tests are now a mandatory condition for a new feature to get into develop. After a tester has tested the feature, the developer opens a pull request to the develop branch and runs the tests on this PR. If all goes well, it gets merged; if not, then please fix it.

This way we greatly relieved the testers and added another layer of protection against instability in develop.
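The rule itself is simple enough to write down as code. The sketch below only illustrates the logic of the gate; in practice it is enforced by CI and the pull request settings, and the `PullRequest` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PullRequest:
    # Hypothetical model of a PR, invented for this illustration.
    branch: str
    target: str
    tester_approved: bool   # QA has manually tested the feature branch
    autotests_green: bool   # the autotest run on this PR passed


def can_merge(pr: PullRequest) -> Tuple[bool, str]:
    """Return whether the PR may be merged and a short reason."""
    if pr.target != "develop":
        return False, "this gate only covers merges into develop"
    if not pr.tester_approved:
        return False, "the feature has not been tested by QA yet"
    if not pr.autotests_green:
        return False, "autotests are red, please fix"
    return True, "ok to merge"


if __name__ == "__main__":
    pr = PullRequest("feature/search", "develop",
                     tester_approved=True, autotests_green=False)
    allowed, reason = can_merge(pr)
    print(allowed, reason)
```

Both conditions have to hold at the same time: a feature approved by a tester still cannot be merged while its autotests are red.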

Release train, that could

So, we achieved a stable develop, but there was still one problem: the unstable release date. Even if we agreed on a certain date, it didn’t mean that the release would be exactly on time… We kept getting requests like “wait for us”, “well, while regression is going on, can we add some more?” and so on. One of the reasons for these requests was the lack of a clear understanding of when the next release would happen. If you didn’t get into this release, there was a risk of waiting too long for the next one.

We didn’t like it that way, so we decided to guarantee a date for each release to our colleagues, so that they wouldn’t be afraid of hanging around in the void for too long. We introduced a practice called the “release train”. We agreed that once every two weeks, on a certain day and hour, we would cut a release branch and not wait for anyone. If you didn’t make it on time, sorry, wait for the next train. A developer who missed the release didn’t get too upset, because he knew exactly when the next release would be, and that he would definitely make it.

Of course, it was still just a verbal agreement. We tried to be flexible: to look at individual situations and even wait for someone. But on the whole, things were heading toward strict enforcement. At some point we began to follow the release schedule strictly, and the releases began to arrive weekly, without any delays. Eventually, people stopped asking us to wait for some unfinished functionality.
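The whole point of the train is that anyone can work out the next departures without asking. Here is a minimal sketch of that calculation, assuming a fixed interval counted from some first departure. The 14-day default follows the "once every two weeks" agreement above; the concrete dates and the function name are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def next_departures(first_departure: datetime, now: datetime,
                    interval_days: int = 14, count: int = 3) -> List[datetime]:
    """Return the next `count` release-train departures at or after `now`."""
    interval = timedelta(days=interval_days)
    # Number of whole intervals to skip, rounded up: ceil((now - first) / interval).
    # Floor division on timedeltas returns an int, so negate twice to round up.
    trains_to_skip = max(0, -((first_departure - now) // interval))
    first_upcoming = first_departure + trains_to_skip * interval
    return [first_upcoming + i * interval for i in range(count)]


if __name__ == "__main__":
    # Hypothetical schedule: the first train left on Monday at noon.
    first = datetime(2022, 1, 3, 12, 0)
    feature_ready_at = datetime(2022, 1, 4, 9, 0)
    for departure in next_departures(first, feature_ready_at):
        print(departure.isoformat())
```

A feature that is ready a day after a departure does not delay anything: it simply goes out with the first date this function returns.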

Brief summary

What we had:

  • Large and unstable releases
  • No clear release date
  • Long regression
  • Muddled release contents

What practices were introduced:

  • Github-flow
  • Autotests
  • Duty
  • Release train

With this we have tamed the chaos of releases. The releases have become regular and small. The time-to-market has also been significantly reduced.

A lot of people wonder if things have really gotten that good. Mostly, yes. You can go about your life in peace with this approach. We’ve been functioning this way for more than a year now, and we don’t have any problems at all. However, there are a lot of routines in the release process that we would like to get rid of.

All the release routines can be divided into two parts: technical and managerial.

The technical part includes the following:

  • Manually create release branch
  • Upgrade the version there
  • Complete the build
  • Run autotests
  • Upload to the store
  • Notify everyone
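The technical steps above can be sketched as a small script that only prints the checklist for the next release. It is an illustration, not our tooling: the "major.minor.patch" version format and the "release/<version>" branch name are assumptions, and nothing here talks to git, CI or a store.

```python
from typing import List


def bump_version(version: str, part: str = "minor") -> str:
    """Increase one part of a 'major.minor.patch' version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part}")


def release_checklist(current_version: str) -> List[str]:
    """List the manual release steps for the version after `current_version`."""
    new_version = bump_version(current_version)
    branch = f"release/{new_version}"  # hypothetical naming convention
    return [
        f"create branch {branch} from develop",
        f"set the app version to {new_version}",
        "build the app",
        "run autotests",
        "upload the build to the store",
        f"notify everyone about {new_version}",
    ]


if __name__ == "__main__":
    for number, step in enumerate(release_checklist("7.41.0"), start=1):
        print(f"{number}. {step}")
```

Even in this toy form, every step is mechanical and depends only on the version number, which is what makes the routine a good candidate for automation.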

During the release, the manager would create a page in Confluence stating who was responsible for the release, the release contents, release dates, test runs, etc. In addition, we had to order texts for the “What’s new” section, and each platform has its own section.

These operations are not complicated, but they take time and require human involvement, so there is always a risk of making a mistake or forgetting something. You could simply live with this, but at hh.ru we like to automate everything. If some routine can be automated in a reasonable amount of time, we will do it.

In the next article we’ll share a story about how we automated everything and what we got out of it.

Stay tuned!
