Continuous development using Repo, Gerrit and Jenkins
Software development isn’t the easiest job, but it might be one of the most exciting ones, if done properly.
It can also be stressful, exhausting and disappointing, and the road to nowhere is short.
The importance of moving in the right direction
Both the excitement and the disappointment mainly come from the direction in which we are moving. Ideally, we should constantly deliver new features and never have to stop to fix regressions, deal with technical debt or change the architecture of the product because of errors and misunderstandings. However, the world is not perfect. It is not unusual to introduce a regression, and then we have to step back, fix it and move forward. Developing a successful product and keeping it in good health is about managing the right balance and moving in the right direction. Most of the time.
“A chain is only as strong as its weakest link”
A successful software development process requires coordination between Developers, QA, Product and Project managers, Release engineers and so on. When one of these groups is not working properly, the whole machinery starts to creak. The disappointment spreads like a plague, and it may discourage even the most creative individual. So, how do we prevent that?
Smooth development process
I work for HERE, specifically for Datalens, in a dynamic and agile environment. In our team we’ve identified the following goals of a smooth development process:
- Developers should feel comfortable writing new features.
- The risk of regressions should be minimal, even when new people join the team.
- The QA should be able to manually test the changes immediately, even before they go to the master branch.
- Two types of automated tests have to run on each commit: Unit and End2End.
- We use multiple repositories, and a single change may affect several of them. Tests should take this into account.
- Tests should run as fast as possible in order to give feedback early.
Let’s drill down and see how we achieved these goals.
Adopting Repo
Repo is an Open Source project aimed at making work with multiple repositories easier. It integrates well with Gerrit and Jenkins. If your project consists of multiple repositories and your changes often affect several of them, it makes sense to take a look at it.
In Repo you work with “Topics”. A topic is usually a feature or a bug you are going to work on next. When you create a topic with Repo, it sets up a branch across all your repositories (or only a few of them; you decide), and when you commit, the changes from all repositories go to Gerrit for review in one pass. This also means that you can always easily check out the project at some point in the past, which is otherwise not a trivial task with multiple repositories.
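The day-to-day workflow looks roughly like this (the manifest URL and topic name below are made up for illustration):

```shell
# One-time setup: fetch the manifest that lists all repositories
repo init -u https://gerrit.example.com/manifest.git
repo sync

# Start a topic branch across all repositories
# (or name specific projects instead of --all)
repo start JIRA-1234 --all

# ...commit in the affected repositories, then push everything to
# Gerrit in one pass; -t uses the branch name as the Gerrit topic
repo upload -t --br=JIRA-1234
```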
Integrating Repo and Jenkins
Jenkins has a plugin that makes building a project managed with Repo easier.
Integrating Gerrit with Jenkins
Gerrit is a web-based code review system for Git, and Jenkins integrates well with it thanks to this plugin. Once you add the plugin to your Jenkins instance, you will be able to trigger builds on a topic change. We use the topic to trigger our End2End tests. Once they pass, the bot votes +1 or -1 on the change, depending on the result of the run.
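For illustration, here is how such a vote can be cast over Gerrit’s SSH command-line API (the host, port, change number and bot account are examples, and the plugin can also cast the vote for you):

```shell
# Vote +1 on the Verified label of change 12345, patch set 2.
# Assumes a "Verified" label is configured for the Gerrit project.
ssh -p 29418 jenkins-bot@gerrit.example.com \
  gerrit review --verified +1 -m "'End2End tests passed'" 12345,2
```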
On each commit we also trigger the Unit tests. Each repository has its own Unit tests, so when a change occurs, we run the Unit tests belonging to the repository where the change was made. Let me illustrate this with a diagram:
[Diagram: each commit triggers the Unit tests of the repository where the change was made, while a topic change triggers the End2End tests]
Test early, discover regressions fast
Let’s assume you’re working on a new feature or fixing a bug. Your changes may cause regressions even if there is really good test coverage and you know the product well, not to mention how easy it is to break something if you don’t have enough tests or you’re new to the project. To minimize the risk, in addition to the automated tests we want to involve QA in the most important features, or when developers are touching the most fragile parts of the project. You can identify those fragile parts by analyzing previous bugs and the areas of the code where they occurred.
In order to allow QA to test manually before a change is merged, we established a Jenkins job that does the following:
- On each commit, use Repo to build a version of the whole project up to that point. For us these build archives are relatively small, around 30 MB.
- Download the build to a QA server, specifically established for this purpose.
- Uncompress it and create a dedicated URL, which represents the “Topic” created by the developer. Developers are supposed to include the Jira ticket number in the topic name, so it is easy for QA to follow the links.
- Then, a QA engineer is supposed to test the feature and give +1 or -1 to the developer. If something is not working properly, the developer commits another patch, a new link is created, the QA tests again and the process finishes when QA is OK with the change.
- The code is then merged in the master branch.
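Sketched as a shell script, the deployment part of that job could look like this (all host names and paths are illustrative, not our actual setup):

```shell
#!/bin/sh
set -e

# The topic name carries the Jira ticket number, e.g. "JIRA-1234"
TOPIC="$1"
ARCHIVE="build-$TOPIC.tar.gz"

# Package the freshly built project (~30 MB in our case)
tar -czf "$ARCHIVE" dist/

# Copy it to the QA server and unpack it under a topic-specific path
scp "$ARCHIVE" qa-server:/var/www/topics/
ssh qa-server "mkdir -p /var/www/topics/$TOPIC && \
  tar -xzf /var/www/topics/$ARCHIVE -C /var/www/topics/$TOPIC"

# QA can now open e.g. https://qa.example.com/topics/JIRA-1234/
```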
We aim to keep these iterations quick; otherwise they may cause merge conflicts.
The process also works well for prototyping, for example. There’s no need to merge something into the master branch just to see it, potentially breaking something. Instead, a designer and a developer can create something quickly and see it immediately, without disturbing QA with the change. After that, it can be shared with a customer, a Product manager, etc.
Four minutes from a commit to a link on the QA server
We tried to make the whole process as fast as possible and currently we have the following numbers:
- Four minutes from the commit of the developer to the link on the QA server.
- Less than 15 minutes to build the application and run the E2E tests on it.
Initially these numbers were much larger. When we started to investigate why, we improved many parts of the process, but let me highlight one of them. We discovered that a simple “npm install” was causing a significant part of the delay. Not that there is anything wrong with NPM itself, but installing all the dependent packages from scratch every time was time- and resource-consuming. The solution?
Cache the installed NPM modules
If you install all required NPM modules from scratch over and over again, you will definitely wait, sometimes for a long time. However, if you cache them and run NPM again, it will only figure out the differences in package.json (if any) and download only what is needed.
When we first tried to cache the NPM modules, we used some of the existing projects built for that purpose. We tried local-npm, and the results were worse than without it. After scratching our heads for a while, we did something dead simple: we created an archive from the installed packages and stored it on another server, so it could be retrieved from multiple machines. Then, before running “npm install”, we download the archive and unpack it. We run “npm install && npm prune”, NPM figures out the differences and downloads only the changed packages. Once this is done, we create a new archive and update the existing one on the remote server, ready for the next time. Unless a developer completely replaces the “package.json” file in the meantime, this process proved to be insanely fast and robust.
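A sketch of that flow as a build step (the cache server URL and file names are made up, and the transfer mechanism is whatever your infrastructure provides):

```shell
#!/bin/sh
set -e
CACHE=node_modules_cache.tar.gz
CACHE_URL="http://cache-server.example.com/npm/$CACHE"

# 1. Fetch and unpack the previous node_modules, if an archive exists
if curl -fsS -o "$CACHE" "$CACHE_URL"; then
  tar -xzf "$CACHE"
fi

# 2. NPM installs only what changed in package.json;
#    prune removes packages that are no longer listed there
npm install && npm prune

# 3. Refresh the shared archive for the next build
tar -czf "$CACHE" node_modules
curl -fsS -T "$CACHE" "$CACHE_URL"
```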
Is this the end of the story?
Of course not. We are looking forward to more improvements, but the results so far are encouraging.