Speeding up your CI Pipeline by introducing Incremental Analysis on SonarQube

Jesús Pajares
Published in BestSecret Tech
Jun 21, 2021 · 4 min read

Limited agents to execute builds, endless analysis time, queued jobs waiting for a slot to run… and very large projects waiting to be tested and analyzed. All this can make your waits quite tedious and your day unproductive, especially if your CI pipeline takes very long to fail.

We have seen this happen especially with SonarQube analysis. For one particularly large project (~5000 classes), our Sonar analysis typically took between 20 and 30 minutes when executed for the main branch or for Pull Requests. It was even worse for feature branches, which are compared against an analysis of the main branch. For this type of analysis the time doubled, sometimes exceeding an hour.

As you might imagine, that is inconvenient in many ways. That’s why we recently came up with the idea of introducing incremental analysis.

Two Jenkins pipelines performing a Sonar Analysis on feature branches compared with the main one. The first one without incremental analysis needs 49 minutes, while the second one only needs 15.

But let’s add some context before we dig deeper. Incremental analysis consists of working out which classes were actually modified (or created) by comparing your feature branch with the main one, and then passing only those classes to SonarQube. That way, the analysis scope is reduced from, in our case, ~5000 classes to only 2 or 3, depending on how large our story or task is. This can reduce the duration of your analysis significantly, as the duration depends largely on the number of classes to be analyzed. Sounds good, right?

Three Jenkins runs performing Sonar Analysis. The one at the bottom was performed using incremental analysis, while the first one failed

And here come the pitfalls. SonarQube does not support incremental analysis as a feature, so we had to implement it ourselves. But let’s not focus on how we implemented it (quite a simple solution, just some git commands and the sonar.inclusions parameter), but on why the official SonarSource team does not like this idea.
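The article does not show the actual implementation, so the following is only a minimal sketch of how "some git commands and the sonar.inclusions parameter" could fit together. It assumes the main branch is reachable as origin/main, that only .java files should be analyzed, and that the script runs from the project root; the function names changed_files and sonar_inclusions are hypothetical.

```python
import subprocess


def changed_files(base="origin/main", head="HEAD"):
    """Return paths added, copied, modified or renamed on `head`
    since it diverged from `base` (both refs are assumptions)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACMR", f"{base}...{head}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def sonar_inclusions(paths, suffixes=(".java",)):
    """Build the comma-separated value for sonar.inclusions,
    keeping only paths with one of the given suffixes."""
    return ",".join(p for p in paths if p.endswith(suffixes))
```

A pipeline step could then pass `-Dsonar.inclusions=` followed by `sonar_inclusions(changed_files())` to the scanner. When the result is an empty string, it is probably safer to skip the analysis than to pass an empty value, since an empty inclusion list would not narrow the scope at all.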

First of all, incremental analysis is less accurate. Example: we remove a call to a function that is defined in a different class, and that function is not called anywhere else in our project. Since the class defining the function was not modified, it is left out of the incremental analysis, so the now-unused function goes unnoticed. The Pull Request or feature branch quality gate will pass, but once merged into the main branch, the full analysis will fail due to a major code smell.

Besides this, for Pull Requests the main screen in SonarQube always shows an estimate of the overall code coverage that the main branch will have once the merge is performed. With this solution, however, that percentage is based only on the code that has been analyzed and not on the whole project. The same happens in the “Overall Code” screen for feature branch analysis.

Estimated coverage after merge is based only on the classes that were used in the incremental analysis

So, basically:

Pros:

  • Much faster analysis

Cons:

  • Less accurate
  • Overall code coverage estimations based on only the classes analyzed, not the whole project
  • We need to maintain it ourselves, as SonarQube does not support this functionality

Conclusions? Well… if you ask me, I wouldn’t introduce this approach unless you really have a problem with the performance of the Sonar analysis.

In our case, we have introduced it in only one of the fifteen projects we have in SonarQube. For this exceptional case, we do believe that the advantages outweigh the pitfalls by far. Even if we introduced a code smell into the main branch, detecting and fixing it would take us much less time than a single full analysis of the feature branch used to take. With the numbers we have, we can perform more than thirteen analyses in the time we previously needed for just one. But of course, I wouldn’t recommend this approach when the numbers are not that impressive, as the time gained would not compensate for the later feedback that incremental analysis can imply.
