Guy Lev
skai engineering blog
3 min read · Apr 3, 2019

Meet “TakalushK” — a test recommendation microservice

After more than 10 years of constant development, Kenshoo’s code base includes millions of lines of code and hundreds of repositories. In a perfect world, we would run all our tests for every single change we make. But this approach is unrealistic: it would take far too long and require an enormous amount of resources.

Developers can’t be familiar with everything, and it is not always clear which tests will yield the best coverage for each change.

In short, deciding which component tests are best to run on a Pull Request (PR) can be quite a challenge.

Before “TakalushK”, developers at Kenshoo had to ask each of the different component owners directly which tests they should run, or just make their best guess.

At Kenshoo’s 2018 Hackathon, we came up with the idea for a service that recommends which tests to run based on the files changed in a PR.

The simple demo we presented at the Hackathon generated great excitement and, in turn, high demand for the product. Not long afterward, we integrated our solution with our biggest code base.

Since then, “TakalushK” has become a widely used in-house tool at Kenshoo, one that is invoked every time a PR is created on two of our biggest repos.

How does it work?

By now, you’re probably wondering how TakalushK knows which tests to recommend.

When we came up with this idea, we knew that developing a super-complex static code analysis program was not the best approach, at least not for the scope of this project.

The next best thing was to choose which tests to run based on which tests people had run in the past when those same files were modified.

We accomplished this by developing a “learning” algorithm that uses the GitHub REST API (v3) to obtain its input data. This API allows us to extract which files changed and which tests ran on each PR.

From that data we build a table in which each row records how many times a certain test ran on PRs that changed a specific file.
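The counting step can be sketched roughly as follows. This is only an illustration, not TakalushK’s actual code: the `build_counts` function and the shape of the PR records (a dict with `files` and `tests` keys) are assumptions, and the history is hard-coded rather than fetched from GitHub.

```python
from collections import Counter, defaultdict


def build_counts(pr_history):
    """Count, for each file, how many past PRs ran each test.

    pr_history is a hypothetical list of records, one per PR:
    {"files": [...changed paths...], "tests": [...tests that ran...]}
    """
    counts = defaultdict(Counter)
    for pr in pr_history:
        tests = set(pr["tests"])  # count each test at most once per PR
        for path in pr["files"]:
            counts[path].update(tests)
    return counts


# Made-up history, for illustration only.
history = [
    {"files": ["data/FBCreativeType.java"],
     "tests": ["FBLocal", "DataFusion"]},
    {"files": ["data/FBCreativeType.java", "reports/BingReportUtils.java"],
     "tests": ["DataFusion", "ReportsInfra"]},
]

counts = build_counts(history)
for path, tests in counts.items():
    for test, times in tests.items():
        print(path, test, times)  # one (file, test, count) row per line
```

Each printed line corresponds to one row of the table: a file, a test, and the number of PRs in which they appeared together.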

With this table in place, TakalushK can easily look up the top tests for any list of files.

For example, if we create a new PR that modifies the following files:

  • data/FBCreativeType.java
  • reports/BingReportUtils.java

The following tests will be recommended:

  • ReportsInfra
  • DataFusion
  • FBLocal

It’s a simple approach, and it gives us great results. Each time a PR is created, TakalushK runs and recommends the top tests.
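A minimal sketch of the lookup, assuming the scores of all changed files are simply summed and the highest totals win. The post does not say how TakalushK combines counts across files, so the summing rule, the `recommend` function, the `BingSync` test name, and all numbers below are illustrative assumptions.

```python
from collections import Counter


def recommend(counts, changed_files, top_n=3):
    """Return the top_n tests for a list of changed files.

    Assumption: counts for all changed files are summed; ties are
    broken alphabetically so the result is deterministic.
    """
    totals = Counter()
    for path in changed_files:
        totals.update(counts.get(path, {}))  # unknown files add nothing
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [test for test, _ in ranked[:top_n]]


# Hypothetical per-file counts, for illustration only.
counts = {
    "data/FBCreativeType.java": {"FBLocal": 7, "DataFusion": 9, "ReportsInfra": 1},
    "reports/BingReportUtils.java": {"ReportsInfra": 12, "DataFusion": 2, "BingSync": 1},
}

changed = ["data/FBCreativeType.java", "reports/BingReportUtils.java"]
recommended = recommend(counts, changed)
print(recommended)
```

With these made-up counts the totals are ReportsInfra 13, DataFusion 11, FBLocal 7 and BingSync 1, so the three tests from the example come out on top.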

What’s next?

We are still working on making TakalushK the best it can be. A few ideas we have to improve it include:

  1. Counting tests per package (not just per file), which will give better support for files that are rarely changed or newly created.
  2. Learning from user interactions by rescanning open PRs to see which of the recommendations people actually ran. Since we can generally assume that people know what they’re doing, we will raise the score of those tests.
  3. Allowing manual intervention via a simple UI, so users can set the best recommendations for specific files or packages.

We’re also considering making TakalushK more configurable and making it open source so that it can be used by other organizations.

If you would like to see this project open-sourced, give this post some claps 👏 👏 👏
