Less than a couple

Another way to reduce the number of tests

Mikhail Vasin
IT’s Tinkoff
11 min read · Aug 11, 2023

originally written by Palagina Mary

Every QA knows pairwise testing — a method for minimising test cases. The method is excellent, quite simple, and has been tested by many teams. But what if, after applying it, you still have too many cases?

That’s exactly what happened in my project, and today I’m going to tell you how to reduce the number of test cases even further, without sacrificing quality.

Object under test

Let me tell you a bit about the product first. At Tinkoff, our team develops blocks. These are React components that consist of an implementation and configuration. The implementation is the component itself that we developed and that a user sees in the browser. The configuration is a JSON file that sets parameters and content for that object.

The main task of blocks is to be beautiful. They should be displayed the same way for different users. At the same time, a block can change significantly depending on its configuration and content.

For example, a block might look like this: no background, with a button and an image on the right:

Or it could look like this: with a background, without a button, and with an image on the left side:

Or even like this: with a link instead of a button and without listing the text:

All of the above examples are the same block, with the same configuration version (the JSON structure that this particular React component can handle). However, the content is different.

This is the schema itself:

{
  components: {
    background: color,
    panel: {
      panelProps: {
        color: {
          style: ['outline', 'color', 'shadow', 'custom'],
          background: color
        },
        size: ['s', 'm', 'l'],
        imagePosition: ['left', 'right']
      },
      title: {
        text: text,
        size: ['s', 'l'],
        htmlTag: ['div', 'b', 'strong', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']
      },
      description: {
        text: html,
        htmlTag: ['div', 'b', 'strong', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']
      },
      image: {
        alt: text,
        title: text,
        image: {
          src: image,
          srcset: [{
            src: image,
            condition: ['2x', '3x']
          }],
          webpSrcset: [{
            src: image,
            condition: ['1x', '2x', '3x']
          }]
        },
        imageAlign: ['top', 'center', 'bottom']
      },
      button: {
        active: boolean,
        text: text,
        color: {
          style: ['primary', 'secondary', 'outline', 'outlineDark', 'outlineLight', 'textLink', 'custom'],
          backgroundColor: color
        },
        onClick: {
          action: ['goToLink', 'goToBlock', 'showBlock', 'crossSale', 'callFormEvent'],
          nofollow: boolean,
          url: url,
          targetBlank: boolean,
          title: text,
          noindex: boolean,
          guid: guid,
          guidList: [{
            guid: guid
          }],
          formId: guid,
          crossSaleUrl: url,
          eventName: text
        },
        htmlTag: ['div', 'b', 'strong', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']
      },
      href: url
    }
  }
}

In this case, a block with an image on the right has components.panel.panelProps.imagePosition = right, and a block with an image on the left has components.panel.panelProps.imagePosition = left. A block with a button has components.panel.button.active = true, and so on. I hope the principle is clear: this is how all the parameters of a block are set.

Cases from a combination of parameters

In this article, I am not going to talk about versioning the block schema, the rules for filling in content, or where the data in a block comes from. These are separate topics that don’t affect how a set of test cases is put together. The main thing to know is that we have many parameters that affect our components, and each of these parameters can have its own set of values.

I have chosen a block with a fairly simple configuration for the example above. However, it will still take an unacceptably long time to check all combinations of values for all parameters. Especially if you have to consider cross-browser compatibility. Pairwise testing tends to help here. Tons of articles have been written about it, and there are even tutorials available. Be sure to read some of them if you haven’t already.

Let’s look at how many test cases pairwise testing gives us. We have more than 25 parameters, and some of them have 7 or 9 possible values. Of course, you can ignore some of them: for example, if you are checking the layout, the guid is not important. But even then, pairwise testing still yields more than 80 test cases. And that is for a block that is far from the most complex, without considering cross-browser compatibility, as I wrote earlier. We now have over 150 blocks, and the number is growing. So if we want to maintain the speed of testing and releasing new versions, we can’t afford that many cases.
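To get a feel for these numbers, here is a rough sketch in Python. It counts only the enumerated and boolean fields of the schema above and leaves out free-form ones (text, color, url, guid, image). Treating each srcset condition as a single parameter is my simplifying assumption, and the helper names are made up for illustration. The lower bound rests on a simple fact: a pairwise suite must contain every pair of values of the two largest parameters, and each pair needs its own case.

```python
from math import prod

# Number of possible values per enumerated or boolean field in the schema above.
# Free-form fields are deliberately left out; the field paths are shortened.
DOMAIN_SIZES = {
    "panelProps.color.style": 4,
    "panelProps.size": 3,
    "panelProps.imagePosition": 2,
    "title.size": 2,
    "title.htmlTag": 9,
    "description.htmlTag": 9,
    "image.srcset.condition": 2,      # assumption: treated as one parameter
    "image.webpSrcset.condition": 3,  # assumption: treated as one parameter
    "image.imageAlign": 3,
    "button.active": 2,
    "button.color.style": 7,
    "button.onClick.action": 5,
    "button.onClick.nofollow": 2,
    "button.onClick.targetBlank": 2,
    "button.onClick.noindex": 2,
    "button.htmlTag": 9,
}


def exhaustive_count(sizes):
    """Number of cases needed to check every combination of values."""
    return prod(sizes.values())


def pairwise_lower_bound(sizes):
    """A pairwise suite can never be smaller than the product of the two largest domains."""
    largest, second = sorted(sizes.values(), reverse=True)[:2]
    return largest * second


print(exhaustive_count(DOMAIN_SIZES))      # 352719360
print(pairwise_lower_bound(DOMAIN_SIZES))  # 81
```

The two htmlTag fields with 9 values each already force at least 81 pairwise cases, which is in line with the "more than 80" figure above.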

Cases from a single parameter

Pairwise testing is based on the assumption that most defects are caused by the interaction of no more than two factors. In other words, most bugs occur either when testing a single parameter or when testing two parameters combined. We decided to ignore the second part of this statement. We assumed that most bugs would still be found by checking only one parameter.

It follows that we need to check each value of each parameter at least once. At the same time, each block contains the entire configuration, so every case sets all parameters anyway. To minimise the number of cases, each new case should therefore cover as many not-yet-checked values as possible.

Let’s analyse the case creation algorithm using a simplified example. We will take the button component from our schema and create test cases for it:

button: {
  active: boolean,
  text: text,
  color: {
    style: ['primary', 'secondary', 'outline', 'custom'],
    backgroundColor: color
  }
}

To simplify the example, I have shortened the list of values in button.color.style.

Step 1. Compose content options for each field

It’s just like pairwise testing: you need to understand what values each of the fields can take. For example, button.active can only take two values: true or false. In theory, there could be other possibilities, such as an undefined value or the absence of the key itself.

In my opinion, it is important to define the boundaries and functionality of your system very clearly. Do not check unnecessary things. In other words, if the checks that keys are mandatory or that values are valid are implemented in a third-party system, then that functionality needs to be tested in that system. And we should only use “correct” data in our cases.

The testing pyramid broadly follows the same principle. If you wish, you can add the most critical integration tests: for example, verifying how an invalid key is handled. But there should be a minimum number of such tests. The alternative is to try to test everything, which everyone knows is impossible.

That’s why we have defined variants of the content for each field and have created the following table:

This table includes each equivalence class of each parameter, but only once.

Here are the classes of values in our case:

  • text_s — short string;
  • text_m — longer string;
  • no_color — no color;
  • rnd_color — any color.
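As a minimal sketch of step 1, the table can be represented in Python as one row per field, with every equivalence class placed exactly once and the remaining cells left empty. The function name and the order in which the values land in columns are my assumptions, not the original table.

```python
# Equivalence classes per field, taken from the simplified button example.
BUTTON_CLASSES = {
    "button.active": [True, False],
    "button.text": ["text_s", "text_m"],
    "button.color.style": ["primary", "secondary", "outline", "custom"],
    "button.color.backgroundColor": ["no_color", "rnd_color"],
}


def build_sparse_table(classes):
    """Place every equivalence class exactly once; None marks an empty cell."""
    width = max(len(values) for values in classes.values())
    return {
        field: values + [None] * (width - len(values))
        for field, values in classes.items()
    }


table = build_sparse_table(BUTTON_CLASSES)
for field, row in table.items():
    print(f"{field:30}", row)
```

The table is as wide as the longest list of values (here the four button styles), and each column will later become one test case.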

Step 2. Enrich the table with data

Each block contains a complete configuration. Therefore, we need to add some relevant data to the empty cells:

Now, each column will be a single case.

At the same time, we can generate cases based on priority, as we choose the missing data ourselves. For example, if we know that the short text in a button is used more often than the medium text, then we should check the short text more often.
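A possible sketch of this enrichment step in Python: empty cells are filled with a preferred value for each field, and every column is then read off as one case. The PREFERRED mapping is a hypothetical set of priorities chosen for illustration; a real project would derive it from its own usage data.

```python
# Sparse table from step 1: None marks a cell that still needs data.
SPARSE = {
    "button.active": [True, False, None, None],
    "button.text": ["text_s", "text_m", None, None],
    "button.color.style": ["primary", "secondary", "outline", "custom"],
    "button.color.backgroundColor": ["no_color", "rnd_color", None, None],
}

# Hypothetical priorities: the value we want to see most often for each field.
PREFERRED = {
    "button.active": True,
    "button.text": "text_s",
    "button.color.backgroundColor": "no_color",
}


def enrich(sparse, preferred):
    """Fill every empty cell with the preferred value of its field."""
    return {
        field: [preferred[field] if cell is None else cell for cell in row]
        for field, row in sparse.items()
    }


def to_cases(table):
    """Read the table column by column: each column is one test case."""
    fields = list(table)
    return [dict(zip(fields, column)) for column in zip(*table.values())]


cases = to_cases(enrich(SPARSE, PREFERRED))
for case in cases:
    print(case)
```

Filling the spare button.active cells with true is a deliberate choice in this sketch: it keeps the button visible in those cases, so the styles they carry are actually rendered.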

In the example above, you can also spot “dropped” cases. These are cases where a value is not actually checked, even though it is present in the table. Here, button.color.style: secondary is not checked for appearance, because the style of a disabled button doesn’t matter.

We analysed the resulting value sets beforehand to make sure that “dropped” cases did not lead to missed bugs. The analysis was done once, when the test suite was generated, and all “dropped” values were manually added to the final test suite. This is a rather clumsy way of solving the problem, but it is cheap as long as the configurations of the objects under test rarely change.

A more universal solution is to divide all values into two groups:

  1. unsafe values (those that can cause cases to be “dropped”);
  2. safe ones (which can’t lead to “drop”).

Each unsafe value is checked in its own test case. You can enrich the case with any safe data. For safe values, a table is compiled according to the instructions above.
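The split can be sketched in Python roughly as follows. The assumption here is that button.active = false is the only unsafe value in the button example and that every field keeps at least one safe value; the function name and the padding rule are mine.

```python
CLASSES = {
    "button.active": [True, False],
    "button.text": ["text_s", "text_m"],
    "button.color.style": ["primary", "secondary", "outline", "custom"],
    "button.color.backgroundColor": ["no_color", "rnd_color"],
}

# Assumption: a disabled button hides every other button value, so it is "unsafe".
UNSAFE = {("button.active", False)}


def split_cases(classes, unsafe):
    """Build the regular table from safe values and one extra case per unsafe value."""
    safe = {
        field: [v for v in values if (field, v) not in unsafe]
        for field, values in classes.items()
    }
    width = max(len(values) for values in safe.values())
    # Safe table: cycle through the safe values so each appears at least once.
    safe_cases = [
        {field: values[i % len(values)] for field, values in safe.items()}
        for i in range(width)
    ]
    # Each unsafe value gets its own case, padded with the first safe value elsewhere.
    unsafe_cases = []
    for field, value in sorted(unsafe, key=str):
        case = {f: values[0] for f, values in safe.items()}
        case[field] = value
        unsafe_cases.append(case)
    return safe_cases, unsafe_cases


safe_cases, unsafe_cases = split_cases(CLASSES, UNSAFE)
print(len(safe_cases), len(unsafe_cases))  # 4 1
```

For the button this gives five cases instead of four, but no style is ever hidden behind a disabled button.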

Step 3. Refine the values

All that remains is to generate specific values instead of equivalence classes.

Here, based on the characteristics of the object under test, each project will have to choose its own variants of values. Some values are very simple to generate. For example, you can choose any color for most fields. If you want to test the color, you need to add a gradient for some blocks, but this is in a separate equivalence class.

Text is a bit more complicated: a string of random characters does not exercise hyphenation, lists, tags or non-breaking spaces. So we generate short and medium strings from real text, truncated to the desired number of characters. And in the long text we check:

  • an HTML tag (any one);
  • a link;
  • an unordered list.

This set of cases follows directly from how our blocks are implemented. For example, all HTML tags are processed in the same way, so there is no point in checking each one individually. The link and the list, however, are checked separately, because each has its own visual handling (highlighting on hover and bullets).

It turns out that based on the implementation of the object to be tested, you need to create your own up-to-date set of content for each project.
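As an illustration only, step 3 could look like this in Python. The source text, the length limits and the function names are hypothetical; each project would plug in its own real copy and its own rules for each equivalence class.

```python
import random

# Hypothetical source text and lengths; a real project would use its own copy and limits.
SOURCE_TEXT = "Apply for a debit card online and get it delivered to your door tomorrow"
TEXT_LENGTHS = {"text_s": 10, "text_m": 40}


def concretize(value, rng):
    """Turn an equivalence class into a concrete value; pass other values through."""
    if value in TEXT_LENGTHS:
        # Truncate real text instead of generating random characters.
        return SOURCE_TEXT[: TEXT_LENGTHS[value]].rstrip()
    if value == "no_color":
        return None
    if value == "rnd_color":
        return f"#{rng.randrange(0x1000000):06x}"
    return value


def concretize_case(case, seed=0):
    """Replace every equivalence class in a case; the seed keeps colors reproducible."""
    rng = random.Random(seed)
    return {field: concretize(value, rng) for field, value in case.items()}


example = concretize_case({
    "button.active": True,
    "button.text": "text_s",
    "button.color.style": "custom",
    "button.color.backgroundColor": "rnd_color",
})
print(example)
```

A fixed seed is used so that a failing screenshot test can be reproduced with exactly the same color.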

Algorithm

At first glance, the algorithm may seem complex and not worth the effort. But it turns out to be quite simple if you leave out all the details and exceptions I tried to describe in each paragraph above.

Step 1. Add all possible values to the parameter table:

Step 2. Duplicate the values into empty cells:

Step 3. Make abstract values concrete and get cases:

Each column in the table is a single case.

Advantages of the approach

This method of generating test cases has several important advantages.

Fewer cases. The first and most obvious advantage is that there are significantly fewer cases than with pairwise testing. In the simplified button example, we get 4 cases instead of the 8 that pairwise testing requires.

The saving in cases is greater the more parameters there are in the object under test. For example, for the complete block presented at the beginning of this article, we get 11 cases. Using the pairwise method, we get 260 cases.

The number of cases does not increase as the functionality becomes more complex. The second advantage is that adding new parameters that testing has to take into account does not necessarily increase the number of cases.

For example, let’s add a parameter button.color.textColor with the equivalence classes no_color and rnd_color to our button. There will still be 4 cases; each of them just gets one more parameter:

The number of cases increases only if some parameter has more values than there are existing cases.
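This rule is easy to check with a small Python sketch: leaving unsafe values aside, the number of cases equals the length of the longest list of values. The function name is mine, and the third example simply reuses the full list of seven button styles from the complete schema.

```python
def case_count(classes):
    """With one-parameter coverage, the longest value list sets the number of cases."""
    return max(len(values) for values in classes.values())


button = {
    "button.active": [True, False],
    "button.text": ["text_s", "text_m"],
    "button.color.style": ["primary", "secondary", "outline", "custom"],
    "button.color.backgroundColor": ["no_color", "rnd_color"],
}

# A new parameter with two values fits into the existing four columns.
with_text_color = {**button, "button.color.textColor": ["no_color", "rnd_color"]}

# A parameter with seven values needs seven columns.
with_all_styles = {
    **button,
    "button.color.style": [
        "primary", "secondary", "outline", "outlineDark",
        "outlineLight", "textLink", "custom",
    ],
}

print(case_count(button))           # 4
print(case_count(with_text_color))  # 4
print(case_count(with_all_styles))  # 7
```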

Important things may be checked more often. By enriching the values (step 2 in the algorithm), higher priority or riskier values can be checked more frequently.

For example, if we know that users used short text more often before, but now they use longer text, we can enrich cases with longer text and get into real user cases more often.

Can be automated. Automation is quite possible with the above algorithm. Certainly, the cases generated by the algorithm will look less like the real ones than the ones generated by a human being. At least in terms of color selection and text cropping.

But on the other hand, the cases are generated during development without a tester being involved, and this greatly shortens the feedback loop.

Disadvantages

Of course, such case generation is far from a silver bullet and has its drawbacks.

Difficult to analyze results. As you may have noticed, test data is mixed when cases are generated. As a result, when a case fails, identifying the cause of the failure becomes more complicated, because some of the parameters used in the case don’t affect the test result in any way.

On the one hand, this makes the analysis of test results harder. On the other hand, if the object under test requires a large number of mandatory parameters, finding the cause of a bug is difficult anyway, whichever method produced the cases.

Bugs can be missed. Going back to the very beginning of this article, when using this method, we allow the possibility of missing bugs caused by a combination of two or more parameters. But we win in speed, so it’s up to you what is more important in each specific project.

To avoid missing the same bug twice, we adopted a Zero Bug Policy and started closing every missed bug with an additional test case, written manually rather than generated automatically. This gave excellent results: we now have more than 150 blocks (tested objects), several releases per day, and from 0 to 3 missed non-critical bugs per month.

Conclusions

If your object under test has a large number of input parameters and you want to try to reduce the number of cases and thus the time for testing, I recommend trying the above method of generating cases using a single parameter.

In my opinion, it’s ideal for front-end components: it can cut testing time by more than a factor of three, for example when checking appearance with screenshot tests. Development also gets faster, because cases are created at the earliest stages.

Of course, if you are testing the autopilot of the new Tesla, even a small probability of missing a bug cannot be neglected. But in most cases, don’t forget that speed is a very important quality criterion in the modern world, and the gain in speed outweighs the cost of a few minor bugs that slip through.
