Accelerate Web Test Automation, Part 1

Performance might be the last thing you think about when developing test automation. However, sometimes the performance of your tests becomes so important that it can undermine the value of the test results. In this Accelerate Web Test Automation series, we’re going to talk about various performance improvements that we’ve made to our web test automation code infrastructure.

Start Simple

We (the Testarmada team) announced Magellan-Nightwatch (MN), the Nightwatch.js adaptor, back in 2014. By injecting jQuery, via selenium’s execute API, into the page where a command or assertion is going to be executed, MN always makes sure that the element you want to operate on is safe to operate on.

For example, if you want to click a button with id `submit` via MN’s clickEl API, MN won’t click until the button has been found visible in three consecutive checks. Each check runs jQuery’s :visible selector against the button and returns true if the button is visible.

MN has been used as our main patch to Nightwatch.js at WalmartLabs since the day it was invented. Our test automation teams were so spoiled by jQuery’s awesome selectors, like :visible:eq, etc., that they didn’t have to worry about adding redundant commands to check whether the element they wanted to operate on was operable in order to avoid flakiness. In late 2015, http://testarmada.github.io was chosen as the standard automation solution for WalmartLabs’ React platform. Every git pull request needed a full pass of the smoke tests to keep things going. Things were almost too good to be true.

As traffic grows, there is always performance improvement to be done

In early 2016, the first performance issue came to us. One of the teams had contributed more than 170 `smoke` tests, and the suite execution time exceeded 1 hour. For that team, people had to wait at least 1 hour to have code reviewed and merged. Hmm…

Why slow?

The way MN did the visibility check was to keep sending selenium requests that executed jQuery’s :visible selector until it got three consecutive true responses. Let’s do the math. If a user calls clickEl, MN makes at least 4 selenium calls altogether (3 visibility checks with positive responses and 1 click). This 4-selenium-call pattern applies to all the MN commands and assertions, as they all deal with the :visible check in the same way. The number of selenium calls causes the slowness. Even worse, the slowness is exacerbated by a slower network connection.
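The call counting above can be sketched in a few lines. This is only an illustration, not MN’s actual code: `createFakeSession` and `clickWhenVisible` are hypothetical names, and the fake session merely counts round trips instead of talking to selenium.

```javascript
// Hypothetical stand-in for a selenium session; it only counts round trips.
function createFakeSession(visibilityResponses) {
  let calls = 0;
  let index = 0;
  return {
    // Simulates one synchronous execute round trip that runs a :visible check.
    checkVisible() {
      calls += 1;
      // Once the scripted responses run out, the element stays visible.
      const visible =
        index < visibilityResponses.length ? visibilityResponses[index] : true;
      index += 1;
      return visible;
    },
    // Simulates the final click round trip.
    click() {
      calls += 1;
    },
    get calls() {
      return calls;
    },
  };
}

// Polls until the element was visible `requiredHits` times in a row, then clicks.
// Returns the total number of simulated selenium calls.
function clickWhenVisible(session, requiredHits = 3, maxChecks = 50) {
  let hits = 0;
  for (let i = 0; i < maxChecks && hits < requiredHits; i += 1) {
    // A negative response resets the streak.
    hits = session.checkVisible() ? hits + 1 : 0;
  }
  if (hits < requiredHits) {
    throw new Error('element never became visible');
  }
  session.click();
  return session.calls;
}

console.log(clickWhenVisible(createFakeSession([])));             // 4
console.log(clickWhenVisible(createFakeSession([false, false]))); // 6
```

With no negative responses the click costs 4 calls; every negative response that arrives before the element settles adds one more.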

Let’s abstract this at a high level. Assume every selenium call takes the same time, say n milliseconds. For a given test suite, the number of selenium calls is (m + 4), where m is the number of negative responses from the visibility check. There is one more factor, w, the concurrency per execution. So here comes the equation

t (total suite execution time) = (n * (m+4)) / w

With a fixed n, we can shorten t by either increasing w or decreasing (m+4). To increase w, we introduced sharding, meaning we can have more concurrency per execution. Sharding won’t be covered in this part 1.

To decrease (m+4), we need to get rid of as many selenium calls as possible.
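To make the two levers concrete, here is the equation as a small function. The numbers are made up purely for illustration and are not measurements, and `suiteTimeMs` is a hypothetical helper name.

```javascript
// t = (n * (m + 4)) / w, using the symbols from the equation above.
// n: time per selenium call in ms, m: negative visibility responses,
// w: concurrency per execution.
function suiteTimeMs(n, m, w) {
  return (n * (m + 4)) / w;
}

// Illustrative values only.
const baseline = suiteTimeMs(200, 6, 1);        // 200 * 10 / 1
const moreConcurrency = suiteTimeMs(200, 6, 2); // 200 * 10 / 2
const fewerCalls = suiteTimeMs(200, 2, 1);      // 200 * 6 / 1

console.log(baseline, moreConcurrency, fewerCalls); // 2000 1000 1200
```

Doubling w halves t, while every selenium call removed saves n / w milliseconds.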

Can we decrease m?

A couple of nondeterministic factors can impact m, like network speed (the slower the network is, the larger m could be) or the processing speed of the app server under test (the slower the processing speed is, the larger m could be). Without much context about the execution environment, we cannot really predict what m will be.

Can the 4-selenium-call pattern be changed?

Before answering the above question, let’s take a closer look at the 4-selenium-call pattern. Three out of the four calls are actually doing the same thing: inject jQuery if needed and run the jQuery selector. Most importantly, the injection goes through selenium’s synchronous execute API, meaning each jQuery injection and selector operation occupies a single selenium call. However, things don’t have to be this way.

Can we do more things in one selenium call?

The answer is definitely YES. But it doesn’t mean we can do it with selenium’s synchronous execute API. What if the injection happens while the page is still being loaded? And what if all the injections happen at that moment? Since we cannot assume how long the page takes to be fully loaded, there has to be a delay between injections. However, the execute API won’t wait for the delay; it always returns immediately. To do more things in one selenium call, we cannot use the synchronous execute API anymore.

As most of you might have already guessed, the answer is to use the asynchronous version of the execute API, executeAsync. The idea is correct. Instead of returning immediately, executeAsync calls the callback once it has finished everything. So the 3 selenium calls can be reduced to 1 selenium call, in which a loop checks the element’s visibility by executing the jQuery selector and uses setTimeout to wait between steps. Sounds like exactly what we need, right? Is this solution good to go?
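The loop described above might look roughly like the sketch below. It is not MN’s implementation: `pollVisibility`, `isVisible`, and `schedule` are hypothetical names, introduced so the sketch runs under Node without a browser. In a script injected through executeAsync, `isVisible` would be a jQuery :visible lookup, `schedule` would be `setTimeout`, and `done` would be the callback that executeAsync hands to the script.

```javascript
// In-page polling loop that a single executeAsync call could run.
function pollVisibility(isVisible, schedule, options, done) {
  const { requiredHits = 3, maxChecks = 20, intervalMs = 100 } = options;
  let hits = 0;
  let checks = 0;

  function tick() {
    checks += 1;
    // A negative check resets the streak, so the hits must be consecutive.
    hits = isVisible() ? hits + 1 : 0;
    if (hits >= requiredHits) {
      done({ visible: true, checks });
    } else if (checks >= maxChecks) {
      // Give up so the call cannot hang forever.
      done({ visible: false, checks });
    } else {
      // Wait before the next check instead of making another round trip.
      schedule(tick, intervalMs);
    }
  }

  tick();
}
```

However many checks the loop needs, the test runner pays for only one round trip, because the waiting happens inside the page.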

The hidden benefit of using the synchronous execute API is that it always returns, whether or not the page is ready for JavaScript injection. executeAsync, however, cannot guarantee this. The executeAsync API looks like gold, yet it is not good enough.

What if we mix the benefits of both synchronous and asynchronous? The function would return immediately if the page isn’t operable (still loading resources, rendering, or on the wrong page), and do the asynchronous visibility check only if the page is ready (JavaScript injectable).

Yeah! This is it!

How would this solution help?

The optimization comes from the element’s visibility check when the page is ready.

  1. If the page isn’t ready, the executeAsync API works the same as the execute API.
  2. If the page is ready, the 3 selenium calls for the visibility check can now be packed into 1 call.
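The two cases above can be sketched as follows. This is a rough illustration under stated assumptions, not MN’s code: `page`, `isPageReady`, and `visibilityCheck` are hypothetical names, and the readiness test (document fully loaded and a visibility helper present) is our guess at what "ready" could mean. `page` and `schedule` are stand-ins so the sketch runs under Node; in an injected script they would be the real document, jQuery, and `setTimeout`.

```javascript
// Assumed readiness test: the document has finished loading and the
// visibility helper (standing in for injected jQuery) is available.
function isPageReady(page) {
  return page.readyState === 'complete' && typeof page.isVisible === 'function';
}

function visibilityCheck(page, schedule, done, requiredHits = 3, maxChecks = 20) {
  if (!isPageReady(page)) {
    // Case 1: behave like the synchronous execute call and report back at once.
    done({ ready: false, visible: false, checks: 0 });
    return;
  }

  // Case 2: run all the visibility checks inside this single call.
  let hits = 0;
  let checks = 0;
  (function tick() {
    checks += 1;
    // A negative check resets the streak.
    hits = page.isVisible() ? hits + 1 : 0;
    if (hits >= requiredHits || checks >= maxChecks) {
      done({ ready: true, visible: hits >= requiredHits, checks });
    } else {
      schedule(tick, 100);
    }
  })();
}
```

The caller can simply retry when it gets `ready: false`, which keeps the always-returns behavior of the synchronous API while still saving round trips once the page is up.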

The measurement

Time to test the improvement. We ran two bi-hourly tasks for 10 days, using electrode checkout with 171 tests in three browsers (513 tests in total), one with the performance improvement library and one with the old library, and got the following data.

  • Non-optimized: on average, 41 min for magellan tests to finish (mean of 35 samples)
  • Optimized: on average, 28 min for magellan tests to finish (mean of 35 samples)

Improvement: (41 - 28) / 41 = 31.7%

Conclusion

Things got noticeably better with this first performance improvement. If your test automation infrastructure is similar to ours, and you’re suffering from slow execution, please try this and see if it helps. Please leave me a message here in the comments to share your experience, so I can respond. If you have specific things you’d like me to write about regarding this article or web test automation, let me know and I’ll do my best to share what I’ve learned.

We’re still learning and growing. Stay with us.

To be continued…