Test Automation Demystified, Part 3: Choosing a Test Automation Tool - 8 Features That Matter

Jun 12, 2019

When Robinson Crusoe was wrecked on a desert island, his broken ship became a valuable source of provisions. It is easy to imagine how much harder his life on the island would have been without the food, drink, clothes, weapons and tools that survived the wreck. He had everything for his immediate needs and for the sustainable development of his household. Remove anything from this list and the situation becomes dangerous: starvation in the short term without food and drink, starvation in the long term without weapons and tools, and exposure to the elements without tools to build a shelter.

We are like Robinson when starting a new test automation project. Without proper tools and skills, we are doomed to fail.

Let’s talk about the tools that matter. What features must our test automation tool provide?

I’ll begin with the food and drink of the test automation world: identification of elements and user input simulation.

1 — Identification of Elements

There are different ways of locating a UI element within an application under test.

By Coordinates

We may assume that an element is always positioned at the same place on the screen. All we need to remember in this case is the mouse coordinates. This method works until the application under test is shown at a different screen location or on a screen with a different resolution, or until the layout of elements within the application is modified. It’s like a parking lot: today our car is parked at A5, but tomorrow another car may be there and the coordinates are no longer valid.
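The fragility of coordinates can be illustrated with a small sketch. It does not drive a real mouse. It only models a button as a rectangle, and all positions and sizes below are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        """Return True if the point (x, y) lies inside this rectangle."""
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


# Hypothetical screen position of a button at recording time.
button_at_recording = Rect(left=400, top=300, width=80, height=24)
recorded_click = (440, 312)  # absolute screen coordinates saved by the recorder

# The same button after the application window was moved 150 px to the right.
button_at_playback = Rect(left=550, top=300, width=80, height=24)

hits_at_recording = button_at_recording.contains(*recorded_click)
hits_at_playback = button_at_playback.contains(*recorded_click)
print(hits_at_recording, hits_at_playback)  # True False
```

The recorded point still exists on screen during playback, but the button is no longer under it, so the click lands on whatever happens to be there.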

By Image

Image recognition works well these days, so we can remember what an element looks like and then use the image to find the element on screen during test playback. This method works until the application design changes or there are many similar-looking elements. To draw an analogy with face recognition: a system may fail to find the right person when dealing with twins, people wearing masks, or when we need a person playing a particular role, for example the captain of a football team.

Searching for the restaurant icon will return multiple matches. Sometimes that is acceptable and sometimes not, depending on the testing scenario.

By Path

Automation tools recognize the hierarchy of elements within applications. The most popular and reliable way of locating elements is to remember parent-child relationships and element attributes. This is similar to a postal address: you specify the country, city, zip code, street name, building number and person’s name, and the correspondence easily finds its way.

If we test a web application, such a path is an XPath expression:

/html/body/button[@id='LoginButton']
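To see a path like this at work without a browser, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. The page markup is a made-up example, and because the search starts at the `<html>` root element, the leading `/html` of the expression becomes `.` here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page used only for illustration.
PAGE = """
<html>
  <body>
    <input id="UserName" />
    <button id="LoginButton">Log In</button>
    <button id="CancelButton">Cancel</button>
  </body>
</html>
"""

root = ET.fromstring(PAGE)  # root is the <html> element

# Follow the parent-child chain html -> body -> button and filter by attribute.
button = root.find("./body/button[@id='LoginButton']")
print(button.text)  # Log In
```

A real automation tool evaluates full XPath against the live DOM. The sketch only shows how the hierarchy plus an attribute narrows the search down to one element.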

By Match Probability

When an element cannot be found by one of the methods described above, it means that either the element is not displayed or its properties have changed. Imagine that a button we used to find on a web page by its name now has a different name, for example Log In was renamed to Login. This is not a problem for manual testing, but it can break an automated test. To help fix such a test, automated testing tools try to find the element that is more similar to the target element than any other element on screen. The degree of similarity is expressed as a number from 0 to 1. Probabilistic matching is under active research and development and is already available in some tools.
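One simple way to get a similarity number between 0 and 1 is string comparison. The sketch below uses `difflib.SequenceMatcher` from the Python standard library on element names only. This is an illustrative stand-in, not how any particular tool computes its score, and the list of on-screen names is hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity ratio between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(target: str, candidates: list[str]) -> tuple[str, float]:
    """Return the candidate most similar to the target, with its score."""
    scored = [(name, similarity(target, name)) for name in candidates]
    return max(scored, key=lambda pair: pair[1])


# The test was recorded against "Log In", but the button is now called "Login".
names_on_screen = ["Login", "Cancel", "Help"]
name, score = best_match("Log In", names_on_screen)
print(name, round(score, 2))
```

A real tool would likely weigh many attributes (type, id, position, neighbours) rather than the name alone, but the idea is the same: pick the highest-scoring candidate and report how confident the match is.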

2 — User Input Simulation

When an element is found, we typically want to click on it or enter text. An automation tool must be able to do that. There are two major ways of simulating input.

The first is emulating real keyboard and mouse input via low-level operating system events. It works for both desktop and web applications. The limitation is that you cannot run tests that emulate physical input devices in parallel, or use the computer for anything else while such a UI test is running.

The second way is to send application-level events. It works with web applications and allows tests to be executed on remote or headless browsers.

Element identification and user input simulation are the two must-have features for any test automation tool. Without either of them, UI test automation is not possible. Other features are optional but still very important.

3 — Application Inspection

Inspection, or spying, is a way to analyze the internal hierarchy of UI elements inside an application. Here is an example of a UI Automation tree of a Windows desktop application:

Here is an example of a DOM tree of a web application:

Why analyze the hierarchy of UI elements? There are two main reasons:

Build a better path for identifying an element. Better means resistant to changes in the application under test. This is most frequently used for web applications.
Understand the internal structure of a complex UI control, like a tree or a table, in order to implement high-level actions such as clicking a cell at a specific column and row.
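Both reasons can be sketched on a tiny table. The markup, the `dump` helper and the `get_cell` helper below are hypothetical illustrations built on Python's standard `xml.etree.ElementTree`. A real spy tool works against the live application rather than a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical table control used only for illustration.
TABLE = """
<table id="Users">
  <tr><th>Name</th><th>Role</th></tr>
  <tr><td>Alice</td><td>Admin</td></tr>
  <tr><td>Bob</td><td>Tester</td></tr>
</table>
"""


def dump(element, depth=0):
    """Return the element hierarchy as indented lines, like a spy window."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(dump(child, depth + 1))
    return lines


def get_cell(table, row, column_name):
    """High-level action: find the cell in a data row by column header."""
    header = [th.text for th in table.findall("./tr/th")]
    column = header.index(column_name)
    data_rows = [tr for tr in table.findall("./tr") if tr.find("./td") is not None]
    return data_rows[row].findall("./td")[column]


table = ET.fromstring(TABLE)
print("\n".join(dump(table)))
print(get_cell(table, 1, "Role").text)  # Tester
```

Once the structure is known (header row, then data rows, then cells), a test can say "the Role cell of the second row" instead of hard-coding a long path to one specific cell.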

4 — Recording

Recording reduces the time needed to build a test. During recording, an automation tool intercepts keyboard and mouse input, captures user actions (like clicking a button or entering text into a field) and captures the information it needs to identify UI elements during replay. After recording we get:

Data for automatic identification of UI elements within the application being tested
and the steps the user performed during recording.

For example, the identification data for the Username element captured during recording of the test shown above includes the path:

//input[@id='MainContent_LoginUser_UserName']

Recording may significantly reduce the time needed to work out the path of an element.
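The two outputs of recording can be pictured as two small data structures: a locator repository and a list of steps. The sketch below is one possible shape, not the format of any specific tool. Only the Username path comes from the example above. The Password path, the `//button` path, the typed values and the `Step` and `resolve` names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str      # "set_text" or "click"
    target: str      # logical element name, resolved through the locator map
    value: str = ""  # text to type, empty for clicks


# Locator repository: logical name -> path captured during recording.
# Only the Username path is taken from the article; the rest is hypothetical.
locators = {
    "Username": "//input[@id='MainContent_LoginUser_UserName']",
    "Password": "//input[@id='MainContent_LoginUser_Password']",
    "LoginButton": "//button[@id='LoginButton']",
}

# Steps the user performed, in order.
steps = [
    Step("set_text", "Username", "alice"),
    Step("set_text", "Password", "secret"),
    Step("click", "LoginButton"),
]


def resolve(step):
    """Pair a recorded step with the path needed to find its element on replay."""
    return (step.action, locators[step.target], step.value)


for step in steps:
    print(resolve(step))
```

Keeping locators separate from steps means that when a path changes, it is fixed once in the repository and every step that refers to the element picks up the fix.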

5 — Codeless Testing

Codeless means creating tests without programming. The first step to codeless testing is recording. The second is the ability to modify recorded steps without writing code in a text editor.

The third is the ability to parameterize a test and make it data-driven without coding.
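Under the hood, a data-driven test is the same list of steps replayed once per row of data. A codeless tool hides this behind a spreadsheet-like UI, but a rough sketch of the mechanism may help. The step tuples, placeholder syntax and data rows below are all hypothetical.

```python
# Recorded steps with placeholders instead of the values typed during recording.
recorded_steps = [
    ("set_text", "Username", "{username}"),
    ("set_text", "Password", "{password}"),
    ("click", "LoginButton", ""),
]

# One row per test run, for example loaded from a spreadsheet.
data_rows = [
    {"username": "alice", "password": "secret1"},
    {"username": "bob", "password": "secret2"},
]


def bind(steps, row):
    """Substitute the values of one data row into the step placeholders."""
    return [(action, target, value.format(**row)) for action, target, value in steps]


runs = [bind(recorded_steps, row) for row in data_rows]
for run in runs:
    print(run)
```

Adding a new test case then means adding a row of data, not recording or editing steps.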

There are other features that contribute to codeless testing. I’ll describe them in the next article.

6 — Reporting

Test execution is a crime scene. If something bad happens, we should have enough evidence to find the culprit.

Several assistants can make investigation easier.

Auto Reporting

The report should register all interactions with UI elements and their outcomes. We should know the exact point in a test where problems started. Let’s assume we have a sequence of test steps:

And LoginButton is not available on screen. After test execution we automatically get:
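A report of this kind could be produced by a loop like the one in the following sketch. It is a simplified model, not the output of a real tool: the `run_with_report` function, the step names other than LoginButton and the set of on-screen elements are assumptions made for illustration.

```python
def run_with_report(steps, available_elements):
    """Execute steps in order, logging each outcome and stopping at the first failure."""
    report = []
    for number, (action, target) in enumerate(steps, start=1):
        if target in available_elements:
            report.append((number, action, target, "passed"))
        else:
            report.append((number, action, target, "failed: element not found"))
            break
    return report


steps = [
    ("set_text", "Username"),
    ("set_text", "Password"),
    ("click", "LoginButton"),
]
on_screen = {"Username", "Password"}  # LoginButton is missing

report = run_with_report(steps, on_screen)
for entry in report:
    print(entry)
```

Because every interaction is logged as it happens, the last entry points directly at the step where the problem started.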

Assertions

In some cases we want to check an assumption about the application state, for example that we see the Log Out button if a user is logged in.

In this case a report should display why a specific assertion fails.

The report shows that Log In text was found where Log Out was expected.
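A minimal sketch of an assertion that explains itself might look like this. The `check_text` helper and the `LogoutButton` element name are hypothetical. The point is only that the report entry carries both the expected and the actual value.

```python
def check_text(element_name, expected, actual):
    """Compare expected and actual text and return a report entry."""
    passed = expected == actual
    if passed:
        message = f"{element_name}: text is '{actual}' as expected"
    else:
        message = f"{element_name}: expected '{expected}' but found '{actual}'"
    return {"passed": passed, "message": message}


entry = check_text("LogoutButton", expected="Log Out", actual="Log In")
print(entry["message"])  # LogoutButton: expected 'Log Out' but found 'Log In'
```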

Screenshots

A picture is worth a thousand words. There must be a way to take screenshots at certain points during test execution. Let’s take the previous example, when login failed for some reason. If we take a screenshot after pressing LoginButton, we’ll quickly find out what happened from the report.

7 — Maintenance Tools

These are the tools that help keep tests in good shape, fight flakiness and adapt to application changes.

Resilient Locators

Developers change applications, usually to improve existing features and add new ones. This may lead to changes in the UI, so UI elements may change their attributes and layout. For example, a button may be renamed to better reflect its purpose or to increase the probability of a user click. A button may be moved to another location on a page. As a result, the information an automation tool captured about the button during recording (let’s call it a locator) may not be enough to find the button in the new version of the application. In this case the locator must be fixed. If, however, the locator is good enough to remain valid after small changes in the tested application, it is called resilient.

Let’s look at a simple example. Assume we have a web page:

And during recording we remember the text of the button: Login. If developers rename the button to Log In, our locator will fail to find the button because there is no longer a button named Login. If we instead remember the id of the button, LoginButton, then no change to the button name will break the locator, and we’ll refer to it as resilient.
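The difference between the two locators can be checked with a short sketch. It uses Python's standard `xml.etree.ElementTree`, whose XPath subset supports matching by attribute and by text (the `[.='text']` form needs Python 3.7 or later). The two page versions are minimal hypothetical markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before and after the button was renamed.
PAGE_V1 = "<html><body><button id='LoginButton'>Login</button></body></html>"
PAGE_V2 = "<html><body><button id='LoginButton'>Log In</button></body></html>"

BY_TEXT = ".//button[.='Login']"         # remembers the visible text
BY_ID = ".//button[@id='LoginButton']"   # remembers the id


def is_found(page, locator):
    """Return True if the locator finds an element on the given page."""
    return ET.fromstring(page).find(locator) is not None


for label, locator in [("by text", BY_TEXT), ("by id", BY_ID)]:
    print(label, is_found(PAGE_V1, locator), is_found(PAGE_V2, locator))
# by text True False
# by id True True
```

The id-based locator survives the rename. It would of course break if developers changed the id instead, which is why resilience is always relative to the kind of change that is likely.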

Re-Learn Locator

If the locator of an element is broken, there must be a quick way to re-learn it. This can be done semi-automatically: an automation tool may ask a tester to click on the element and then generate a new locator automatically.

Or the tester can use application inspection (Spy) to build a better locator manually.

Probabilistic Matching

So far we have talked about locators that require a 100% match between an element and the information captured about it during recording. If we remember an element’s name and the name changes, the locator is considered broken. If we remember the id of an element (remember LoginButton) and the id changes, the locator is considered broken too. Normally, if an element still exists on a page after an application update, only a few of its attributes have changed and a lot of information about it remains the same. This observation can be used to find the element that matches best (for example with >95% confidence). A tester may then decide whether the element found is the one that was actually needed. Furthermore, if the match probability is greater than some threshold, for example 99%, the locator can be fixed automatically; if it is lower, quick assistance from a tester may be requested in the form of a question to confirm or decline the matched element.
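The decision logic described above fits in a few lines. The sketch below uses the example thresholds from the text (95% and 99%). The constant names, the function name and the returned labels are assumptions, and how the score itself is computed is left out.

```python
AUTO_FIX_THRESHOLD = 0.99  # above this, fix the locator without asking
CONFIRM_THRESHOLD = 0.95   # above this, ask the tester to confirm the match


def decide(score: float) -> str:
    """Map a match probability (0 to 1) to a maintenance action."""
    if score > AUTO_FIX_THRESHOLD:
        return "auto-fix locator"
    if score > CONFIRM_THRESHOLD:
        return "ask tester to confirm"
    return "report locator as broken"


for score in (0.995, 0.97, 0.50):
    print(score, decide(score))
```

In practice the thresholds would be tuned per project: a stricter auto-fix threshold means more questions to the tester but fewer silently wrong fixes.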

Thus, we get three mechanisms. A resilient locator reduces the number of cases when maintenance is needed. Re-learning provides the ability to fix a test manually (relatively fast and easy). Probabilistic matching provides a way to fix a test fully automatically or with very little manual effort.

8 — Test Management

When you make the first automated test for an application it is a move from 0 to 1. You solve problems like:

how to reliably identify elements
how to better interact with elements
how to deal with complex controls like trees and tables
how to validate application state
and many more of this kind …

Then you start creating many tests, and this is a leap from 1 to many. This is where you need test management tools. Your automation tool should either contain test management features or be very well integrated with a standalone test management solution.

Some of the major features of test management are:

Test case management
Test case execution
Reporting

A test case is a unit of test automation. It can be linked to requirements, to an automated test implementing the test case, and to bugs found during execution of the test.

When tests are ready, we want to execute them. It should be possible to do it

based on a schedule (daily, monthly),
on-demand (immediately),
and upon specific events (for example application build).

When tests are executed, we want to analyze results.

Integrated Solution

When all the components described above are integrated into one solution, it saves a lot of time. It’s like Robinson having food, drink, weapons, tools and clothes on one ship rather than on five different ships scattered along the coastline.

Integration provides many benefits.

Recording, modification, execution and debugging without switching between different tools.

The ability to inspect an application and capture the locator of an element to fix a test or make it more resilient to application changes.

Running tests whenever you want and getting the results in a centralized repository for analysis.

The features I described in this article are just the tip of the iceberg. They are the most important ones. They make life possible on the desert island. Like Robinson, we can survive with them. There are, however, many more features that bring the comfort of city life: advanced application inspection, organizing tests into a framework, extensibility (support for elements with complex internal structure like trees and tables), test data management and others. But that’s another story. See you next time.

If you have questions, reach me on Twitter @dmarkovtsev. I am always happy to help :-)