AI-based Test Automation Tools

Iryna Suprun
Jan 7 · 6 min read

Part 2: An Overview of AI-Based Test Tools

Photo by Markus Winkler on Unsplash

Part 1 of this article, “The Current State of AI in Testing,” can be found here

Part 3 of this article, “AI Automation in the Wild,” can be found here

Part 4 of this article, “Conclusions,” can be found here

Test authoring: At first glance, the most visible difference between traditional and AI-based QA tools is the way tests are authored. With traditional tools, authoring is a manual process that starts with defining test cases, validations, and assertions, and ends with coding the test script and verifying that it actually works. Most AI-based test automation tools instead record an actual user (a test engineer or a customer) manually executing the test case and then create the automated tests from that recording. Some testing tools that do not use AI also rely on recording to create automated tests. The main difference is that AI tools collect tens of data points while recording, which are later processed and used to generate and improve automated tests and to increase their stability. Other, more sophisticated AI-based tools use different ML algorithms to auto-generate assertions and tests from logs, from real user clicks, or just from the software under test itself.

Test maintenance: Keeping test automation code up to date is a common pain point with traditional automation tools, because the work is manual and laborious. Tools that use ML algorithms offer self-healing features, so tests are automatically updated to reflect changes. AI-based tools automatically adapt to UI changes, such as a changed XPath, tag, or other attribute of an element. They use data collected during test creation to identify the same element on the page. The user can still decline the suggested test change and roll back to the previous test version.

Support: AI tools are less mature and less widely used than traditional testing tools. As a result, users who encounter a problem will most likely need to go to the tool’s support team. There is also less information available about AI tools in general. For example, comparison charts, use cases, real reviews, and documented experiences of real users are hard to find. So, if you are considering adopting an AI tool, be ready to do a lot of legwork by yourself.

Cost: There are plenty of open-source or free versions of traditional test tools. Open-source versions of AI-based test tools are rare, as are free versions of these tools.

Compatibility: There are plenty of traditional testing tools for every type of application, but most AI tools only support automation of Web applications. There are only a couple of AI tools that provide automation support for mobile apps.

Almost every modern testing tool available on the market claims some usage of AI. AI-based tools can be grouped into three main categories:

  • The first group consists of intelligently designed tools that can help solve some automation issues but don’t use any ML algorithms. These tools really only have AI in their promotional materials. This does not mean that they can’t solve some testing challenges, or that they are worse than AI-based tools. It only means there are no actual ML algorithms in their code. As a result, these tools fall outside the scope of this post and are excluded from the rest of the discussion.
  • The second category consists of tools that use AI/ML in a supporting role. These tools help QA groups perform certain tasks that are, due to human limitations, hard to do manually (e.g., tasks that rely on visual testing). They also excel at testing that, if done by people, would rely on subjective measurements instead of a set of objective parameters (e.g., video/audio quality). Test authoring with such tools is done with the active involvement of end-users (e.g., test generation based on collecting, extracting, and processing logs, clicks, and events) or members of the engineering team (test recording).
  • The last category of AI-based tools is made up of tools that take the software under test as their main and only input and generate bug reports as output without any human interaction (Level 5 of autonomy). No tool currently on the market is a Level 5 tool. A few tools have Level 5 features, but most of their offering consists of Level 2 to Level 4 features (a description of the test automation autonomy levels can be found here).

As the majority of testing tools use AI/ML in a supporting role (the second category), this post will focus on them and discuss the features that make them very solid competitors to traditional tools.

One area where AI/ML-based tools excel is features that decrease the time spent on tasks that people normally perform. These tasks include the following:

Codeless script generation. Writing code for complex end-to-end scenarios can take as long as the development of a new feature. With codeless script generation, however, even non-technical people (e.g., end-users conducting User Acceptance Testing) can record their actions, which a smart tool will then convert into automated tests. Even so, an automation engineer will most likely not be able to add these recordings to the test suites as they are. Some manual work will still be required, such as setting up users and ensuring that object names comply with naming conventions. Nevertheless, this feature can significantly decrease the time spent on the initial automation of test cases and expand coverage.

Self-healing. Test case maintenance is probably the task most hated by developers and QA staff. It takes a lot of time and slows everyone down. Because AI tools can collect a lot of information about every element on the webpage, if one attribute is changed, they can still recognize this element and proceed with the test execution. Some tools also collect information on how the application is used (e.g., user flows, errors, etc.) and are able to recognize insignificant changes and adapt.
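To make the idea more concrete, here is a minimal sketch of how a self-healing lookup could work. It is not how any particular tool does it: the attribute names, the sample values, the equal weighting of attributes, and the 0.5 threshold are all assumptions made for illustration. The tool remembers several attributes of an element at recording time and, when the page changes, picks the candidate that still matches most of them.

```python
from typing import Dict, List, Optional

# Attributes captured for a "Submit" button at recording time.
# All names and values here are hypothetical.
RECORDED = {
    "tag": "button",
    "id": "submit-btn",
    "text": "Submit",
    "css_class": "btn primary",
    "xpath": "/html/body/form/button[1]",
}


def similarity(recorded: Dict[str, str], candidate: Dict[str, str]) -> float:
    """Fraction of recorded attributes that the candidate still matches."""
    matches = sum(1 for key, value in recorded.items()
                  if candidate.get(key) == value)
    return matches / len(recorded)


def heal_locator(recorded: Dict[str, str],
                 candidates: List[Dict[str, str]],
                 threshold: float = 0.5) -> Optional[Dict[str, str]]:
    """Return the best-matching candidate, or None if nothing is close enough."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: similarity(recorded, c))
    return best if similarity(recorded, best) >= threshold else None


# The page after a UI change: the button's id and xpath are different now.
page_elements = [
    {"tag": "a", "id": "help", "text": "Help", "css_class": "link",
     "xpath": "/html/body/a[1]"},
    {"tag": "button", "id": "send-btn", "text": "Submit",
     "css_class": "btn primary", "xpath": "/html/body/div/form/button[1]"},
]

healed = heal_locator(RECORDED, page_elements)
print(healed["id"] if healed else "no confident match, fail the test")
```

A real tool would likely weight attributes differently and learn those weights from data. The threshold is what keeps the test from silently "healing" onto the wrong element.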

Converting test documentation to automation. Using natural language processing mechanisms, it is possible to convert tests written in English to automated tests. Some tools advertise that they can do it with both structured test plans and unstructured user journeys.
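For structured test plans, the conversion can be pictured as mapping each sentence to an action with parameters. The sketch below uses plain regular expressions as a simple stand-in for the NLP that real tools apply, so it only understands the few phrasings listed in it. The step wording, the action names, and the example URL are all hypothetical.

```python
import re

# Each pattern maps one phrasing of a test step to a (hypothetical) action name.
PATTERNS = [
    (re.compile(r'^open (?P<url>\S+)$', re.I), "open"),
    (re.compile(r'^type "(?P<text>[^"]*)" into (?:the )?(?P<target>.+)$', re.I),
     "type"),
    (re.compile(r'^click (?:the |on )?(?P<target>.+)$', re.I), "click"),
    (re.compile(r'^verify (?:that )?(?:the )?page shows "(?P<text>[^"]*)"$', re.I),
     "assert_text"),
]


def parse_step(step: str) -> dict:
    """Turn one English test step into an action dict, or mark it unparsed."""
    cleaned = step.strip().rstrip(".")
    for pattern, action in PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return {"action": action, **match.groupdict()}
    return {"action": "unparsed", "text": step}


test_plan = [
    'Open https://example.test/login',
    'Type "alice" into the username field',
    'Click the Sign in button.',
    'Verify that the page shows "Welcome"',
    'Make sure it looks nice',
]

actions = [parse_step(step) for step in test_plan]
for action in actions:
    print(action)
```

The last step is deliberately vague and comes back as "unparsed". Unstructured user journeys are full of steps like that, which is where actual language models are needed and where a human will probably still have to review the result.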

The second group of features, those that perform tasks that are very challenging or almost impossible for humans, includes the following:

Autonomous building of test cases from the usage traffic of real users. AI tools collect analytics data from an application’s clickstream, analyze it, and create test cases based on real system usage. They identify core patterns (sequences) and then run them in the test environments to improve their scripts using ML algorithms (e.g., to remove optional steps and duplicate flows).
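The pattern-identification step can be sketched in a few lines. This is a deliberately naive version that assumes the clickstream has already been split into per-user sessions of page names. The session data, the window length, and the support threshold are made up for the example, and simple counting stands in for whatever ML a real tool uses.

```python
from collections import Counter

# Hypothetical clickstream, already grouped into one list of pages per session.
sessions = [
    ["home", "search", "product", "cart", "checkout"],
    ["home", "search", "product", "cart", "checkout"],
    ["home", "product", "cart", "checkout"],
    ["home", "search", "product"],
    ["home", "help"],
]


def frequent_flows(sessions, length, min_support):
    """Return contiguous page sequences seen in at least `min_support` sessions."""
    counts = Counter()
    for session in sessions:
        # A set counts each sequence once per session, so a single user
        # looping through the same pages cannot dominate the statistics.
        windows = {tuple(session[i:i + length])
                   for i in range(len(session) - length + 1)}
        counts.update(windows)
    return {flow: n for flow, n in counts.items() if n >= min_support}


core_flows = frequent_flows(sessions, length=3, min_support=3)
for flow, n in core_flows.items():
    print(" -> ".join(flow), f"(seen in {n} sessions)")
```

Each flow that survives the threshold is a candidate test case. Rarely used paths, like the visit to the help page, are filtered out.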

Autonomous building of test cases by analyzing the code of the application under test. Bots build a software map by exploring each path through the app and then use that to create a set of use cases.
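Once a software map exists, turning it into use cases is essentially a graph traversal. The sketch below assumes the map has already been discovered and is stored as a dictionary of screens and the screens reachable from each one. The screen names are invented, and real bots also have to discover the map itself, which is the hard part and is not shown here.

```python
# Hypothetical software map: screen -> screens reachable from it.
APP_MAP = {
    "login": ["dashboard"],
    "dashboard": ["settings", "reports"],
    "settings": ["dashboard"],
    "reports": [],
}


def enumerate_paths(app_map, start):
    """List every path from `start` that never visits the same screen twice."""
    paths = []

    def walk(screen, path):
        # Skip screens already on this path so cycles (settings -> dashboard)
        # do not cause infinite recursion.
        next_screens = [s for s in app_map.get(screen, []) if s not in path]
        if not next_screens:
            paths.append(path)  # dead end: the path is a complete use case
            return
        for nxt in next_screens:
            walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


use_cases = enumerate_paths(APP_MAP, "login")
for path in use_cases:
    print(" -> ".join(path))
```

The number of paths grows very quickly with the size of the app, so a practical tool would need some way to prioritize which ones to run.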

Visual Testing. This form of testing evaluates the visible output of an application and compares it to the results expected from the design. Some may think this is a task better performed by humans, but it’s very time-consuming and we often miss things that a computer would not. Humans pay less attention to things they see many times a day and can miss even obvious differences (think of a “spot the 10 differences between two pictures” puzzle). Also, humans can’t realistically look at all screens multiple times a day and notice every difference. Automated visual testing allows us to test the whole UI, repeatedly. In addition, ML algorithms help decrease the number of false positives by identifying changes that do not impact the user experience and safely ignoring them.
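The false-positive problem is easy to see in a toy comparison. The sketch below is not ML: a fixed brightness tolerance and a set of ignored pixels stand in for the learned notion of "changes that do not impact the user experience". Images are tiny grayscale grids, and every value in them is invented for the example.

```python
def changed_ratio(baseline, current, tolerance=10, ignore=frozenset()):
    """Share of compared pixels whose brightness differs by more than `tolerance`.

    `ignore` holds (x, y) positions to skip, e.g. a region with dynamic content.
    """
    if len(baseline) != len(current) or any(
            len(a) != len(b) for a, b in zip(baseline, current)):
        raise ValueError("images must have the same dimensions")
    compared = changed = 0
    for y, (row_a, row_b) in enumerate(zip(baseline, current)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if (x, y) in ignore:
                continue
            compared += 1
            if abs(a - b) > tolerance:
                changed += 1
    return changed / compared if compared else 0.0


baseline = [[200, 200, 200],
            [200, 200, 200],
            [200, 200, 200]]
current = [[200, 205, 200],   # slight rendering noise at (1, 0)
           [200, 200, 200],
           [200, 200, 40]]    # a real visual change at (2, 2)

strict = changed_ratio(baseline, current, tolerance=0)      # flags both pixels
tolerant = changed_ratio(baseline, current)                 # flags only (2, 2)
masked = changed_ratio(baseline, current, ignore={(2, 2)})  # flags nothing
print(strict, tolerant, masked)
```

An exact pixel comparison reports the rendering noise as a failure, while the tolerant one reports only the real change. Choosing the tolerance and the ignored regions by hand does not scale, which is where ML-based tools can take over.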

Audio/Video Quality Testing. This form of testing used to be a very manual task that required listening to the audio or watching the video and giving it a subjective score. Today, AI can collect multiple data points about audio/video and how their variations impact human perception. These tools can perform this testing faster based on that data, make it more objective, and ignore variations that are not important for humans.


Iryna Suprun

Written by

I started testing in 2007, cannot stop since then. The software hates me and never works as expected, so I guess I was born to be a QA.


Our latest thoughts, challenges, triumphs, try-again’s, most snarky and profound commit messages. Our proudest achievements, deepest darkest technical debt regrets (just kidding, maybe). All the humbling yet informative things you learn when you try to do things with computers.