Enhancing Automated Testing with test.ai

A look at the good, the bad, and the ugly of state-of-the-art testing frameworks.

Zaheen Ahmed

Published in

Equinox Media Tech

7 min read · Apr 28, 2021

*Photo by:* *ThisIsEngineering RAEng on Unsplash*

Introduction to Automation and Appium

Test automation has one primary objective: make machines test your app so people don’t have to. When you make a mobile app, you ideally want to make sure every aspect of it works perfectly on every conceivable device your app will run on. The obvious obstacle here is that your manual testers probably don’t have the time to crawl through your app on every device, and even if they did, who wants to repeat the exact same monotonous test steps on a hundred different devices?

Enter automation: a collection of test scripts that execute your test steps automatically. Two examples of test automation tools are Appium and test.ai. Appium has been a mainstay in the mobile automation scene for a long time, and there are good reasons for it not falling out of favor. On the other hand, test.ai is a relatively new player in the test automation tool scene that is gaining popularity.

Which tool you use depends not only on the skill of your test automation engineers, but also on the software you are trying to test. Appium allows programmers to quickly and easily write scripts that interact with, and verify elements on, mobile apps. And as a bonus, the same script (generally) works for all devices. Traditional automation tools like Appium and Selenium are lightning fast in execution, and excel at automating apps that are heavily text-based.

Unfortunately, despite how powerful Appium is, it still has some inconveniences that can cause headaches for automation engineers.

Where Appium Falls Short

One problem with Appium is maintaining your scripts after creating them. Appium detects elements in the application through accessibility IDs and XPaths, which can basically be thought of as “address” strings that point to the location of an element in the app’s DOM. Most of the time these locator strings are easy to manage, but occasionally you’ll be forced to use something like this:

//android.view.ViewGroup[@focusable='true'][parent::*/parent::*/following-sibling::*//*[@text='Filter']]

That is an actual XPath used in our automation script to find the in-app back button at the top left of one of the screens. In many other instances just like this, there is no easy way to detect an element (e.g., with an accessibility ID), so these complicated XPath strings become necessary. The problem with such solutions is that they break whenever an update changes the screen’s UI, causing the test scripts to fail until an automation engineer (read: me) updates the XPath, which only postpones the problem until the next UI change.
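To make that brittleness concrete, here is a minimal sketch using only Python’s standard library. The two view hierarchies are hypothetical and heavily simplified (real Appium page sources use fully qualified class names and many more attributes), and the `locate` helper is my own illustration, not part of Appium. It compares a locator that depends on the element’s position in the tree with one that depends on a stable identifier.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified view hierarchies for two builds of the same screen.
# Build 2 adds a promo banner above the toolbar; the back button is unchanged.
BUILD_1 = """
<Screen>
  <ViewGroup>
    <ImageButton content-desc="Back" focusable="true"/>
    <TextView text="Filter"/>
  </ViewGroup>
</Screen>
"""

BUILD_2 = """
<Screen>
  <ViewGroup>
    <TextView text="Summer promo"/>
  </ViewGroup>
  <ViewGroup>
    <ImageButton content-desc="Back" focusable="true"/>
    <TextView text="Filter"/>
  </ViewGroup>
</Screen>
"""

# Structural locator: depends on where the button sits in the tree.
STRUCTURAL = "./ViewGroup[1]/ImageButton[1]"
# Identifier locator: depends only on a stable attribute of the button.
BY_IDENTIFIER = ".//ImageButton[@content-desc='Back']"


def locate(page_source: str, path: str):
    """Return the first element matching path, or None if nothing matches."""
    return ET.fromstring(page_source.strip()).find(path)


if __name__ == "__main__":
    for name, source in (("build 1", BUILD_1), ("build 2", BUILD_2)):
        structural_hit = locate(source, STRUCTURAL) is not None
        identifier_hit = locate(source, BY_IDENTIFIER) is not None
        print(f"{name}: structural={structural_hit}, identifier={identifier_hit}")
```

In this toy example the structural locator stops matching in build 2 even though the button itself never changed, while the identifier-based locator keeps working. The catch, as noted above, is that a stable identifier is not always available.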

Of course, since Appium is so commonly used in the industry, you would expect that many intrepid engineering teams have attempted to alleviate this issue, and you would be correct. In this post, we will look at one such tool: test.ai. The goal of the developers behind test.ai is not only to make automated testing easier by solving issues like this, but also to leverage the power of AI and machine learning to add more powerful features to your automation suite.

What test.ai Does Differently

test.ai, in simple terms, is an automation solution similar to Appium. Under the hood of its fancy fixes and features, it still uses the same core technologies Appium uses to control the mobile device. The main feature that distinguishes it from standard Appium is a powerful image recognition machine learning model that is trained on a multitude of apps and can efficiently and accurately detect elements on the screen using images instead of XPaths. One benefit of this approach is that the model can detect elements in the app regardless of the screen dimensions or element positioning. We’ve experienced this feature firsthand, as we’ve used test.ai to automate our apps both on conventional mobile devices and on our much larger proprietary exercise bike tablets.

test.ai’s user flow follows three basic steps: Labeling, Test Creation, and Test Execution and Reporting.

In the Labeling step, you simply start up test.ai and connect it to your mobile app, and then on test.ai’s user interface, label every element on every app screen you are interested in. These labels replace the original IDs and XPaths you would use in Appium. The beauty of this labeling process is that once the image recognition model knows how an element looks, it will be able to find it on the page, even if its position on screen or the DOM changes. In practice, this means no more painstakingly updating XPaths every time a new build comes out.
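To illustrate why appearance-based lookup survives layout changes, here is a toy sketch. To be clear, this is not how test.ai works internally: test.ai uses a trained image recognition model, whereas this sketch swaps in naive exact template matching on a tiny grid of characters standing in for pixels. The names `find_template`, `LABELS`, `OLD_SCREEN`, and `NEW_SCREEN` are all hypothetical.

```python
from typing import List, Optional, Tuple

Grid = List[str]  # each string is one row of "pixels"


def find_template(screen: Grid, template: Grid) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the top-left corner of the first exact match, or None."""
    t_rows, t_cols = len(template), len(template[0])
    for r in range(len(screen) - t_rows + 1):
        for c in range(len(screen[0]) - t_cols + 1):
            if all(screen[r + i][c:c + t_cols] == template[i] for i in range(t_rows)):
                return (r, c)
    return None


# Hypothetical label registry: label name -> what the element looks like.
LABELS = {"back_button": ["<=", "<="]}

# The same button drawn in two different places on two builds of a screen.
OLD_SCREEN = [
    "<=......",
    "<=......",
    "........",
]
NEW_SCREEN = [
    "........",
    "....<=..",
    "....<=..",
]

if __name__ == "__main__":
    for name, screen in (("old", OLD_SCREEN), ("new", NEW_SCREEN)):
        print(name, find_template(screen, LABELS["back_button"]))
```

The lookup is keyed on what the button looks like, so moving it elsewhere on the screen does not require touching the label. A real model is also expected to tolerate scaling and small visual differences, which exact matching like this cannot do.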

Then, in the Test Creation step, you use test.ai’s UI to list all the steps your test will take, such as tapping on elements, entering some text, verifying an element is present on screen, and so on. The great thing about this test creation process is that it can be fully scriptless, allowing even those without programming or automation experience to write and maintain tests. Furthermore, while the basic flow is accessible to anyone regardless of programming skill, test.ai does allow you to inject scripts into the tests if necessary (for example, to make API calls or fetch some data from your database).

Finally, all that is left is to execute the tests, in the Test Execution and Reporting step. The tests will be executed step by step, using the labels you assigned in the labeling phase to find and interact with elements on the page. After the tests are complete, you can view the reports in the UI. The reports contain screenshots of the app screen every time a test step is executed, with a highlight around the specific element that was interacted with in that step.
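For readers who think in code, the flow above can be modeled roughly as follows. This is only a sketch of the idea (steps that reference labels, executed in order, with a per-step report); it is not test.ai’s actual test format or API. The `Step`, `StepResult`, and `run_test` names are hypothetical, and the “screen” is just a fixed set of labels rather than a real device, so it never changes between steps.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Step:
    action: str                 # e.g. "tap", "enter_text", "verify_present"
    label: str                  # label assigned during the labeling phase
    text: Optional[str] = None  # only used by "enter_text"


@dataclass
class StepResult:
    step: Step
    passed: bool
    detail: str


def run_test(steps: List[Step], screen_labels: Set[str]) -> List[StepResult]:
    """Run steps in order against a fake screen; stop at the first failure."""
    results: List[StepResult] = []
    for step in steps:
        found = step.label in screen_labels
        detail = "ok" if found else f"no element labeled {step.label!r} on screen"
        results.append(StepResult(step, found, detail))
        if not found:
            break  # later steps usually depend on earlier ones
    return results


# Hypothetical login test against a fake screen that lacks "home_banner".
SCREEN = {"email_field", "password_field", "login_button"}
STEPS = [
    Step("enter_text", "email_field", "user@example.com"),
    Step("enter_text", "password_field", "hunter2"),
    Step("tap", "login_button"),
    Step("verify_present", "home_banner"),
]

if __name__ == "__main__":
    for result in run_test(STEPS, SCREEN):
        status = "PASS" if result.passed else "FAIL"
        print(f"{status} {result.step.action} {result.step.label}: {result.detail}")
```

The point of the sketch is that each step names a label rather than an XPath, and each step produces its own report entry, which loosely mirrors the per-step screenshots described above.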

All in all, this user flow makes test creation, execution, and maintenance extremely simple, removes the pain points of using barebones Appium, and on top of that adds high quality image recognition to expand your testing possibilities. Remember that long, ugly XPath I used to find the back button before? With test.ai, that process became as simple as labeling the button something like “back_button”, and that was it: test.ai could now detect that button on any page, and interact with it during test execution.

Not All Good Things

Up until now I’ve been singing test.ai’s praises, but to convince you I’m not just a shill trying to sell their product, I will mention the gripes I’ve had with using test.ai. To be fair to them, while some of these downsides are unavoidable side effects that come hand in hand with the benefits, others can be mitigated with some persistent dev work. Time to call it as I see it:

The Setup: the setup process to install test.ai on your machine is a bit too convoluted for my liking. You have to download their executable, open one terminal instance to go a couple of directories deep into the executable and run one command, open another terminal to do the same thing and run a different command, do a one-time download of some external scripts, put those in your local user binaries folder, and give them appropriate permissions, before you can finally access the UI and start using the product. To the developers’ credit, they have been very quick to respond to feedback and fix issues, and they have stated that one of their goals for the near future is to streamline this process and make everything doable through the UI.

The Labeling: this one is unavoidable due to how test.ai works, but I’d be remiss not to include it. The initial labeling phase takes a ton of time, because you have to go through every page of your app that you want to test, and label every element you want to verify or interact with. The bright side is that after you’ve completed this step once, you will rarely need to relabel anything (only if an element’s appearance changes due to an app update), greatly cutting down the maintenance needed for your test scripts.

The Performance: here is the big one. I’m sure many of you realized this already, but there’s no way a service that runs an image recognition model every time the screen changes can possibly match the performance of pure Appium. And it doesn’t: the overhead added by this process is rather significant, resulting in anywhere from a 3- to 5-fold increase in test runtime. However, there is hope: as mentioned before, test.ai allows you to bypass their UI-based test creation process by injecting your own scripts into the test. Within such a script, test.ai actually allows you to attach an Appium driver to the currently running automation session, so you can get the best of both worlds: test.ai’s normal features let you overcome Appium’s issues with XPaths by using image-based element location, while Appium’s speed can be used in any test that doesn’t have those issues and doesn’t need test.ai’s features.
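As a rough back-of-the-envelope model of that trade-off, the sketch below estimates suite runtime when only part of the suite goes through image-based location. It rests on assumptions that are mine, not test.ai’s: the 3- to 5-times figure is read as a multiplier on the time of image-based steps, that multiplier is uniform across steps, and Appium-driven steps run at baseline speed. The `estimate_runtime` function and its parameters are hypothetical.

```python
def estimate_runtime(baseline_minutes: float, image_fraction: float, slowdown: float) -> float:
    """Estimate suite runtime in minutes.

    baseline_minutes: runtime if every step ran through plain Appium.
    image_fraction:   share of that runtime spent in steps that use
                      image-based location (0.0 to 1.0).
    slowdown:         assumed runtime multiplier for image-based steps.
    """
    if not 0.0 <= image_fraction <= 1.0:
        raise ValueError("image_fraction must be between 0 and 1")
    if slowdown < 1.0:
        raise ValueError("slowdown must be at least 1")
    appium_part = baseline_minutes * (1.0 - image_fraction)
    image_part = baseline_minutes * image_fraction * slowdown
    return appium_part + image_part


if __name__ == "__main__":
    # Compare an all-image suite with a hybrid where 20% of the work needs images.
    for fraction in (1.0, 0.2):
        low = estimate_runtime(10, fraction, 3)
        high = estimate_runtime(10, fraction, 5)
        print(f"image fraction {fraction:.0%}: about {low:.0f} to {high:.0f} minutes")
```

Under these assumptions, a suite that takes 10 minutes in plain Appium would take roughly 30 to 50 minutes if every step used image recognition, but only about 14 to 18 minutes if a fifth of the work did. Actual numbers will depend on your app and devices.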

In Conclusion

Ultimately, despite the flaws noted, test.ai is a fantastic tool to augment and improve your test automation framework. It allows testers to easily create and run tests using a UI-based labeling, test creation, and test reporting solution, while also retaining the power and flexibility of being able to use scripts. Our mobile app focuses heavily on video content, so test.ai is a natural fit, but we can still use Appium for the more static elements in the app. Because, at the end of the day, if you still prefer standard, no-strings-attached Appium, nothing is stopping you from using it alongside test.ai to get the best of both worlds.

If this post has piqued your interest, test.ai does offer a demo plan that you can play around with on their website. Trust me, I’m not getting paid by them; I genuinely believe it is a product worth investing in for a stronger and more robust automation solution for your app. If you don’t want to take my word for it, try it out yourself.