Test Automation Demystified, Part 4: Friends and Foes of Software Test Automation

Alexey Grinevich
12 min read · Jun 25, 2019


Friend (left), Foe (right)

Automation is fragile. A single simple development decision may seriously harm automation efficiency.

A good automation engineer is good because they keep many workarounds in mind for solving specific problems. This knowledge helps them know in advance what to catch, where to look, and what to avoid.

Friends

Some features were not designed for UI automation but are useful for it anyway. We call these our Friends because we know how to use them for our purposes. Just like this cat knows how to use the door crack:

Door crack is a cat’s friend :-)

Keyboard

We may have different opinions about Windows. One fact that makes it a friend of automation is that its GUI is keyboard-friendly. If your mouse is missing or broken, you can do almost everything with just a keyboard. All you need is to know how.

Cursor keys. Page Up/Down. Menu shortcuts. The more such options the application has, the better.

We recommend putting your mouse away and trying to work with the Application Under Test (AUT) using the keyboard only, to see what you can do.

Clipboard

The clipboard is a way to grab text from the application. It is controlled with a set of well-known keyboard shortcuts. For example, here they are for PowerPoint:

Shortcuts

Local and global shortcuts are the simplest way to access functionality.

For example, Ctrl+A: select all. Together with the clipboard, it makes it possible to grab a text value.

Tab, Shift+Tab, Ctrl+Tab: transfer focus between controls.

Space: make a selection (for checkboxes and accept buttons).

Enter: accept input, select the current item, expand/collapse.

Backspace: go back.

For menu shortcuts, it is common for the application to display hints when the ‘Alt’ key is pressed:

1. Press and release Alt. This shows the hints.

2. If we need ‘Folder’, we press ‘F’.

3. Now we press A, then F, for ‘Add File’.

So the full sequence is Alt, F, A, F.

Date Picker Shortcuts

There are known date picker and calendar shortcuts that are more or less common.

For example, this one has the following shortcuts defined:

T: today

Left, Right: select day (previous, next)

Up, Down: select week (previous, next)

M: next month, Shift+M: previous month

Y: next year, Shift+Y: previous year

D: next decade, Shift+D: previous decade (±10 years)

Sometimes date pickers have no keyboard shortcuts, such as:

So there is no guarantee that the keyboard will work, but it is always worth knowing whether it is there. In this particular case it is not a problem: the picker is connected to an input field, and we may type the date range directly.

APIs

REST

A good example is the x3270 terminal emulator.

http://x3270.bgp.nu/

Like many other terminals, it is not friendly to UI automation. However, we found that it has a REST interface and is fully controllable this way.

http://localhost:13270/3270/interact.html

For example, this moves the cursor to position 10, 10:

http://localhost:13270/3270/rest/text/MoveCursor(10,10)

And this dumps the text contents of the screen:

http://localhost:13270/3270/rest/text/PrintText(string)

So, thanks to the REST interface, this terminal is fully controllable, and we can interact with it to implement tests. We used this feature to implement AS/400 support in Rapise.
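As a rough sketch of how a test could drive the emulator, the snippet below builds the same REST URLs shown above and sends them with Python's standard library. The base URL and the two action names come from the examples in this article. The helper names `action_url` and `run_action` are made up for illustration, and the port depends on how the emulator was started.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL as shown in the article; adjust the port to your emulator setup.
BASE_URL = "http://localhost:13270/3270/rest/text/"


def action_url(action, *args):
    """Build the REST URL for an emulator action such as MoveCursor(10,10)."""
    call = f"{action}({','.join(str(a) for a in args)})"
    # Keep parentheses and commas readable; percent-encode anything else unsafe.
    return BASE_URL + quote(call, safe="(),")


def run_action(action, *args, timeout=5.0):
    """Send the action to a running emulator and return the response body as text."""
    with urlopen(action_url(action, *args), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


print(action_url("MoveCursor", 10, 10))
print(action_url("PrintText", "string"))
```

Only `action_url` runs here. `run_action` needs a live emulator with the REST interface enabled, so it is defined but not called.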

COM/ActiveX

See ‘UsingExcel’ and ‘UsingMSWord’ examples (https://rapisedoc.inflectra.com/Guide/sample_tests/#usingmsaccess-usingmsexcel-usingmsword). If you have Word or Excel installed, you can simply open a sample from the start page and run it.

Both Word and Excel have a very advanced COM API (e.g. https://docs.microsoft.com/en-us/office/vba/api/excel.sheets). You can launch an application, read and write any cell, and work with workbooks, sheets, ranges, comments, and formulas, in both visible and hidden mode.

Many other products have COM interfaces, and it is worth knowing about them.

Direct Database Access

Sometimes the simplest way to accomplish a task is to check the database. So database access is always good to have.

Rapise has a built-in Database object to read values.

Also, with Rapise scripting you may use the universal Microsoft ADO (ActiveX Data Objects). You may use it to read data from a .csv file as well as to connect to MS SQL, Oracle, and any ODBC data source.

https://docs.microsoft.com/en-us/sql/ado/guide/ado-programmer-s-guide

Rapise ships with a ‘UsingDatabase’ sample (https://rapisedoc.inflectra.com/Guide/sample_tests/#usingdatabase), so you may use it as a reference.
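To illustrate the idea of verifying a result directly in the database, here is a minimal sketch in Python. It is not the Rapise Database object or ADO. It uses the standard `sqlite3` module with an in-memory database as a stand-in, and the `orders` table and its contents are invented for the example.

```python
import sqlite3


def fetch_value(conn, query, params=()):
    """Run a query and return the first column of the first row, or None."""
    row = conn.execute(query, params).fetchone()
    return row[0] if row else None


# Stand-in for the application database: an in-memory SQLite database
# with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, status TEXT)")
conn.execute("INSERT INTO orders (customer, status) VALUES (?, ?)", ("Alice", "SHIPPED"))
conn.commit()

# After the UI steps are done, check the outcome directly in the database.
status = fetch_value(conn, "SELECT status FROM orders WHERE customer = ?", ("Alice",))
print(status)
```

In a real project the connection would point at the application's own database, and the query would target whatever the UI scenario was supposed to change.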

Command Line Keys

Some applications have a command-line mode or at least a command-line interface. Command-line keys may make your life easier:

1. Disable updates.

2. Use clean profiles (browser)

3. Incognito mode

4. Pass login credentials

5. Change operation

For instance, Google Chrome has plenty of command-line switches: https://www.ghacks.net/2013/10/06/list-useful-google-chrome-command-line-switches/

So, for example, we may launch a Chrome window in incognito mode, at position 100,100 and with a window size of 320,960, as follows:

chrome.exe --incognito --window-position=100,100 --window-size=320,960
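A test script can assemble the same command line programmatically. The sketch below uses only the three switches from the example above. The helper names `chrome_command` and `launch` are made up, and the executable name assumes that chrome.exe can be found on the PATH.

```python
import subprocess


def chrome_command(executable, position, size, incognito=True):
    """Build the argument list for launching Chrome with a fixed window geometry."""
    args = [executable]
    if incognito:
        args.append("--incognito")
    args.append(f"--window-position={position[0]},{position[1]}")
    args.append(f"--window-size={size[0]},{size[1]}")
    return args


def launch(command):
    """Start the browser without waiting for it to exit."""
    return subprocess.Popen(command)


command = chrome_command("chrome.exe", (100, 100), (320, 960))
print(" ".join(command))
```

Passing the arguments as a list rather than as one string avoids quoting problems when a path or value contains spaces.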

Auth in URL

Entering authentication details may be a problem. However, it is sometimes possible to bypass the auth prompt by passing the auth data as part of the URL:

https://username:password@example.com/login.html
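If the credentials contain characters such as ‘@’ or ‘:’, they have to be percent-encoded before they go into the URL. A minimal sketch, with the hypothetical helper name `with_credentials`, might look like this. Whether the resulting URL is accepted still depends on the browser and on how the server authenticates.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(url, username, password):
    """Return the URL with username:password@ inserted before the host."""
    parts = urlsplit(url)
    # Percent-encode credentials so characters like '@' or ':' do not break the URL.
    userinfo = f"{quote(username, safe='')}:{quote(password, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(with_credentials("https://example.com/login.html", "username", "password"))
```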

URL parameters

Sometimes tweaking a URL helps to change application behavior. For example, the Google Apps login is language-specific. If we just go to the URL:

https://accounts.google.com/

we get a prompt in the locale-specific language:

However, if we add the parameter hl=en, it becomes English:

So you may change the language just by changing the URL.
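Adding or replacing a query parameter is easy to get wrong by string concatenation, for instance when the URL already has a query string. Here is a small sketch using Python's standard library. The helper name `set_query_param` is made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_query_param(url, name, value):
    """Return the URL with the given query parameter added or replaced."""
    parts = urlsplit(url)
    # Drop any existing occurrence of the parameter, then append the new value.
    params = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != name]
    params.append((name, value))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment))


print(set_query_param("https://accounts.google.com/", "hl", "en"))
```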

Entry Point URL

There is always a UI path to open a specific item. For example, in Google Apps you may log in, go to documents, find the required document, and click on it.

Usually it is possible to just save a link to a specific document. You will still be asked to log in, but then you will be redirected to the required page.

In Dynamics 365 CRM you may also log in to the dashboard and then navigate to the required page from the ‘main’ page:

Or you may go there directly if you navigate to the right entry point URL:

https://inflectra3650.crm.dynamics.com/main.aspx?appid=d6dec168-cb94-e911-a9c2-000d3a33bcb9&pagetype=entitylist&etn=contact

You will still need to log in, but then you will land immediately at the destination.

Input Files

For applications that work with documents, it is common to accept a document path as the default command-line argument.

For example:

write.exe myfile.rtf

opens the file without going through File->Open.

Configuration Files

It is always worth knowing about them. For example, consider Unified Service Desk (USD). Let’s look at what sits next to the application executable:

Let’s go to the folder and look around. We see a .config file:

It is our friend. It may be useful later; let’s note it and move on for now.

Sync Items (status bars, busy cursor state, icon, progress bar)

Let’s look at a typical situation: search in Windows Explorer. Suppose we need to make sure that the search returns the expected number of items. So we need to know when the search ends. Our friend in this case is the progress bar. The search proceeds while the green part is filling the space:

For now, it is important to know that there is some indication of the operation. In this case the progress bar is not the best way to track progress, so we will return to this example in the ‘Foes’ section and find better friends.

Macros, Snippets, Functions

Some applications have internal scripting capabilities. Microsoft products have VBA support; Google Apps have built-in scripting and a script editor:

Scripting is always good for us. It means that the product has some API (scripts need an API), and we may reuse it, or even reuse whole scripts, for our purposes.

Logs

Logs are text files. Logs are easy to check. Logs may contain output information. So if the application has logs, you had better know about them. For example, you may find out that an exception happened even if the automation script itself did not capture it.

In addition to validation, a log may be a source of synchronization information. If there is async behavior, the log sometimes tells us when an operation ends.
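Both uses of a log can be sketched in a few lines of Python. The log format, the error keywords, and the ‘Import completed’ marker below are all assumptions made for the example. A real application will have its own conventions.

```python
import re

# Assumed convention: problem lines contain "ERROR" or "Exception".
ERROR_PATTERN = re.compile(r"ERROR|Exception")


def find_errors(log_text):
    """Return the log lines that look like errors or exceptions."""
    return [line for line in log_text.splitlines() if ERROR_PATTERN.search(line)]


def operation_finished(log_text, marker="Import completed"):
    """Return True once the (hypothetical) completion marker appears in the log."""
    return marker in log_text


# Made-up sample log, standing in for the contents of a real log file.
sample_log = """\
2019-06-25 10:00:01 INFO  Import started
2019-06-25 10:00:04 ERROR NullReferenceException in RowParser
2019-06-25 10:00:09 INFO  Import completed
"""

print(find_errors(sample_log))
print(operation_finished(sample_log))
```

A test could call `find_errors` at the end of a scenario to fail on unexpected exceptions, and poll `operation_finished` while waiting for an asynchronous operation.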

Foes

Now we are switching to the dark side. This is only a subset; every automation effort brings its own unique problems. But the approach we use to treat them is always the same.

Captchas

Well, for Google reCAPTCHA v2 there is a workaround designed for testing:

https://developers.google.com/recaptcha/docs/faq#id-like-to-run-automated-tests-with-recaptcha-v2-what-should-i-do

That is, there is a predefined pair of test keys that lets automation through:

For reCAPTCHA v2, use the following test keys. You will always get No CAPTCHA and all verification requests will pass.

· Site key: 6LeIxAcTAAAAAJcZVRqyHh71UMIEGNQ_MXjiZKhI

· Secret key: 6LeIxAcTAAAAAGG-vFI1TnRWxMZNFuojJ4WifJWe

That is, you should have a special (staging) version of the site with the site key set as specified here. In this case reCAPTCHA will let your tests through. You need to pass this information to the development team. The right site key is the way to enable automation on the staging instance.

Delays, Async Behavior (button event)

Let’s return to the example with the search in Windows Explorer:

If we use a spy tool to look at the progress bar, we will find that it does not expose its state via any API. A closer look reveals better friends: the Refresh button, the Close Search button, and the status bar.

‘Close Search’ appears only while the search is in progress:

And then it is replaced by Refresh:
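A test can therefore wait for the search to finish by polling for the moment when ‘Close Search’ disappears. The sketch below shows a generic polling helper. The `FakeSearch` class is a made-up stand-in for the real UI query, which would go through your automation tool.

```python
import time


def wait_until(condition, timeout=30.0, interval=0.5):
    """Poll condition() until it returns True or the timeout expires.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeSearch:
    """Hypothetical stand-in: the button stays visible for a few polls."""

    def __init__(self, polls_until_done):
        self.remaining = polls_until_done

    def close_button_visible(self):
        self.remaining -= 1
        return self.remaining > 0


search = FakeSearch(polls_until_done=3)
finished = wait_until(lambda: not search.close_button_visible(), timeout=5.0, interval=0.01)
print(finished)
```

An explicit timeout keeps a hung search from blocking the whole test run, and the boolean result lets the test report a clear failure instead.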

Obfuscation

The goal of obfuscation is to make files shorter and also harder to understand. Class names and identifiers are renamed to less readable ones.

One more problem with obfuscation is instability: identifiers and tokens change completely. A minor build may produce major changes in class names. So we need to use resilient XPath locators (for example, see this talk https://www.youtube.com/watch?v=BWqz9t00gMA to learn more about XPath locators).
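The difference between a brittle and a resilient locator can be shown on two made-up builds of the same page. The markup, class names, and `aria-label` value below are invented for the example. Python's `xml.etree.ElementTree` supports only a subset of XPath, which is enough for this sketch.

```python
import xml.etree.ElementTree as ET

# Two builds of the same (made-up) page: the obfuscated class name differs,
# while the aria-label attribute stays the same.
BUILD_1 = "<div><button class='a1x' aria-label='Save'>Save</button></div>"
BUILD_2 = "<div><button class='q9z' aria-label='Save'>Save</button></div>"

BRITTLE = ".//button[@class='a1x']"          # tied to an obfuscated class name
RESILIENT = ".//button[@aria-label='Save']"  # tied to a stable, user-facing attribute


def found(markup, xpath):
    """Return True if the locator matches at least one element."""
    return ET.fromstring(markup).find(xpath) is not None


for name, build in (("build 1", BUILD_1), ("build 2", BUILD_2)):
    print(name, "brittle:", found(build, BRITTLE), "resilient:", found(build, RESILIENT))
```

The brittle locator stops matching as soon as the class name changes, while the attribute-based one keeps working across both builds.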

Canvas

An HTML canvas is like a picture, and that is a problem for UI testing.

For example, a grid control like this:

https://fin-hypergrid.github.io/core/demo/

is not visible to UI automation: the whole grid, with all its contents and controls, is a single solid block. Another well-known example of such a grid is Google Sheets:

https://docs.google.com/spreadsheets/d/1uQi9ZIatxNa8iqJV5vU3D3z8jL1AsAt6pZ7zR5vbx0U/edit?usp=sharing

And here we should remember our friends. The first is the keyboard. We may access all the actual data by using the Ctrl+A (select all) and Ctrl+C (copy) shortcuts. If the data looks like this (i.e. it is a dense rectangle of values):

then ‘Select All’ selects the actual data, not the whole sheet:

So if we copy it, the clipboard will contain the actual data:
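Spreadsheet applications typically put copied cells on the clipboard as tab-separated text, one row per line. Assuming that format, a test can turn the clipboard contents into rows for verification. The sample text below is made up, and reading the clipboard itself is left to whatever your automation tool provides.

```python
import csv
import io


def parse_clipboard_table(text):
    """Split tab-separated clipboard text into a list of rows (lists of cell strings)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row]


# Made-up clipboard contents standing in for a copied sheet range.
clipboard_text = "Name\tQty\tPrice\nApple\t3\t1.20\nPear\t5\t0.80\n"

rows = parse_clipboard_table(clipboard_text)
print(rows[0])
print(rows[1:])
```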

Another friend is the REST API. All Google apps have it.

https://developers.google.com/sheets/api/

So we may read, write, and export, i.e. access the spreadsheet data. It requires some preparation (setting up OAuth credentials and getting API keys). But the preparation is a one-time pain, and after that you get a lot of benefits.

Long Scenarios

Do you need to have a long UI test scenario? 30 minutes? 50 minutes?

Let me try to convince you to avoid it. What if it turns out to be flaky? What if the full scenario takes 30 minutes and it intermittently fails near minute 15?

Sometimes we need 3, 5, or 10 iterations to fix something. In the worst case, when we polish something, we try it 20 or 50 times. Worst case: 50 times × 15 minutes = 750 minutes = 12.5 hours to nail down an issue.

What if we split this test case into 10 items of 4 minutes each (so it is longer in total due to some common logic)? We may figure out which item intermittently fails. Suppose it is item #5. Since it takes 4 minutes and we make 50 attempts, we spend 200 minutes, or about 3.3 hours, in the worst case to nail the same issue down. Usually, the shorter the case, the fewer attempts are needed. It saves us more than 9 hours of hard work.
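The arithmetic behind this comparison is easy to replay for your own numbers. A small sketch, using the attempt count and durations from the example above (the function name `debug_hours` is made up):

```python
def debug_hours(attempts, minutes_per_attempt):
    """Worst-case time spent re-running a test, in hours."""
    return attempts * minutes_per_attempt / 60


long_case = debug_hours(attempts=50, minutes_per_attempt=15)   # failure near minute 15
short_case = debug_hours(attempts=50, minutes_per_attempt=4)   # one 4-minute item

print(f"long scenario:  {long_case:.1f} h")
print(f"short scenario: {short_case:.1f} h")
print(f"saved:          {long_case - short_case:.1f} h")
```

The saving grows linearly with the number of attempts, so flaky tests benefit most from being split.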

Another potential scenario: suppose we need to enter data for 50 patients to be able to test a search function. Entering the data is an essential part of the test case. But if we do it through the UI, it may take, say, 30 seconds per patient, and thus we spend 25 minutes just initializing the test. This is where it is definitely worth finding API- or DB-level entry points. Entering data like that through an API can usually be done in a matter of seconds.

One more use case that affects execution time is parallel execution. You may need to measure performance for tens or hundreds of users entering data in parallel. If you try to accomplish this by running many browsers (Selenium Grid), you will need significant hardware resources and effort to get it all up and running. Meanwhile, there are well-known load testing tools (Rapise integrates with NeoLoad) designed for exactly this: they run for hours, simulate thousands of users, and measure performance metrics and bottlenecks with minimal hardware resources.

CEF

CEF stands for Chromium Embedded Framework. VS Code, for example, is built on Electron, which embeds Chromium in a similar way:

Bad news: more and more applications use CEF. Good news: it may be configurable.

For example, consider USD. In some configurations it is rendered using CEF, and thus its UI is invisible:

Luckily, googling gives us a link to a resource explaining that the rendering engine is configurable:

https://docs.microsoft.com/en-us/dynamics365/customer-engagement/unified-service-desk/chrome-process?view=dynamics-usd-4.1

And here our friend, the ‘.config’ file, may help to fix it:

Summary

Whenever you have a problem with the AUT, you should know that there are always friends around if you know where to look for them.

Sometimes you have to create friends. For example, if a date picker does not respond to the keyboard, you may ask your developers to add this (and refer to available examples where it is supported).

So the more friends you have, the stronger you are, as usual. Friends are all around; we just need to know where to look.

Test Automation Demystified Series

Part 1: From Manual to Automated Software Testing

Part 2: Is Application Ready for Test Automation?

Part 3: Choosing a Test Automation Tool: 8 Features That Matter

Part 4: Friends and Foes of Software Test Automation

Part 5: Codeless Test Automation

Part 6: Scenarios, or Why Some Automation Projects Fail

Part 7: AI in Test Automation
