XPath-finding in Test Automation

Team Merlin
Government Digital Products, Singapore
May 24, 2024

In any test automation, the objective is to simulate user behaviour by interacting with the application's elements and verifying the results. This usually involves locating a specific element in the application and performing an action on it (e.g. clicking the login button). Finally, the outcome is verified by asserting an expected output (e.g. a “Welcome” message is displayed).

Throughout an automated test, there will be countless instances where a specific element needs to be located, either to perform an action on it or to assert that it is displayed correctly. One of the most common and effective ways to identify elements in a web application is XPath (XML Path Language), a query syntax for finding any element in the HTML DOM structure of a webpage.

Image generated with Microsoft Designer Image Creator

XPath Basics

Think of the structure of an HTML webpage as a vast forest with lots of plants and animals. To find a specific tree, you can make use of its position and unique attributes. The basic format for an XPath expression is as follows:
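As a minimal sketch of that format, the snippet below evaluates //tagname[@attribute='value'] against a small page fragment using Python's standard-library ElementTree, which understands a subset of XPath. The fragment and its attribute values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, used only for illustration.
PAGE = """
<html>
  <body>
    <input type="text" name="username"/>
    <input type="password" name="password"/>
    <button type="submit">Login</button>
  </body>
</html>
"""

root = ET.fromstring(PAGE.strip())

# General shape: //tagname[@attribute='value']
# ElementTree evaluates paths relative to an element, hence the leading ".".
matches = root.findall(".//input[@type='password']")

print([el.get("name") for el in matches])
```

Here the tag name is input, the attribute is type and the value is password, so only the password field is returned.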

The tagname can be replaced by a wildcard “*” to cover any tag.

There may be multiple similar tags in one webpage, so the attribute of the tag can be used to identify a particular element. For instance, a header tag with ID attribute “Merlin” can be located using:
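A sketch of that lookup, assuming the header is an h1 element (the page fragment is hypothetical). The same element is found with an explicit tag name and with the “*” wildcard, again using the standard-library ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment; the h1 tag is an assumption for illustration.
PAGE = """
<html>
  <body>
    <h1 id="Merlin">Team Merlin</h1>
    <h1 id="Arthur">Team Arthur</h1>
    <p id="intro">Welcome</p>
  </body>
</html>
"""

root = ET.fromstring(PAGE.strip())

# Explicit tag name: //h1[@id='Merlin']
by_tag = root.findall(".//h1[@id='Merlin']")

# Wildcard tag: //*[@id='Merlin'] matches any tag carrying that ID.
by_wildcard = root.findall(".//*[@id='Merlin']")

print(by_tag[0].text, len(by_wildcard))
```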

Quickly get the XPath of any element

It can be tedious to look through the HTML of a webpage to determine the exact XPath to use. Fortunately, modern browsers can produce the XPath of an element through their built-in element inspector.

Right-click on the chosen element in the browser and select “Inspect (Element)”

The browser developer tools will then open with the element highlighted in the HTML DOM. Right-click the highlighted node and use the copy menu to copy its XPath (for example, Copy > Copy XPath in Chrome).

The copied XPath is usually a relative path, meaning it identifies a unique element by its ID attribute and appends any subpath needed to reach the target from there. An absolute path is used when no element with a unique ID can be found, and it usually begins with “/html/body/<path>”. The advantage of a locator that targets the element directly by a unique attribute, without any structural path, is that it keeps working when the HTML structure changes.
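To illustrate why a path-free locator is more robust, the sketch below evaluates an absolute path and an ID-based locator against two versions of a hypothetical page, before and after a banner is added above the login button. It assumes the third-party lxml package is installed, since absolute paths are outside what the standard-library ElementTree handles well.

```python
from lxml import etree  # third-party: pip install lxml

# Two hypothetical versions of the same page; AFTER gains a banner div.
BEFORE = (
    "<html><body>"
    "<div><button id='login'>Login</button></div>"
    "</body></html>"
)
AFTER = (
    "<html><body>"
    "<div>Banner</div>"
    "<div><button id='login'>Login</button></div>"
    "</body></html>"
)

ABSOLUTE = "/html/body/div[1]/button"  # depends on the exact structure
BY_ID = "//*[@id='login']"             # depends only on the ID attribute


def count(page: str, xpath: str) -> int:
    """Return how many elements the XPath matches in the given page."""
    return len(etree.fromstring(page).xpath(xpath))


for name, page in (("before", BEFORE), ("after", AFTER)):
    print(name, count(page, ABSOLUTE), count(page, BY_ID))
```

After the change, the first div is the banner, so the absolute path no longer reaches the button, while the ID-based locator still does.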

Using “text”, “contains”, and “starts-with”

Sometimes, it is difficult to find a unique attribute that identifies a specific element directly, especially when no ID attribute has been added or the element’s properties change dynamically on every page load. A useful alternative is to use the text in the element directly.
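A sketch of a text-based locator, assuming the third-party lxml package (the text() node test is not part of the XPath subset in Python's standard library). The class names below stand in for auto-generated values and are invented for illustration.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical fragment with unstable, generated class names.
PAGE = """
<html>
  <body>
    <div class="css-1x9a7">Welcome</div>
    <div class="css-8k2bq">Goodbye</div>
  </body>
</html>
"""

root = etree.fromstring(PAGE.strip())

# text() compares against the element's own text content.
hits = root.xpath("//div[text()='Welcome']")

print([el.get("class") for el in hits])
```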

Similarly, contains and starts-with can be used in cases where the text or attribute is dynamic, as long as a certain fixed keyword can be found in either the text or a specific attribute.
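The sketch below shows both functions on a hypothetical fragment whose IDs carry a random numeric suffix. It assumes the third-party lxml package, which implements the XPath 1.0 functions contains() and starts-with().

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical fragment: the numeric ID suffixes change on every page load.
PAGE = """
<html>
  <body>
    <button id="submit-btn-48213">Submit order</button>
    <button id="cancel-btn-77102">Cancel order</button>
    <p>Order #1029 confirmed</p>
  </body>
</html>
"""

root = etree.fromstring(PAGE.strip())

# starts-with on an attribute: match the stable prefix of the ID.
by_prefix = root.xpath("//button[starts-with(@id, 'submit-btn-')]")

# contains on text: match a fixed keyword inside dynamic text.
by_text = root.xpath("//p[contains(text(), 'confirmed')]")

# contains on an attribute, with a wildcard tag.
by_attr = root.xpath("//*[contains(@id, '-btn-')]")

print(by_prefix[0].text, by_text[0].text, len(by_attr))
```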

Dealing with multiple hits

An XPath frequently returns multiple hits. To select the nth result, append an index predicate, replacing n with the position of the desired occurrence (counting from 1):
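One detail worth a sketch is where the index is applied. Wrapping the expression in parentheses indexes the whole result set, while an index directly on the tag counts among siblings under each parent. The example assumes the third-party lxml package and a made-up pair of lists.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical fragment with two lists.
PAGE = """
<html>
  <body>
    <ul id="fruits"><li>Apple</li><li>Banana</li></ul>
    <ul id="veg"><li>Carrot</li><li>Daikon</li></ul>
  </body>
</html>
"""

root = etree.fromstring(PAGE.strip())

all_items = root.xpath("//li")      # every hit
third = root.xpath("(//li)[3]")     # 3rd hit across the whole page
per_parent = root.xpath("//li[2]")  # 2nd li under each parent

print(len(all_items))
print([el.text for el in third])
print([el.text for el in per_parent])
```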

The last() function selects the results that are the last child node of their parent, such as:
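A sketch of last() on a hypothetical pair of lists, assuming the third-party lxml package. The parenthesised form is included to show how to get the single last hit on the page instead.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical fragment with lists of different lengths.
PAGE = """
<html>
  <body>
    <ul id="fruits"><li>Apple</li><li>Banana</li><li>Cherry</li></ul>
    <ul id="veg"><li>Carrot</li><li>Daikon</li></ul>
  </body>
</html>
"""

root = etree.fromstring(PAGE.strip())

# Last li child of each ul.
last_per_list = root.xpath("//ul/li[last()]")

# Last li across the whole result set.
last_overall = root.xpath("(//li)[last()]")

print([el.text for el in last_per_list])
print([el.text for el in last_overall])
```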

The operators “and” and “or” (written in lowercase in XPath) can also be used to filter out irrelevant results, such as:
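A sketch of both operators on a hypothetical login form, assuming the third-party lxml package. Note that XPath only accepts the lowercase spellings.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical login form.
PAGE = """
<html>
  <body>
    <input type="text" name="username"/>
    <input type="password" name="password"/>
    <input type="submit" name="login"/>
    <button type="submit" name="register">Register</button>
  </body>
</html>
"""

root = etree.fromstring(PAGE.strip())

# and: every condition must hold.
both = root.xpath("//input[@type='submit' and @name='login']")

# or: at least one condition must hold.
either = root.xpath("//input[@type='text' or @type='password']")

print([el.get("name") for el in both])
print([el.get("name") for el in either])
```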

The not() function inverts the search to find all elements that do not meet the search criteria, such as:
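A sketch of not() on a hypothetical form, assuming the third-party lxml package. The second expression shows that not() also works for testing whether an attribute is absent.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical form with one disabled field.
PAGE = """
<html>
  <body>
    <input type="text" name="username"/>
    <input type="password" name="password" disabled="disabled"/>
    <input type="submit" name="login"/>
  </body>
</html>
"""

root = etree.fromstring(PAGE.strip())

# Every input whose type is not 'submit'.
non_submit = root.xpath("//input[not(@type='submit')]")

# Every input that has no disabled attribute at all.
enabled = root.xpath("//input[not(@disabled)]")

print([el.get("name") for el in non_submit])
print([el.get("name") for el in enabled])
```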

Using Axes

Another way to locate an element is to make use of its relationship relative to another node in the tree. One example to select the parent node of a child is as follows:
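A sketch of the parent axis, assuming the third-party lxml package and a made-up page in which the text “Team Merlin” sits in a heading inside a section. A sibling axis is included as a second illustration.

```python
from lxml import etree  # third-party: pip install lxml

# Hypothetical fragment: the heading text is known, the section is wanted.
PAGE = """
<html>
  <body>
    <section id="about"><h2>Team Merlin</h2><p>Test automation</p></section>
    <section id="contact"><h2>Contact</h2></section>
  </body>
</html>
"""

root = etree.fromstring(PAGE.strip())

# parent:: steps up from the matched node to its parent section.
parent = root.xpath("//*[text()='Team Merlin']/parent::section")

# following-sibling:: moves sideways to a later node under the same parent.
sibling = root.xpath("//h2[text()='Team Merlin']/following-sibling::p")

print(parent[0].get("id"), sibling[0].text)
```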

This locates the section tag that is the parent of the node containing the text “Team Merlin”, which demonstrates the capability of XPath to scan upwards. There are many other axes worth exploring, such as ancestor, following-sibling and preceding-sibling.

Handling Shadow DOM

Shadow DOM allows hidden DOM trees to be attached to the regular DOM tree, encapsulating scripts and styles so that they do not affect anything outside of it. Components such as chatbots or video players may take advantage of this feature so that they can be embedded across different applications.

The bad news is that normal XPath does not work for Shadow DOM elements. To select a Shadow DOM element, another method called CSS Selector is used in conjunction with the shadow query selector. The good news is that CSS Selector works very similarly to XPath, with just a difference in syntax.

To reach an element inside a Shadow DOM, first locate the shadow host with CSS Selector, then query inside the tree exposed by the host’s ‘shadowRoot’ property with a second CSS Selector.
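The two-step lookup can be sketched as below, assuming Selenium 4 for Python as the automation tool (the article does not name one). The helper find_in_shadow and the chat-widget structure in the comment are hypothetical. Selenium exposes the shadow root through the shadow_root property of a WebElement, the counterpart of the ‘shadowRoot’ property in the browser.

```python
# Hypothetical structure this sketch targets:
#
#   <chat-widget>
#     #shadow-root
#       <button class="send">Send</button>
#   </chat-widget>

# Selenium's By.CSS_SELECTOR is the plain string "css selector"; using the
# literal keeps this sketch importable without Selenium installed.
CSS = "css selector"


def find_in_shadow(driver, host_css, inner_css):
    """Return the element matching inner_css inside the host's shadow root.

    driver is expected to be a Selenium 4 WebDriver (or any object with the
    same find_element / shadow_root interface).
    """
    # Step 1: find the shadow host in the regular DOM.
    host = driver.find_element(CSS, host_css)
    # Step 2: enter the shadow tree attached to the host.
    shadow_root = host.shadow_root
    # Step 3: query inside the shadow tree with a second CSS selector.
    return shadow_root.find_element(CSS, inner_css)
```

With a real driver, a call such as find_in_shadow(driver, "chat-widget", "button.send").click() would click the button inside the widget. Nested shadow roots would need the same steps repeated for each host.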

XPath vs CSS Selector

Given the above limitation, does it mean that CSS Selector is better than XPath? In general, CSS Selector is simpler and more readable in syntax. It is often faster as well, since it only scans the DOM in one direction (from parent to child), whereas XPath can also scan upwards.

On the other hand, XPath can handle more complex DOM structures, as it is more customisable with all its built-in operators and functions. It is also a long-established standard and a valuable language for querying any XML-like structure. Therefore, both are useful tools depending on the needs of the task at hand.

XPath and CSS selector working hand-in-hand (image generated by Microsoft Designer Image Creator)

Finally, all that path-finding has led us to our final destination. However, this is just the beginning, as locating elements is only one part of test automation. We hope you continue your testing journey and expand your knowledge, along with an arsenal of tools to deal with all the challenges out there in the jungle!

Do share your experiences with XPath, CSS Selector or anything else related to test automation in the comment section below. Stay alert out there and keep going!

🧙🏼‍♀Team Merlin 💛
Application security is not any individual’s problem but a shared responsibility.
