Automating A11y Testing: Part 3 — Testing Library and Beyond

Published in

John Lewis Partnership Software Engineering

17 min read · Feb 14, 2024

Last year I gave a talk on automating web accessibility (a11y) testing at the John Lewis Partnership Tech Profession Conference 2023, which I have converted into this “Automating a11y testing” article series.

Part 1 of the series covered a number of tools from the Axe suite for static analysis of sites, and Part 2 started to explore some of the non-Axe tooling in the wider a11y test automation space.

In Part 3 we will start to explore how we can weave accessibility concerns into our day-to-day unit and integration test automation.

Now remember from Part 1…

It is also key to remember the importance of manual validation and exploratory testing.
At the end of the day it is real people that are visiting and using your site, so it is incredibly important to ensure the quality of their experience, not just to tickbox that automation has passed.

Keeping this in mind, let’s take a dive into something a lot of folks in the front-end world have probably encountered — Testing Library!

Introducing Testing Library

Testing Library is a collection of packages for supplementing test frameworks that allow you to query and assert on your application similar to how your users would use it.

“The more your tests resemble the way your software is used, the more confidence they can give you.”

Instead of testing against abstract components that you write for your framework of choice (like Enzyme and other libraries might do), Testing Library ensures you are testing against the final rendered result of DOM nodes to bring your tests closer to reality.

So what does this have to do with accessibility automation?

Laptop on a desk with code open in an editor. — Photo by James Harrison on Unsplash

The key is in the three groups of utility queries that Testing Library provides:

Queries accessible to everyone

These queries not only reflect the experience of visual users but also those who use assistive technologies to visit your pages:

ByRole
ByLabelText
ByPlaceholderText
ByText
ByDisplayValue

Where you mess up, these queries make it immediately clear that something is amiss, whereas other queries or frameworks may miss mistakes (more on this later!).

Not able to find that button or a crucial piece of copy? With these queries you can be confident that when something isn’t right with your code it will be caught, with a helpful error message explaining why it couldn’t find what you were looking for and suggesting alternative options.

Consider this example React Testing Library test where we have accidentally used the incorrect semantic element for a button:

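import React from 'react';
import { render, screen } from '@testing-library/react';
// Provides the toBeInTheDocument() matcher; often loaded once in a test setup file instead.
import '@testing-library/jest-dom';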

describe('Submit button', () => {
  test('should render a "Submit" button', () => {
    render(<div className="button">Submit</div>);

    expect(screen.getByRole('button', { name: 'Submit' })).toBeInTheDocument();
  });
});

This will result in the following output making it clear that something is inaccessible about our button, in this case because we’ve used a <div> instead of a <button>:

TestingLibraryElementError: Unable to find an accessible element with the role "button" and name "Submit"

There are no accessible roles. But there might be some inaccessible roles. If you wish to access them, then set the `hidden` option to `true`. Learn more about this here: https://testing-library.com/docs/dom-testing-library/api-queries#byrole

Ignored nodes: comments, <script />, <style />
<body>
  <div>
    <div
      class="button"
    >
      Submit
    </div>
  </div>
</body>

How about this example where we have accidentally hidden the button?

import React from 'react';
import { render, screen } from '@testing-library/react';

describe('Submit button', () => {
  test('should render a "Submit" button', () => {
    render(
      <button type="button" style={{ display: 'none' }}>
        Submit
      </button>,
    );

    expect(screen.getByRole('button', { name: 'Submit' })).toBeInTheDocument();
  });
});

Again, unlike other libraries that would probably miss this, Testing Library gives you helpful output to make it clear that the button is still not accessible:

TestingLibraryElementError: Unable to find an accessible element with the role "button" and name "Submit"

There are no accessible roles. But there might be some inaccessible roles. If you wish to access them, then set the `hidden` option to `true`. Learn more about this here: https://testing-library.com/docs/dom-testing-library/api-queries#byrole

Ignored nodes: comments, <script />, <style />
<body>
  <div>
    <button
      style="display: none;"
      type="button"
    >
      Submit
    </button>
  </div>
</body>

By placing the emphasis on querying elements using their accessible role and labels you will naturally start to write components for which you can be confident of the correct element choice and that content will be announced as you expect for all users, no matter how they use your application.

Semantic queries

These queries allow you to assert on elements that use HTML5 and ARIA compliant attributes to provide additional context to users:

ByAltText
ByTitle

The use-cases for these are limited: often you can use one of the previous queries to achieve the same thing, for example passing the alt text as the accessible name option to the ByRole query. Where your use-case demands it, these are still preferable to the next class of query because they keep the focus on the end user and the content they will be presented with.

Test ID queries

Lastly, there is the option to use ByTestId to query elements with data-testid attributes.

These attributes cannot be seen or heard by users in any way so should only be used as a last resort for test cases where matching on the role or text for an element doesn’t make sense.

If you use these queries for testing components that have important value to your users it can be very easy to miss that you’re not using the right semantic element or the text isn’t accessible, giving you false confidence in your code.

For example, let’s consider the submit button again, but using ByTestId instead of ByRole:

import React from 'react';
import { render, screen } from '@testing-library/react';

describe('Submit button', () => {
  test('should render a "Submit" button', () => {
    render(
      <div data-testid="button" style={{ display: 'none' }}>
        Soobmit
      </div>,
    );

    expect(screen.getByTestId('button')).toBeInTheDocument();
  });
});

There are a plethora of mistakes here, from incorrect semantics and typos in the copy to the fact the button isn’t even displayed to any users… and yet this test will still pass!

For this reason you should always prefer the previous accessible or semantic queries.

You can learn more about these queries in the Testing Library docs.

Weaving accessibility into your unit tests

Let’s look at some examples of how we can best use these queries to blend accessibility ideas into our unit tests using the @testing-library/dom package.

Example 1: Add to trolley button

Let’s say we had the following button for adding an item to the user’s trolley:

<button type="button" id="add-to-trolley">Add To Trolley</button>

If we were writing a test to assert that it performs the expected action we might do something like:

const button = document.querySelector('#add-to-trolley');

// ... test code to trigger a click on the button and assert
// on the item now being in the trolley.

This will certainly get hold of the button and allow us to write a test, but it falls down in a number of places:

The id of the button is likely an implementation detail, and certainly not something that users care about. If it were to change the test would fail so we’ve introduced a fragile test!
If we were to swap the <button> for an <a> the test would still pass. Although the behaviour might still kinda work for visual mouse users, you will baffle your screen reader users who will hear that this is a link, and your Safari keyboard users who won’t be able to Tab to the element.
This test fails to take the main user-facing information into account: the “Add To Trolley” button text. It wouldn’t be a happy place if the behaviour you were testing was accidentally on the wrong button because there was a mistake with ids, or if it worked but you had a typo on the button for “Add To Troll” that was missed!
This test would also pass even if the button was hidden using CSS styles 🫣.

With Testing Library we can cover off all these points with a single line swap out:

const button = screen.getByRole('button', { name: 'Add To Trolley' });

// ... test code to trigger a click on the button and assert
// on the item now being in the trolley.

Now not only can we assert on the desired click behaviour but we can also be certain that:

The element is a visible button with an accessible name of “Add To Trolley”.

Example 2: Lazy loaded image

Let’s say we had the following lazy loaded image used on a blog post article tile:

<img
  data-testid="article-1"
  alt="Young West Highland terrier sitting on the floor by a window"
  loading="lazy"
  src="https://unsplash.com/photos/white-long-coated-dog-sitting-on-floor-f5KQq4Wfxg8" />

If we were writing a test to assert that it had expected behaviours such as lazy loading, error handling, impression tracking, etc. we might do something like:

const img = document.querySelector('[data-testid="article-1"]');

// ... test code to assert on attributes and behaviours

Again we have successfully retrieved the image element, but we could do better. What if the image is no longer for the first article? What if someone accidentally removed the alternative text in a refactor?

Again we can use Testing Library to decouple us from implementation detail while gaining confidence that the image being asserted on is the real deal:

const img = screen.getByAltText('Young West Highland terrier sitting on the floor by a window');

// ... test code to assert on attributes and behaviours

Note: In this example we’ve opted for ByAltText for demonstrative purposes.
You could also use ByRole with an img role and accessible name to disambiguate between it being an image vs an input, area, or any other elements that can have alternative text.

Not just for unit tests

So is all this goodness reserved just for our unit tests?

Luckily not!

Testing Library boasts a huge number of packages that cover a whole host of different test frameworks, from working just with the DOM (as we’ve demonstrated above) to plugging into integration and end-to-end test frameworks such as Cypress:

The eagle-eyed among you may notice that Playwright is missing from the above list.

Don’t panic — in both the Playwright Locators and the new experimental Playwright Component Testing the Playwright APIs include an equivalent set of queries that mirror the Testing Library queries, making it very easy to write in the same style, or even migrate unit test code to end-to-end test code.

Not just queries

Thus far we’ve been focusing a lot on the Testing Library queries and how we can use them to ensure the accessibility of elements in our test automation, but the Testing Library ecosystem also boasts a host of other packages and extensions that can also be used in accessibility testing.

A special mention goes to the @testing-library/user-event package which attempts to provide utilities that simulate the real events that would happen in the browser when users interact.

For example, consider the following which uses the package to type a message into a text input:

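import { screen } from '@testing-library/dom';
import userEvent from '@testing-library/user-event';

const user = userEvent.setup();
// Assumption for illustration: a text input labelled "Message" is rendered.
const textInput = screen.getByRole('textbox', { name: 'Message' });

await user.type(textInput, 'A message');

// ... remainder of test code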

Whereas a lot of testing libraries (or something you wrote yourself) would likely dispatch a single DOM input or change event to the text input, the @testing-library/user-event package goes further and tries to simulate the entire interaction:

A hover event is dispatched to the textInput to simulate moving the cursor over the element.
A click event is dispatched to the textInput to simulate clicking the element, just as a real user would before they start typing.
Characters are then dispatched to the textInput one by one, just as a real user would type, with associated events.

This is far more realistic, and more likely to give you a clear idea how your elements will behave for real mouse, keyboard, and assistive technology users.

Where assumptions are incorrect (e.g. you don’t “hover” with a keyboard or screen reader) the methods are all highly configurable, with a host of lower level APIs for constructing your own interaction flows.

With other useful “convenience APIs” such as user.tab() you can also easily cover off keyboard scenarios for your components the same as you would for mouse based test journeys.

You can learn more about other packages in the Testing Library ecosystem docs.

Testing Library gotchas

As with any tool, there are some potential gotchas with Testing Library to be aware of.

Too much choice

As seen in the second example above, there are often a few ways in which you can use different queries to select the same element.

And with choice comes footguns.

Man staring at laptop deep in thought. — Photo by Wes Hicks on Unsplash

When you are writing your queries take care to be as explicit as possible for your given scenario, while of course being pragmatic. For example, in the first example it might have been tempting to just select on the button role:

expect(screen.getByRole('button')).toBeInTheDocument();

Although we can be confident that we will still be selecting a button element, unfortunately we now have no confidence that the button has the correct accessible name (if any at all) for customers!

Similarly you can be caught out with the likes of ByText:

expect(screen.getByText('Add To Trolley')).toBeInTheDocument();

With this test assertion we can certainly be confident the expected text is on the page, but there is no guarantee that it belongs to a button element. That leaves no protection against future refactors which could see the <button> element replaced by something incorrect such as an <a> element.

As a rule of thumb I would recommend:

Should the element have any semantic meaning that is important to users? Use ByRole with both the role and accessible name.
If not, should the element have text that is conveyed to users? Use ByText or a similar text query with the accessible name.
Otherwise… you should probably challenge why you have the element!
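To make that decision order concrete, here is a minimal sketch of the rule of thumb written as a plain function. Note that suggestQuery is a hypothetical helper invented for illustration (it is not part of Testing Library), and the descriptor shape of role, name, and text is an assumption:

```javascript
// Hypothetical helper (not a Testing Library API): encodes the rule of thumb
// as a function that suggests which query to reach for.
function suggestQuery({ role, name, text } = {}) {
  if (role) {
    // Semantic meaning that matters to users: prefer role plus accessible name.
    return name
      ? `getByRole(${JSON.stringify(role)}, { name: ${JSON.stringify(name)} })`
      : `getByRole(${JSON.stringify(role)})`; // weaker: the name is not checked
  }
  if (text) {
    // No meaningful semantics, but text that is conveyed to users.
    return `getByText(${JSON.stringify(text)})`;
  }
  // Nothing user-facing to query on: challenge why the element exists.
  return null;
}

console.log(suggestQuery({ role: 'button', name: 'Add To Trolley' }));
console.log(suggestQuery({ text: 'Free delivery' }));
console.log(suggestQuery({}));
```

The ordering is the important part: role and accessible name first, text second, and a nudge to rethink the element when neither applies.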

If an element has no semantic meaning and no text content nor accessible label then it’s likely that either:

You should be using a different element.
You need to give your element text content or an accessible label such as alt, aria-labelledby, or aria-label.
You shouldn’t have the element at all!

Testing Library is not a real browser

Although the library does a great job of helping to write accessible components with a user-first viewpoint of writing tests, it does have a shortcoming in that it isn’t an actual browser.

Assistive technologies such as screen readers use a combination of the DOM and accessibility APIs in order to understand an element and convey information to users.

You can learn more about how browsers map HTML elements and attributes to platform accessibility APIs in the W3C HTML-AAM specification.

But much like browsers have varying accessibility compliance and support, Testing Library doesn’t always reflect the role or accessible name of an element accurately. In fact, because Testing Library only has access to the DOM and can’t use these accessibility APIs it has to rely heavily upon some awesome packages such as aria-query and dom-accessibility-api to reflect the WAI-ARIA and ACCNAME W3C specifications, meaning it can easily fall out of sync with the latest standards.

For example, at the time of writing (Feb 2024) the following roles are missing or incorrect in the Testing Library latest stable versions:

code role for the <code> element
mark role for the <mark> element
meter role for the <meter> element
strong role for the <strong> element
paragraph role for the <p> element
presentation role for an <img> element with an empty alt="" attribute
Multiple new element additions for the generic role
…and much more!

Luckily for these roles we should see support coming soon in v10 of @testing-library/dom as I added the missing roles in the latter part of 2023!

As always with open source packages, if you spot something that’s not right, raise an issue with helpful comments and steps to reproduce (and perhaps even contribute a pull request with the changes if confident!).

Testing Library also has an active Discord server for asking queries.

Lack of “holistic” user experience insight

Although Testing Library allows you to weave accessibility concerns into your querying of specific elements, and covers off some interaction scenarios, where I feel it falls down is “bigger picture” validation of component or whole-site accessibility.

Consider the following React example for a product tile on an ecommerce site:

<ProductTile>
  <ProductTitle />
  <ProductImage />
  <ProductPrice />
  <ProductAttributes />
  <AddToTrolley />
  <AddToFavourites />
</ProductTile>

For each individual component you can gain good confidence that your experience is accessible using Testing Library. However, something I’ve observed with this pattern is that the combination of:

Separation of concerns through components; and
Only introducing accessibility testing through selecting upon specific elements in isolation to each other;

can lead to a naff user experience where the product name gets announced far too many times:

Jaffa Cakes, heading level 2
Jaffa Cakes, link
Jaffa Cakes, image
Current Price of Jaffa Cakes £1.50, Previous Price of Jaffa Cakes £2.00
Jaffa Cakes are vegetarian
Jaffa Cakes are a best buy
Add Jaffa Cakes to trolley, button
Add Jaffa Cakes to favourites, button

Ok “Jaffa Cakes” is reasonably concise here so you might be forgiven, but if the name of your product or article was longer this would potentially be quite frustrating to navigate through for your users.

It is worth calling out that users of assistive technologies such as screen readers don’t just tab through your content bit by bit: they can navigate by headings, links, controls, etc. Some repetition may therefore be required to ensure users who jump straight to the “Add to trolley” button have all the context they need. Check out WCAG success criteria 2.4.6 and 2.4.9 for some further reading.

Where appropriate, you can likely give your users the benefit of the doubt that they can remember the product they have navigated to without reminding them every step of the way.

Of course mileage may vary, so always validate with your own users — your use-case may be different!

So as it stands, Testing Library unfortunately isn’t quite the silver bullet for building these holistic pictures in test automation — for that you need to look elsewhere or to manual testing.

If you’re interested in how to make tile / card patterns accessible for things like listing products or articles, check out this awesome Inclusive Components article on Cards.

Beyond Testing Library

Given some of the shortcomings of Testing Library let’s briefly look at some options for augmenting our test automation.

Two people sat at laptops collaborating on a piece of paper between them with lots of diagrams. — Photo by Scott Graham on Unsplash

Getting our hands on the accessibility tree

One of the gotchas mentioned above was that Testing Library is having to simulate the accessible role, name, and value calculation because it can’t use browsers’ accessibility APIs to get the actual result.

One option here is to use the Chrome DevTools Protocol (CDP) to query aspects of the accessibility tree. For example in Cypress you could add a command like:

Cypress.Commands.add("getAccessibilityTree", () => {
  // Enable the CDP Accessibility domain before requesting the tree.
  return Cypress.automation("remote:debugger:protocol", {
    command: "Accessibility.enable",
  }).then(() => {
    // Fetch the full accessibility tree for the current page.
    return Cypress.automation("remote:debugger:protocol", {
      command: "Accessibility.getFullAXTree",
    });
  });
});

Credit to Paul Grenier for this approach. Check out this “Automated UI Testing Methodology You Need To Try” article to explore the idea further.

This would allow you to query or simply snapshot the accessibility tree for your page as a means of testing and validating the actual computed accessible roles and names.

It also gives you some insight into how your component is seen “as a whole” from the perspective of assistive technology users which may help our Jaffa Cakes verbosity problem from a previous example.
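The raw tree is fairly verbose, so for snapshotting you may want to boil it down first. Below is a rough sketch that reduces the nodes array from an Accessibility.getFullAXTree response to compact “role: name” lines. The summariseAXTree helper is hypothetical, and the sample data is hand-written to follow the shape of CDP AXNode objects rather than captured from a real page:

```javascript
// Sketch: reduce CDP AXNode objects to compact "role: name" lines that are
// easier to read and to snapshot. `summariseAXTree` is a hypothetical helper,
// not a Cypress or CDP API.
function summariseAXTree(nodes) {
  return nodes
    .filter((node) => !node.ignored) // skip nodes not exposed to assistive tech
    .map((node) => {
      const role = node.role?.value ?? 'unknown';
      const name = node.name?.value ?? '';
      return name ? `${role}: ${name}` : role;
    });
}

// Hand-written sample in the shape of CDP AXNode objects (illustrative only).
const sampleNodes = [
  {
    nodeId: '1',
    ignored: false,
    role: { type: 'role', value: 'heading' },
    name: { type: 'computedString', value: 'Jaffa Cakes' },
  },
  { nodeId: '2', ignored: true },
  {
    nodeId: '3',
    ignored: false,
    role: { type: 'role', value: 'generic' },
    name: { type: 'computedString', value: '' },
  },
  {
    nodeId: '4',
    ignored: false,
    role: { type: 'role', value: 'button' },
    name: { type: 'computedString', value: 'Add Jaffa Cakes to trolley' },
  },
];

console.log(summariseAXTree(sampleNodes));
```

Assuming the custom command yields the CDP response object, you could pass its nodes property to a helper like this and snapshot the result, which keeps diffs focused on the roles and names users actually encounter.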

You can find out more about the Accessibility options on the Chrome DevTools Protocol docs.

Looking to other testing frameworks, the likes of Puppeteer and Playwright both have a built-in accessibility class that allows you to get a snapshot of the accessibility tree in a similar fashion to our hand-rolled Cypress command above:

const snapshot = await page.accessibility.snapshot();

console.log(snapshot);

Playwright has deprecated this API in favour of promoting Axe for accessibility testing instead — depending on your use-case, this may well make more sense for you.

You can find out more about the Accessibility class on the Puppeteer docs and the Playwright docs.
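As a small illustration of what you might do with such a snapshot, the sketch below walks a nested tree of nodes (each with a role, a name, and optional children) and collects every accessible name containing a given phrase. This is one possible way to put a number on the “Jaffa Cakes” repetition from earlier. The findNamesContaining helper and the sample tile data are hypothetical and hand-written, so the roles shown may differ from what your browser reports:

```javascript
// Sketch: walk a nested accessibility snapshot and collect the announcements
// whose accessible name contains a phrase. `findNamesContaining` is a
// hypothetical helper, not a Puppeteer or Playwright API.
function findNamesContaining(node, phrase, matches = []) {
  const name = node.name ?? '';
  if (name.toLowerCase().includes(phrase.toLowerCase())) {
    matches.push(`${name}, ${node.role}`);
  }
  for (const child of node.children ?? []) {
    findNamesContaining(child, phrase, matches);
  }
  return matches;
}

// Hand-written sample shaped like a snapshot of the product tile above.
const tileSnapshot = {
  role: 'RootWebArea',
  name: 'Biscuits',
  children: [
    { role: 'heading', name: 'Jaffa Cakes', level: 2 },
    { role: 'link', name: 'Jaffa Cakes' },
    { role: 'image', name: 'Jaffa Cakes' },
    { role: 'button', name: 'Add Jaffa Cakes to trolley' },
    { role: 'button', name: 'Add Jaffa Cakes to favourites' },
  ],
};

const repeats = findNamesContaining(tileSnapshot, 'Jaffa Cakes');
console.log(`${repeats.length} announcements mention the product name`);
console.log(repeats);
```

What counts as “too many” mentions is a judgement call for you and your users, so treat any threshold you assert on as a conversation starter rather than a pass or fail for accessibility.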

Using the Accessibility Object Model

The Accessibility Object Model (AOM) is an emerging JavaScript API that will allow developers to access and modify the accessibility tree for a page.

This effort aims to create a JavaScript API to allow developers to modify (and eventually explore) the accessibility tree for an HTML page.

It is currently in the early stages, with an unofficial draft specification and limited implementation or usage outside of things like Web Platform Tests.

Nevertheless, this is exciting territory — given this will allow us to access the computed accessible role and name for an element, I can see a future where these APIs get incorporated into the likes of Testing Library!

If you want to make use of these APIs today in your own projects there are some experimental flags that can get you some of the way. In Chromium there is an --enable-blink-features="AccessibilityObjectModel" flag which enables an experimental implementation. Tread carefully here as it could be removed at any time: the equivalent WebKit experimental flag appears to have already been removed.

Alternatively there is also an --enable-experimental-web-platform-features flag for Chromium browsers that exposes the computedRole and computedName properties on elements which Paul Grenier covers in this “The Automated UI Testing Methodology You Need To Try (Pt. 2)” article.

At the moment the safest path to follow is likely the approach used in the Web Platform Tests which make use of WebDriver commands (think Selenium) in their test driver, as Chrome, WebKit, and Gecko drivers all have support as of early 2023.
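For the curious, the WebDriver specification exposes these as the “Get Computed Role” and “Get Computed Label” commands. Here is a minimal sketch of calling them directly over HTTP. It assumes a driver (such as chromedriver) is already running at baseUrl, that sessionId and elementId came from earlier New Session and Find Element commands, and that a global fetch is available (Node 18 or later). Both helper names are hypothetical:

```javascript
// Sketch: build the URL for the W3C WebDriver "Get Computed Role" or
// "Get Computed Label" command. `kind` is "computedrole" or "computedlabel".
// Helper names are hypothetical; the endpoint paths come from the WebDriver spec.
function computedEndpoint(baseUrl, sessionId, elementId, kind) {
  return `${baseUrl}/session/${sessionId}/element/${elementId}/${kind}`;
}

// Ask the driver for the role and label the browser actually computed.
// Assumes a running driver and an existing session and element reference.
async function getComputedRoleAndLabel(baseUrl, sessionId, elementId) {
  const [role, label] = await Promise.all(
    ['computedrole', 'computedlabel'].map(async (kind) => {
      const response = await fetch(
        computedEndpoint(baseUrl, sessionId, elementId, kind),
      );
      const body = await response.json();
      return body.value; // WebDriver wraps command results in `value`
    }),
  );
  return { role, label };
}

console.log(
  computedEndpoint('http://localhost:9515', 'my-session', 'my-element', 'computedrole'),
);
```

In practice you would more likely reach these commands through your WebDriver client of choice than through hand-rolled HTTP calls, but seeing the raw endpoints makes it clearer what is being asked of the browser.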

Some care needed here: if you explore around you might come across the likes of the getComputedRole() method for WebDriverIO. Unfortunately, despite looking like it uses these new APIs, at the time of writing it actually uses the same aria-query package as Testing Library and so suffers from the same inaccuracies. Take care not to muddle the WebDriver protocol with the WebDriverIO test framework!

Given this is all very new and experimental, unless you are a test framework maintainer I would recommend relying on the likes of Testing Library for now with the understanding that there may be some role or accessible name inaccuracies — manual testing FTW!

A missing puzzle piece?

With all the tools we’ve covered in these first three parts we’re starting to be in a really awesome position for building out test automation that can really give us confidence on the accessibility of our applications.

However, something the tools we’ve explored still can’t help us with is automating tests that assert on the quality of user experience for those who use your application with something other than a mouse or keyboard.

Using Testing Library or other packages such as cypress-real-events can allow us to build reasonably realistic tests that can use mouse or keyboard similar to how real users might, but this still leaves a gap for assistive technologies such as screen readers.

If you are building out components that require the coordination of multiple elements, such as combobox or disclosure patterns, then Testing Library and other tools can only get you so far:

Static analysis can catch any common gotchas and incorrect syntax.
Testing Library queries or Accessibility Tree snapshots can give you confidence on usage of the correct accessible roles and names.
@testing-library/user-event and similar packages can give you confidence for mouse and keyboard.

But nothing other than manual testing can really provide confidence that the combination of a browser + screen reader + your application is being announced and behaving in the expected and intuitive way for screen reader users…

Or so it used to be the case!

In the next part of this series we’re going to start exploring the cutting-edge field of screen reader standardisation and automation, looking into:

The W3C ARIA and Assistive Technologies (ARIA-AT) Community Group.
On-demand, remote screen reader testing solutions.
Virtual and real screen reader test automation with packages such as @guidepup/guidepup.

Till then folks — see you soon! 👋

Hi, my name is Craig Morten. I am a senior product engineer at the John Lewis Partnership. When I’m not hunched over my laptop I can be found drinking excessive amounts of tea or running around in circles at my local athletics track.
