Observability in Accessibility Testing

Vít Kostřica
Published in tech-gwi
Dec 8, 2023 · 7 min read

The popularity of accessibility testing is rising. Manual accessibility testing is still necessary these days, but we can also use automated tools, and there is a wide selection of them. We chose Axe from Deque, as it can cover up to 57% of issues. To utilize its potential, we need to keep the results traceable and presentable. For this purpose, we have created a few ways to enhance accessibility results using our existing company infrastructure and assets.

Why?

Our application is built from different MFEs (microfrontends) in React, with Storybook support to apply designs and provide common components to every MFE. Designers do manual accessibility testing, but many more accessibility bugs can appear in code later. So we need a way to observe in which part of the chain the bugs are introduced and report them as soon as possible.

MFE Structure

Integration of Axe

Our E2E tool of choice is Cypress, and we already have E2E tests running in every step of the chain in the picture above. Therefore we wanted to integrate Axe into Cypress to keep our processes unified.

There is a nice community plugin called cypress-axe that allows us to combine Axe with Cypress into one solution for testing web content for accessibility.

While this community plugin has a good selection of options, it is focused mostly on providing results to technically skilled users who are familiar with terminal output or with Cypress.

Default log

Our goal is to present these results to a broader audience of designers, product managers, and other interested stakeholders. For this purpose, the predefined reporting options and custom logging were not enough to track down issues.

Enable the possibilities

A custom logger passed to cypress-axe seemed to be a good place to start, but we needed to expand it with more functionality.
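For context, here is a minimal sketch of how such a callback hooks into cypress-axe. The logViolations name and the test body are our own illustration; the cy.checkA11y arguments follow the plugin's documented signature (context, options, violation callback, skip failures):

// cypress-axe lets us pass a callback that receives the axe violations,
// which is where all of the custom reporting below is hooked in
const logViolations = violations => {
  violations.forEach(violation => {
    // each violation carries id, impact, description, help, helpUrl
    // and the affected DOM nodes with their selectors (node.target)
    cy.log(`${violation.impact}: ${violation.id} (${violation.nodes.length} nodes affected)`)
  })
}

it('checks the page for accessibility issues', () => {
  cy.visit('/')
  cy.injectAxe()
  // third argument is the custom violation callback, fourth argument skips failing the test
  cy.checkA11y(null, null, logViolations, true)
})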

Screenshots — For easier debugging of axe issues, we needed screenshots. This was quite easy to achieve with cy.screenshot(). The question was where to put it. For this, we needed to loop through all violations and take the screenshots inside the loop. But with just a screenshot, we couldn't recognize much. We needed to know where exactly the issue was located. So we used the target property of the violation (a selector), grabbed the element with Cypress, and changed its CSS border to a different color. After we created the screenshot with the highlighted area, we used the same CSS to change the border back (to some neutral/white color). As this element was already tested by Axe, we didn't create any false positives by messing with the border.

cy.get(String(target)).then($element => {
  // add border
  $element.css('border', '4px solid cyan')
  cy.then(() => {
    // take screenshot
    cy.screenshot(`${Cypress.spec.name}/accessibility/A11y_${axe_page}_${violation.id}(${targetString})`)
    cy.then(() => {
      // remove border
      $element.css('border', 'none')
      // copy screenshot to axe folder
      cy.task('makeDir', createDir)
      cy.task('copyFile', screenshotPath)
    })
  })
});
Screenshot with highlight
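The makeDir and copyFile calls above are custom Cypress tasks, so they have to be registered in the Cypress config. A minimal sketch of what that registration could look like, assuming createDir is a directory path and screenshotPath is an object holding the source and target locations (the property names here are our assumption):

// cypress.config.js — sketch of the custom tasks used above
const { defineConfig } = require('cypress')
const fs = require('fs')

module.exports = defineConfig({
  e2e: {
    setupNodeEvents(on) {
      on('task', {
        makeDir(dirPath) {
          // create the target folder for axe screenshots if it does not exist yet
          fs.mkdirSync(dirPath, { recursive: true })
          return null
        },
        copyFile(paths) {
          // copy the screenshot from the Cypress screenshots folder to the axe folder
          fs.copyFileSync(paths.from, paths.new)
          return null
        },
      })
    },
  },
})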

Data — The next step was to generate files from the violation output. We wanted to keep all results from Axe plus some custom strings for identification (like where we are in our app, unique identifiers, and screenshot paths). Using cy.writeFile(), we could easily store them as JSON files.

violationExport.push({
  identifier: axe_page,
  summary: `A11y_${axe_page}_${violation.id}(${targetString})`,
  squad: axe_squad,
  impact: violation.impact,
  environment: axe_environment,
  testURL: currentURL.replace('?headless-test=1', ''),
  id: violation.id,
  target: String(target).split(" ").pop(),
  description: violation.description,
  help: violation.help,
  helpURL: violation.helpUrl,
  screenshot: screenshotPath.new,
  gcloudScreen: `${axe_squad}/axe_screenshots/A11y_${axe_page}_${violation.id}(${targetString}).png`
})

Once all violations were collected, we wrote them to a file per spec:

cy.writeFile(`cypress/results/axe_data/axeData-${Cypress.spec.name}.json`, violationExport)

Easy review with HTML reports

As we planned to run accessibility tests in parallel on our pipelines, we needed to see all results at a glance.
To achieve this goal, we needed to gather all the data files. Each parallel run was uploading its results to Gcloud, so in the next step, we downloaded all of them to a new container. With a custom script, we could read all the available data, do some counting, and generate a clear table with all the results. With HTML and CSS, we created the following table:

HTML Report
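The report script itself is nothing fancy. Here is a minimal sketch of the idea, assuming the downloaded axe_data JSON files sit in one folder; the file names, the counting, and the table columns are simplified illustrations, not our exact implementation:

// generate-report.js — reads all axeData-*.json files and renders one HTML table
const fs = require('fs')
const path = require('path')

const dataDir = 'cypress/results/axe_data'
const violations = fs.readdirSync(dataDir)
  .filter(file => file.endsWith('.json'))
  .flatMap(file => JSON.parse(fs.readFileSync(path.join(dataDir, file), 'utf8')))

// count issues per impact level for the report header
const counts = violations.reduce((acc, v) => {
  acc[v.impact] = (acc[v.impact] || 0) + 1
  return acc
}, {})

const rows = violations.map(v => `
  <tr>
    <td>${v.impact}</td>
    <td>${v.id}</td>
    <td>${v.help}</td>
    <td><a href="${v.helpURL}">rule</a></td>
    <td><img src="${v.gcloudScreen}" width="200" /></td>
  </tr>`).join('')

const html = `<html><body>
  <h1>Accessibility report</h1>
  <p>${Object.entries(counts).map(([impact, count]) => `${impact}: ${count}`).join(', ')}</p>
  <table border="1"><tr><th>Impact</th><th>Rule</th><th>Help</th><th>Link</th><th>Screenshot</th></tr>${rows}</table>
</body></html>`

fs.writeFileSync('accessibility-report.html', html)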

In this HTML report, we could check all issues with enlargeable screenshots. We could access the report directly from the pipeline summary via a shortcut, or push the link to a Slack notification.
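Pushing the link to Slack can be as simple as calling an incoming webhook from the pipeline. A sketch under the assumption that a webhook URL and the report URL are available as environment variables (the variable names are ours, and built-in fetch requires Node 18+):

// notify-slack.js — posts the report link to a Slack channel via an incoming webhook
const reportUrl = process.env.REPORT_URL      // assumed: link to the uploaded HTML report
const webhookUrl = process.env.SLACK_WEBHOOK  // assumed: Slack incoming webhook secret

fetch(webhookUrl, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ text: `Accessibility report is ready: ${reportUrl}` }),
})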

Tracing with automatic JIRA tickets

The next step was to trace issues. Not all stakeholders had access to the pipelines or to the files tracked in Gcloud, so we needed a common tool to track bugs. In our case, that tool was Jira.

With the help of the jira-client library, we could make really simple API calls to Jira. We could create tickets, edit them, add comments, and much more with simple-to-use commands inside a script. We took all issues, put them in a loop, and reported them one by one. Before opening a ticket, it was necessary to check whether the ticket already existed, to avoid reporting duplicates. Here is the logic:

Ticket creation logic
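In code, the check-then-create step could look roughly like this — a sketch using jira-client, with the host, project key, JQL, and field values as placeholders rather than our real configuration:

// report-to-jira.js — create a ticket for each violation unless one already exists
const JiraApi = require('jira-client')

const jira = new JiraApi({
  protocol: 'https',
  host: 'your-company.atlassian.net',   // placeholder host
  username: process.env.JIRA_USER,
  password: process.env.JIRA_TOKEN,
  apiVersion: '2',
  strictSSL: true,
})

async function reportViolation(violation) {
  // search by the unique summary we generated for the violation
  const existing = await jira.searchJira(
    `project = A11Y AND summary ~ "${violation.summary}" AND statusCategory != Done`
  )
  if (existing.total > 0) return // already reported, skip to avoid duplicates

  await jira.addNewIssue({
    fields: {
      project: { key: 'A11Y' },        // placeholder project key
      issuetype: { name: 'Bug' },
      summary: violation.summary,
      description: `${violation.description}\n${violation.helpURL}\nScreenshot: ${violation.gcloudScreen}`,
      labels: ['accessibility', violation.impact],
    },
  })
}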

Inside a ticket, we set a title, description, labels, priority based on the importance level, and so on. Everything was reported by the QA-bot account.

Jira ticket

On top of that, we wanted to inform the teams whenever a new failure came up in the pipeline. Therefore, the Slack Jira bot needed to be configured to properly update each team's channel.

Slack notification

History with Grafana

The point of this was to get the historical data of all accessibility runs into one place. Visualizing the data in Grafana helps us keep history between releases and cultivate an observability culture within our teams.

Once again we were reading our data, counting issues, and exposing the counts on the pipeline as variables with a numerical value for each importance level. In the end, we had the number of critical, serious, moderate, and minor issues prepared for the next step.
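A sketch of that counting step, assuming GitHub Actions (which the workflow snippet below uses) and the axe_data files from earlier — the script and variable names are illustrative:

// count-issues.js — turn the collected violations into pipeline variables
const fs = require('fs')
const path = require('path')

const dataDir = 'cypress/results/axe_data'
const violations = fs.readdirSync(dataDir)
  .filter(file => file.endsWith('.json'))
  .flatMap(file => JSON.parse(fs.readFileSync(path.join(dataDir, file), 'utf8')))

const counts = { critical: 0, serious: 0, moderate: 0, minor: 0 }
violations.forEach(v => {
  if (counts[v.impact] !== undefined) counts[v.impact] += 1
})

// expose CRITICAL, SERIOUS, MODERATE and MINOR as environment variables
// for the following workflow steps (GitHub Actions reads $GITHUB_ENV)
const envLines = Object.entries(counts)
  .map(([impact, count]) => `${impact.toUpperCase()}=${count}`)
  .join('\n')
fs.appendFileSync(process.env.GITHUB_ENV, envLines + '\n')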

For getting the data into Grafana, we needed a middleman. In our case, it was Prometheus Pushgateway, as it is already part of our existing infrastructure: Prometheus scrapes available endpoints for exposed metrics. It was not an ideal solution, but with DevOps support, we made it work. To identify what belongs where, we needed labels — for app, team, importance, and environment. We were also pushing the app and the importance as part of the job name to separate results and prevent overwriting inside the Pushgateway.

- name: Send CRITICAL metrics to Prometheus
  if: ${{ steps.prometheusPrepare.outcome == 'success' }}
  uses: GlobalWebIndex/github-actions/prometheus-pushgateway/python@main
  with:
    function: pushadd_to_gateway
    metric_name: qa_accessibility_workflow
    metric_value: ${{ env.CRITICAL }}
    label_names: "environment,app,importance"
    labels: "staging,${{ inputs.squad }},critical"
    job: qa_axe-${{ inputs.squad }}-critical
Pushgateway

With the data available in Grafana, we could start creating dashboards. To do so, we needed to query the data (selecting what we wanted to show).

A few graphs were useful here:
1. Time series — Total issues by importance (total critical, total serious, …)
2. Time series — Total issues per app (total for App1, for App2, …)
3. Time series — Per app by importance (total critical per app, …)

Let's see an example of the first one. We had multiple scrapes per result and needed to filter them out; the resulting query is sketched after the screenshot below.
1. Select our metric name (the bucket with all results)
2. Filter it by the importance label — critical
3. Take the max by app to get the maximum value from each app
4. Sum all results to get the total number of critical issues
5. Create such a query for every importance level

Grafana query
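Expressed as a PromQL query, assuming the metric and label names from the Pushgateway step above, the same steps boil down to something like:

sum(max by (app) (qa_accessibility_workflow{importance="critical"}))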

With the graph created, we could make it nicer by adjusting a few settings, like colors, to achieve this:

Grafana timeline graph for Accessibility

Summary

We integrated our solution into every step of the MFE architecture, so we know where and when an issue was introduced, and whether or not it has been fixed.

Developers get reports on their work right away from PR runs. They can decide whether they want to fix an issue immediately or keep it for later in the form of a Jira ticket.

We can also observe the results in Grafana to see whether issues are being fixed or new ones are being introduced.

Observability now
