Accessibility as philosophy of perception

Zenobia Gawlikowska
EcoVadis Engineering
12 min read · Dec 3, 2022

“The body is our general medium for having a world.”
― Maurice Merleau-Ponty, Phenomenology of Perception

Camera obscura — a device for viewing a projected image. A man in a box looks at the image formed by light arriving through an orifice in the wall.

When looking at accessibility in the context of an application, we have to consider the ability of the user to navigate the world it presents. This journey, however, does not take place on the screen or through the loudspeaker, and is not driven by reflections of light or vibrations of the air. It is not interacted with using the keyboard or mouse. Instead, it takes place in the mind, as the user grasps the perceptions available to them and turns them into mental objects they can work with.

The limits of visual representation

“The plain man is familiar with blindness and deafness, and knows from his everyday experience that the look of things is influenced by his senses; but it never occurs to him to regard the whole world as the creation of his senses.”
— Ernst Mach, The Analysis of Sensations

As a web application presents itself to the web browser, it is processed from the Document Object Model¹ into a visual representation featuring familiar UI affordances², which indicate their semantics and modes of action to the seeing user. A sighted user will quickly scan the page with their eyes to identify headings indicating the content hierarchy, and call-to-action and secondary-action buttons signifying more and less important tasks to be performed. Input controls will suggest with their visual form what data is expected in them. The perception of red will alert the user with color vision to the existence of errors to be corrected. Grayed-out content will indicate an action that is unavailable or unimportant at the moment. All those objects and possible actions will be impressed in the user’s mind by way of a visual medium. For the lifetime of the user’s experience with the application, those objects will constitute the user’s world.

While extremely powerful and engaging, the visual representation can only go so far. It relies on the sightedness of the user: their ability to distinguish shapes, contrasts and colors in a consistent and sustained way, across all devices, situational contexts and medical conditions of the body. The Document Object Model therefore requires modes of representation that are not just visual, and indeed such modes exist. An intermediate layer, called the Accessibility Tree³, is maintained by the web browser with the intention of providing a robust representation of what exists in the web document and what can be done with it. This information is compiled from native HTML element semantics or custom ARIA⁴ roles and attributes.
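
As a rough sketch of where this information comes from, compare two ways of exposing a button (the markup is illustrative). Native HTML semantics populate the accessibility tree for free; a custom widget has to declare its role and state explicitly:

```html
<!-- Native semantics: exposed in the accessibility tree as a
     "button" node, focusable and keyboard-operable by default. -->
<button type="submit">Save changes</button>

<!-- Custom widget: the role, focusability and state must all be
     declared by hand, and keyboard activation re-implemented. -->
<div role="button" tabindex="0" aria-pressed="false">Save changes</div>
```

Assistive technologies read this tree rather than the rendered pixels, which is why the declared semantics, not the visual styling, determine what a screen reader can announce.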

Example of an accessibility tree with form elements and their attributes made available to the API
Fragment of an accessibility tree. Source: https://wicg.github.io/aom/explainer.html

Alternative representations of content

“The fact that the normal subject immediately grasps that the eye is to sight as the ear is to hearing shows that the eye and ear are immediately given to him as means of access to one and the same world.”
― Maurice Merleau-Ponty, Phenomenology of Perception

Screen readers, using the Accessibility Tree, provide an auditory representation of the contents that are otherwise available to sighted users visually. However, the medium, be it visual, auditory or tactile (via braille displays⁵), is not what matters — the essence is the ability of the user to form a mental model in their mind, a model that contains all of the meaningful objects present in the application and all of the actions that can be performed on them. This holds regardless of what physical channel or sensory pathway was used to make the impression that is perceived by the user’s mind.

Flowchart showing the progression from HTML to DOM and visual rendering and to the Accessibility Tree and assistive rendering. Everything ends with the user.
Information transfer between the web document and the user’s mind. Source: https://medium.com/@krishnansai99/what-is-accessibility-testing-7d04256a3919

In order for the application to be accessible, the ability to visually scan the contents of the page and take note of headings and their relative importance must be matched by the ability to quickly cycle through heading levels as they are announced by the screen reader. Visually distinct regions corresponding to navigation, search and main content must be discoverable as such using hearing. An idea of the number of elements in a list needs to be available in both auditory and visual form. Control elements must not only have a visual representation of their function and state, but must also have an auditory representation of all their relevant aspects.
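
A minimal sketch of markup that supports this kind of non-visual scanning, using native landmarks, headings and lists (the content is illustrative):

```html
<header>
  <!-- Exposed as a "navigation" landmark, reachable directly. -->
  <nav aria-label="Main">
    <a href="/orders">Orders</a>
    <a href="/reports">Reports</a>
  </nav>
</header>
<main>
  <!-- Heading levels let screen reader users cycle through the
       content hierarchy without reading everything in between. -->
  <h1>Orders</h1>
  <h2>Open orders</h2>
  <!-- Announced with an item count, e.g. "list, 2 items". -->
  <ul>
    <li>Order #1042</li>
    <li>Order #1043</li>
  </ul>
</main>
```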

In the end, different neural pathways and different perceptions will lead to the formation of an equivalent mental representation of the web application and all of its components.

Example from https://a11y-101.com/development/landmarks

Perception of colors

“I will never know how you see red and you will never know how I see it. But this separation of consciousness is recognized only after a failure of communication, and our first movement is to believe in an undivided being between us.”
― Maurice Merleau-Ponty, The Primacy of Perception

“We have now to consider the fact that colours are produced in the eye by means of colourless objects”
— Johann Wolfgang von Goethe, Theory of Colours

Objects have no colors in themselves; their perceived color depends entirely on the eye of the beholder. In fact, the structure of the eye determines the range of colors that can be perceived. Cone cells give the eye the capability to perceive different light wavelengths and constitute the dimensions of human color vision⁶. Depending on the number of cone types present in the eye, the same image will be perceived differently. Normal vision is characterized by three cone types (trichromacy), while other color vision types exist (dichromacy⁷ and its different kinds: protanopia, deuteranopia and tritanopia). Monochromacy means a lack of color perception.

Colorful spices, as seen by people with normal vision or with different color perception deficiencies.
Source: https://www.colourblindawareness.org/colour-blindness/types-of-colour-blindness

This phenomenon has been noted by Goethe in his exploration of the science of colors. He described the experience of dichromatic perception:

“If the carmine was passed thinly over the white saucer, they would compare the light colour thus produced to the colour of the sky, and call it blue. If a rose was shown them beside it, they would, in like manner, call it blue; and in all the trials which were made, it appeared that they could not distinguish light blue from rose-colour. They confounded rose-colour, blue, and violet on all occasions: these colours only appeared to them to be distinguished from each other by delicate shades of lighter, darker, intenser, or fainter appearance.”
— Johann Wolfgang von Goethe, Theory of Colours

Exceptionally, some humans are endowed with tetrachromacy and are able to see even more colors than are accessible to most⁸.

It is important to understand that perceived color does not depend on the object and its intrinsic color (which does not exist as such), but on the constitution of the eye. Nevertheless, we speak of colors as if they were properties of the objects we observe.

Some tools might be helpful for partially color-blind people. Augmented reality filters⁹ exist for smartphones and can, for example, let people who have trouble distinguishing “red” see the objects to which other people refer as “red”. That allows them to use the concept of “red” in pointing to objects, regardless of their differences of perception.

Two buttons, colored green and red, as perceived by the average human eye
Objects as seen in standard color vision (“red” can be distinguished from “green”)
Two buttons, which cannot be distinguished by color
Objects as seen without “red” vision (the two objects cannot be distinguished by color)
Two buttons, which can be distinguished by color, as seen by a partially color-blind person using augmented reality
Objects as seen through an augmented reality filter (“red” is distinguishable again, even though it appears not to be “red” to the average eye)

Users with no color vision at all can still perceive objects designated in a visual way, provided such an option is made available.

Two maps, one with colored areas, the other with striped areas instead of colors.
Source: https://liveuamap.com/

Orientation in space and time

“In the wood there are paths, mostly overgrown, that come to an abrupt stop where the wood is untrodden. They are called ‘wood paths.’ Each goes its separate way, though within the same forest. It often appears as if one is identical to another. But it only appears so.”
— Martin Heidegger, Wood Paths

All the objects present in the user’s mind, all parts of the application such as forms, buttons and inputs, live in a mental space with a particular topography derived from perception. Items could be mentally arranged from left to right, from right to left, top-down, or in a linear sequence determined by the tabbing order¹⁰. They could exist along independent dimensions, navigated by way of a list of headings, landmarks or links. It can never be assumed that what appears to be “left” relative to some other object will indeed be on the “left” of another user’s mental map, or that “left” even has a meaning to that person. Therefore, instructions presented to the user cannot include hints referring to a particular spatial orientation (such as “look at the description on the right”), as they would be meaningless in any context other than a visual one with a left-to-right writing mode.

Consider the position of “footnotes” depending on the text orientation and locale (in this example on the left of lines of text arranged vertically from top to bottom and from right to left, and not at the bottom, as implied by the name “footnotes”):

Book with text in Japanese, with vertical lines reading from right to left and “footnotes” on the right side of the page.
Source: https://omoi-no-hoka.tumblr.com/post/190599839989/hello-thank-you-for-sharing-your-enthusiasm

As human beings, we experience in sequence, following a path through time. Those paths can move along different axes; however, the order in which information is accessed visually needs to match the order in which it is presented by other means, such as the sequential tabbing order.

As Manuel Matuzović has noted: “If a blind user is working with a sighted user, who reads the page in visual order, it may confuse them when they encounter information in different order.”¹¹
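
A sketch of how such a mismatch arises with CSS grid (the class names are illustrative): the order property changes only the visual placement, while tabbing still follows the source order.

```css
.layout {
  display: grid;
}

/* Visually promoted to the first position... */
.sidebar {
  order: -1;
}

/* ...but the DOM is unchanged, so keyboard and screen reader users
   still encounter .sidebar where it sits in the source. Visual
   order and tab order now disagree. */
```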

The order of elements also has implications for internationalization, as the space referenced as “left” might signify different areas depending on the orientation of the text in a given culture. However, visual space can be manipulated in a way that does not depend on the current spatial orientation. For example, it is possible to use logical CSS properties, such as padding-block-start, instead of physical ones, such as padding-left, in order to control space¹²:

Text in right-to-left and top-to-bottom writing mode with padding-block-start appearing before the beginning of the text, as required.
Source: https://www.cnblogs.com/coco1s/p/15033565.html
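
A minimal sketch of the difference (the selector is illustrative):

```css
/* Physical property: always pads the left edge of the box,
   whatever the writing mode. */
.note {
  padding-left: 1rem;
}

/* Logical property: pads the block-start edge (the top edge in
   horizontal-tb, the right edge in vertical-rl), so the spacing
   follows the writing mode. */
.note {
  padding-block-start: 1rem;
}
```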

The concepts of start and end will always make sense in the mind of the user, regardless of how the mental map is constructed from their perceptions, what axis is used to position objects, what their cultural background is, or what sensory perceptions are available to them. That is because progression along a path in time is always intelligible, as human beings are immersed in time and, as Heidegger would put it, being is time.

Detecting and remembering change

“each moment of time calls all the others to witness”
― Maurice Merleau-Ponty, Phenomenology of Perception

“time is an abstraction, at which we arrive by means of the change of things”
― Ernst Mach, The Science of Mechanics

Change is only experienced by way of a comparison with a previous state. This needs to be made apparent to perception, and the most universally intelligible indication of change is movement. That is why animations are powerful additions to user interfaces¹³, bringing the user’s attention to changes which have occurred and providing a more immersive experience of a world that responds to actions. Patterns for animation¹⁴ simulate the natural movement of physical objects in space, which is familiar to any sighted person, thus creating a more intuitive and immediately intelligible experience of change. While animations, when used sparingly, can add a lot to the user experience, users should have the possibility to opt out of animated interfaces, as excessive perceived movement can cause headaches, loss of orientation, or even seizures. This can be accomplished by disabling animations for users who prefer reduced motion¹⁵.
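
One common sketch of such an opt-out, using the prefers-reduced-motion media query (the blanket selector is one possible policy; animations can also be disabled case by case):

```css
/* When the user has requested reduced motion at the operating
   system level, effectively disable animations, transitions and
   smooth scrolling across the page. */
@media (prefers-reduced-motion: reduce) {
  *,
  *::before,
  *::after {
    animation-duration: 0.01ms !important;
    animation-iteration-count: 1 !important;
    transition-duration: 0.01ms !important;
    scroll-behavior: auto !important;
  }
}
```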

Time is not perceived in a uniform manner by all people: a message displayed on the screen may be irritating if it lingers too long, or stressful if there is not enough time to read it. This is typically an issue with so-called toast messages, displayed on-screen for a limited period of time. As Sheri Byrne-Haber writes: “To account for memory loss and distraction as well as disability-related issues such as ADHD, a best practice would be to implement a location where users can refer to a list of past toast messages which have come and gone.”¹⁶

A non-sighted user will not be able to perceive dynamic updates of parts of an application unless these are specifically announced by the screen reader. In order to make sure changes are indicated in an audible way, it is necessary to create live regions¹⁷, which can be “polite” or “assertive” depending on whether they wait for the current narration to complete or interrupt it to read the changed contents. This comes in handy for announcing validation errors and status updates as well.
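
A minimal sketch of both kinds (the messages are illustrative); note that a live region should already be present in the DOM before its content changes, or the update may not be announced:

```html
<!-- Polite: announced once the current narration finishes.
     Suitable for status updates. -->
<div role="status" aria-live="polite">3 search results found.</div>

<!-- Assertive: interrupts the current narration immediately.
     Reserve it for critical messages such as errors. -->
<div role="alert" aria-live="assertive">Your session has expired.</div>
```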

Users with cognitive disabilities relating to memory might require a reminder of past actions and choices, even if those were made just a short moment ago. This is necessary to reduce their cognitive load and reinforce their feeling of being in control. It can be accomplished by providing a summary of the steps already completed in a multi-step form, as in this example of a checkout screen:

Checkout screen with a summary of the items ordered
Source: https://www.toptal.com/designers/e-commerce/ecommerce-ux-best-practices

Modes of interaction

“the subject, when put in front of his scissors, needle and familiar tasks, does not need to look for his hands or his fingers, because they are not objects to be discovered in objective space: bones, muscles and nerves, but potentialities already mobilized by the perception of scissors or needle, the central end of those ‘intentional threads’ which link him to the objects given”
― Maurice Merleau-Ponty, Phenomenology of Perception

“[The] less we stare at the hammer-Thing, and the more we seize hold of it and use it, the more primordial does our relationship to it become”
— Martin Heidegger, Being and Time

Our ability to interact with the world determines our ability to understand it. As the goal of any application designer is to make the application understandable by the user and all its functions discoverable, it is of paramount importance that all modes of interaction (mouse, finger as pointer, keyboard, voice-controlled activation¹⁸, switch devices¹⁹) can be used to perform all necessary actions. This means that information cannot be hidden behind a type of interaction that is not universally available, such as the hover action of a mouse pointer. The hover concept does not make any sense for someone using a touch screen, keyboard or switch. Any piece of information or action, such as a tooltip or a hover menu, needs to be available when an equivalent interaction, such as focus, is activated by the user.
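
A minimal sketch of that equivalence for a tooltip (the class names are illustrative): whatever hover reveals, keyboard focus must reveal as well.

```css
.tooltip {
  visibility: hidden;
}

/* Reveal the tooltip on mouse hover and, equivalently, when the
   trigger receives keyboard focus. */
.trigger:hover + .tooltip,
.trigger:focus + .tooltip {
  visibility: visible;
}
```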

Actions such as swiping and dragging, while immediately meaningful to the user with the appropriate motor skills, might prove challenging to users who are not able to perform precise movements with their hands and fingers and do not possess fine-grained eye-hand coordination. The touchpad tapping function, very intuitive for manually proficient users, might generate unwanted clicks instead of just moving the pointer if the finger movement is not precise enough. In order to eliminate the strain on users with limited manual dexterity, there needs to be an alternative interaction mode, where the process is decomposed into many consecutive and isolated steps that do not require simultaneous pressing of several surfaces or buttons and are not limited in time²⁰.
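
One sketch of such a decomposition, for a list-reordering task that would otherwise require drag and drop (the markup is illustrative): each move becomes a single, discrete, time-unlimited button press.

```html
<ul>
  <li>
    Shipping address
    <!-- One reorder per press: no sustained pressure, no precise
         pointer path, no time limit. -->
    <button type="button" aria-label="Move Shipping address up">↑</button>
    <button type="button" aria-label="Move Shipping address down">↓</button>
  </li>
</ul>
```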

Finally, users should be able to locate and interact with elements using only their voice. That means that the visible label of an element should match the accessible name that voice commands act on, so that the experience is uniform across all sensory pathways. This is not always the case, and appropriate labeling practices need to be respected²¹.
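
A minimal sketch of the difference (the labels are illustrative):

```html
<!-- The accessible name ("Search orders") contains the visible
     label ("Search"), so the voice command "click Search" works. -->
<button aria-label="Search orders">Search</button>

<!-- Anti-pattern: the visible text "Go" is absent from the
     accessible name, so "click Go" fails for voice users. -->
<button aria-label="Submit query">Go</button>
```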

Interaction should be as frictionless and effortless as possible. The tools used should not be obstacles to be overcome, but parts of our bodies put into motion by our intentions. Tools should support all the agility that we possess and be as forgiving of disabilities as possible, in order to create a seamless experience where the tools used are not even a concept in the user’s mind.

Conclusion

Accessibility aims to provide the same mental map, the same awareness of objects and their importance, and the same idea of relationships between objects and the actions afforded by them, regardless of what sensory apparatus and neural pathway have been used to create lived experience. While every person’s mental world is different and mediated by different sensations, it allows us to interact with our environment and to use language to refer to the same objects when communicating with other humans. It does not matter that these objects were perceived differently and independently.

References

[1] https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Introduction

[2] https://uxplanet.org/ux-design-glossary-how-to-use-affordances-in-user-interfaces-393c8e9686e4

[3] https://web.dev/the-accessibility-tree/

[4] https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA

[5] https://en.wikipedia.org/wiki/Refreshable_braille_display

[6] https://en.wikipedia.org/wiki/Color_vision#Dimensionality

[7] https://en.wikipedia.org/wiki/Dichromacy

[8] https://www.bbc.com/future/article/20140905-the-women-with-super-human-vision

[9] https://play.google.com/store/apps/details?id=com.areyoucolorblind.nowyousee&hl=en&gl=US

[10] https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/tabindex

[11] https://www.matuzo.at/blog/the-dark-side-of-the-grid-part-2/

[12] https://developer.mozilla.org/en-US/docs/Web/CSS/padding-block-start

[13] https://www.nngroup.com/articles/animation-purpose-ux/

[14] https://web.dev/the-basics-of-easing/

[15] https://developer.mozilla.org/en-US/docs/Web/CSS/@media/prefers-reduced-motion

[16] https://sheribyrnehaber.medium.com/designing-toast-messages-for-accessibility-fb610ac364be

[17] https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA/ARIA_Live_Regions

[18] https://youtu.be/V-QkjOsV2WM

[19] https://www.youtube.com/watch?v=1AfbGQ2DYjg&ab_channel=ChristopherHills

[20] https://medium.com/salesforce-ux/4-major-patterns-for-accessible-drag-and-drop-1d43f64ebf09

[21] https://ericwbailey.website/published/aria-label-is-a-code-smell/
