Visualizing for the Non-Visual

Enabling the visually impaired to access charts on the web using deep learning.

Accessing data on the web is often challenging for visually impaired users (photo by David Travis on Unsplash).

The fact that visualization leverages the human visual system to convey data is not only inherent in the name “visualization” itself, it is also endemic to the discipline. A huge part of the visualization field concerns itself with visual design guidelines, graphical perception studies, and color theory. Furthermore, much of human thinking is couched in visual terms, in that to understand something is called “seeing it,” in that we want to bring our thoughts into “focus,” and in that we strive to make our ideas “clear.” But what if all you have are words and no pictures; that is, what if you are visually impaired? Is the power of visualization forever closed to you, or worse, are you actively barred from accessing important data about our world?

Our deep learning pipeline for extracting data from raster-based charts so that the data can be displayed using a screen reader, revisualized using a new (or interactive) representation, or indexed for search engines.

This problem is particularly acute on the web. The web has had a revolutionary impact on information access for the visually impaired, who use so-called screen readers to transform a visual display into text, sound, or Braille. However, besides textual content, the web also holds hundreds of thousands of charts stored as images. While a photograph can be manually or automatically labeled, screen readers do not work well for these kinds of data-rich images. Since images are just collections of colored pixels, encoding data in maps, line graphs, and barcharts essentially means locking away the data for all but sighted users. Furthermore, most websites do not provide the raw data behind these charts. While it is true that accessibility standards are on the rise, there is a vast collection of legacy charts on the web for which no such data will ever be made available.

We tackle this problem in our recent work, “Visualizing for the Non-Visual: Enabling the Visually Impaired to Use Visualization”, where the idea is to have the user’s web browser automatically detect charts encoded as images. For any such image, the browser will send the image to a server which interprets the chart encoded in the image and returns the raw data. The browser then replaces the chart image in the webpage with a table, which can be directly navigated using a screen reader, indexed by a search engine, or rendered anew with an interactive visualization. The tool is currently available as a prototype extension to Google Chrome (see picture below).
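To make the division of labor concrete, here is a minimal sketch of what the server side of such a pipeline could look like. The web framework (Flask), the endpoint name, and the helper functions are illustrative assumptions, not the actual implementation behind our tool; the browser extension would send the chart image to an endpoint like this and receive the extracted data back as JSON.

```python
# Hypothetical server sketch: receive a chart image from the browser
# extension, run the extraction pipeline, and return the raw data.
from flask import Flask, request, jsonify
from PIL import Image

app = Flask(__name__)

@app.route("/extract", methods=["POST"])
def extract():
    # The extension POSTs the chart image it detected on the page.
    image = Image.open(request.files["chart"].stream).convert("RGB")

    chart_type = classify_chart_type(image)   # e.g. "bar", "pie", "line"
    table = extract_data(image, chart_type)   # list of {label, value} rows

    # The extension replaces the <img> element with an HTML table built
    # from this response, so a screen reader can navigate it directly.
    return jsonify({"type": chart_type, "data": table})

def classify_chart_type(image):
    ...  # a trained chart-type classifier would go here

def extract_data(image, chart_type):
    ...  # type-specific extraction of labels, axes, and marks would go here

if __name__ == "__main__":
    app.run(port=8000)
```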

Our Google Chrome extension, which has automatically detected a line graph stored as an image (top), retrieved the resulting raw data table from the server (middle), and then visualized the data anew in an interactive chart.

Our work is motivated by in-depth interviews with three visually impaired users, who all accessed the web using screen readers. They all expressed frustration over the current generation of screen readers, which are unable to explain the contents of images, and all three had adopted a practice of rarely paying attention to images. Instead they relied mostly on textual content. This was particularly frustrating when they had to engage with statistical and data-intensive material. Some had tried refreshable Braille displays or sonification, but were not satisfied with either.

Example deep learning pipeline for a barchart: input, classification, extracting labels, identifying objects, and extracting the actual raw data.

Deep learning approaches such as ours use so-called neural networks that are trained to generate specific outputs given specific inputs. Our tool was trained using both a labeled chart dataset (FigureQA) and our own dataset of images collected from the web. Once trained, our model first detects the chart type and then chooses different methods depending on whether the chart is a barchart, piechart, or line graph. The figure above shows the step-by-step process for a barchart.
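As an illustration of that two-stage structure (and not our actual model), the sketch below shows a small PyTorch classifier that predicts the chart type, followed by a dispatch to a type-specific extractor. All class and function names are hypothetical.

```python
# Illustrative sketch: classify the chart type, then hand the image to the
# extractor that matches that type (barchart, piechart, or line graph).
import torch
import torch.nn as nn

CHART_TYPES = ["bar", "pie", "line"]

class ChartTypeClassifier(nn.Module):
    def __init__(self, num_classes=len(CHART_TYPES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):            # x: (batch, 3, H, W) chart images
        h = self.features(x).flatten(1)
        return self.classifier(h)    # unnormalized class scores

def extract(image_tensor, model, extractors):
    """Classify the chart type, then run the matching type-specific extractor."""
    with torch.no_grad():
        scores = model(image_tensor.unsqueeze(0))
        chart_type = CHART_TYPES[scores.argmax(dim=1).item()]
    return extractors[chart_type](image_tensor)
```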

To check the accuracy of our model, we compared its output to a held-out portion of the dataset. Our results were consistently better than those of existing tools, yielding an extraction rate of approximately 88%, compared to around 50–75% for competitors. We also let our group of visually impaired users engage with the new tool. They were excited about the idea and hoped to see functionality such as ours turned into plugins for screen readers. Participants also stated that our tool would open up entirely new applications for them, particularly for online education.
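Purely to illustrate what such an extraction rate could measure (the exact metric in the paper may differ), the sketch below counts the fraction of ground-truth data values recovered within a small relative tolerance on a held-out set. The tolerance and function name are assumptions.

```python
# Hypothetical extraction-rate metric: fraction of ground-truth values that
# the pipeline recovers within a relative tolerance, per held-out chart.
def extraction_rate(predicted, ground_truth, tol=0.05):
    """predicted / ground_truth: lists of per-chart value lists."""
    hits, total = 0, 0
    for pred, truth in zip(predicted, ground_truth):
        for p, t in zip(pred, truth):
            total += 1
            if abs(p - t) <= tol * max(abs(t), 1e-9):
                hits += 1
        total += abs(len(truth) - len(pred))  # missed or spurious values count against
    return hits / total if total else 0.0

# Example: three of four values recovered within tolerance -> 0.75
print(extraction_rate([[10, 20, 33, 5]], [[10, 20, 30, 5]]))
```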

In summary, despite its reliance on a high-performing visual system, visualization has paid scant attention to visually impaired users: with the exception of color vision deficiencies, visual impairment is truly the elephant in the room. Our work is by no means the final word on this topic, but it is a beginning.


Niklas Elmqvist
Sparks of Innovation: Stories from the HCIL

Professor in visualization and human-computer interaction at Aarhus University in Aarhus, Denmark.