When You Need More Than a Thousand Words

Natural language to support data visualization.

Alea Oakman
VisUMD
4 min read · Oct 26, 2021


They say that a picture is worth a thousand words, and that is the beauty of data visualization. The audience can take one look at the image and quickly get a high-level understanding of a system without tracking all the data in their own minds. However, sometimes these visuals just aren’t enough to get the whole story across. The article “Once Upon A Time in Visualization: Understanding the Use of Textual Narratives for Causality” explores using textual narratives in conjunction with visualizations to represent causal models.

The authors first map out the area by outlining a potential process for generating and communicating a narrative (the written story) from the data. This pipeline begins by pulling the causal information out of the data set, which requires determining the type of each causal relationship and descriptors for the data. The next step is to feed that information to a text generator, which writes sentence clauses describing the relationships. A computer program then determines the order in which to present the information based on priority, as well as which sentence clauses can be aggregated. After this, the program chooses which channels (such as text, color, and font) to use to display the text and, lastly, how the viewer will interact with the narrative (for example, whether they can click on things or scroll through).
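To make the pipeline a little more concrete, here is a minimal Python sketch of the ordering, aggregation, and channel steps. It is not the authors' implementation: the `CausalLink` class, the function names, the `emphasis` flag, and the rainfall data are all made up for illustration, and the interaction step is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class CausalLink:
    """One hypothetical causal relationship pulled from a data set."""
    cause: str
    kind: str        # relationship type, e.g. "increases" or "decreases"
    effect: str
    priority: float  # assumed score: higher means "tell this first"


def order_links(links):
    """Present higher-priority relationships first."""
    return sorted(links, key=lambda link: link.priority, reverse=True)


def aggregate(links):
    """Write one sentence per (cause, kind) pair, merging their effects."""
    grouped = {}  # dicts keep insertion order, so priority order survives
    for link in links:
        grouped.setdefault((link.cause, link.kind), []).append(link.effect)

    sentences = []
    for (cause, kind), effects in grouped.items():
        if len(effects) > 1:
            joined = ", ".join(effects[:-1]) + " and " + effects[-1]
        else:
            joined = effects[0]
        sentences.append(f"{cause} {kind} {joined}.")
    return sentences


def choose_channels(sentences):
    """Pick a display channel: here, only the first sentence is emphasized."""
    return [
        {"text": sentence, "emphasis": index == 0}
        for index, sentence in enumerate(sentences)
    ]


def narrate(links):
    """Run the sketched steps in order: sort, aggregate, assign channels."""
    return choose_channels(aggregate(order_links(links)))


# Made-up example data, not taken from the paper.
example_links = [
    CausalLink("Drought", "decreases", "crop yield", priority=0.4),
    CausalLink("Rainfall", "increases", "crop yield", priority=0.9),
    CausalLink("Rainfall", "increases", "river level", priority=0.6),
]

for line in narrate(example_links):
    marker = "*" if line["emphasis"] else " "
    print(marker, line["text"])
```

In this sketch, the two rainfall relationships are merged into a single sentence and placed first because they carry the highest priority. A real system would need a far richer text generator and a way to decide priority from the data itself.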

Generating text directly from the data reduces the opportunities for the story to be biased by the analyst’s interpretation. Since it is derived entirely by the computer, it is less likely to draw conclusions based on what will be “popular” or well received. Setting up this design space for text generation provides a foundation for the study that follows.

Next, the authors conducted a study to understand how narratives change interpretation of causal data displayed visually. The participants for the study were given one of two different types of data visualizations showing either an instantaneous point or a summary of all the data. Half the participants were given a descriptive story about the data visualization, and the other half weren’t.

Example of what the different graphs used in the user testing looked like.

Those in the group with the text were more accurate in their descriptions and felt more confident about the statements they made. As the authors of the study had predicted, the narratives complemented the visualizations nicely and aided the viewers’ understanding. However, although the participants who read descriptions were more accurate in their analysis of the visualizations, reading the text slowed them down. Some of the visualizations were more complicated, and these benefitted even more from the narrative.

This is a graph showing the task correctness of the users in the different groups. The red group is the average, the black groups just looked at visualizations, and the blue groups were given visualizations and textual narratives.

Finally, the paper discusses a program called CauseWorks. This data analysis tool combines many different analysis techniques, including natural language generation (narratives generated by the computer). The authors presented the tool to experts in the field to get their perspective on it. The experts appreciated how the narratives filled in the specific numerical data that can sometimes get lost in data visualizations. They also gave some constructive criticism, pointing out that CauseWorks would be better if it were able to generate a summary of the whole model, present information through ordered lists, and allow for interaction with the models.

Here’s a video from the authors about CauseWorks.

Choudhry et al. found that textual narratives help users understand complex data visualizations about causality. Most of us are probably not creating or interpreting intricately woven data sets to figure out what caused what, but the findings in this study are still valuable. Perhaps it shouldn’t be a surprise that appealing to both the visual and linguistic centers of the brain makes information easier to understand. After all, we use picture books to tell stories to young children and even babies. As we create important presentations for work, we should keep this in mind. While data can “speak for itself,” a little more guidance about how to interpret a visual can go a long way, especially when communicating complicated information about cause and effect.

TL;DR: Pictures are great, but pictures with words are even better at communicating causation.

References

  • Choudhry, Arjun, et al. “Once upon a time in visualization: Understanding the use of textual narratives for causality.” IEEE Transactions on Visualization and Computer Graphics 27.2 (2020): 1332–1342.
