HindSight: Encouraging Exploration and Engagement in Data Visualization

This post is based on our paper presented at IEEE VIS 2016, “HindSight: Encouraging Exploration through Direct Encoding of Personal Interaction History,” by Mi Feng (WPI), Cheng Deng (WPI), Evan M. Peck (Bucknell University), and Lane Harrison (WPI). To see our data, the paper, and interactive examples, please visit http://wpivis.github.io/hindsight/

If you love data visualization as much as we do, you probably believe that it can be a powerful tool for reasoning and understanding. Given the recent success of data journalism, you may also believe that data visualization makes complex topics more accessible to everyone, converting indecipherable sheets of numbers into the visual language of our brain. It seems that every day, a new visualization paints a compelling portrait of the world in 2016.

But hiding under this optimism, a depressing trend is emerging in the research surrounding data exploration and understanding: People aren’t exploring data. The New York Times is publishing fewer interactive visualizations because they’ve found that “users just want to scroll”, and a research study by Jeremy Boy showed that storytelling did not increase engagement or exploration in data visualizations.

Why aren’t people exploring data and discovering their own insights? Are people really understanding complex topics if they only attend to guided storylines? While it’s tempting to blame browsing habits or attention spans, maybe we haven’t equipped everyday people with the right tools for exploring complex visualizations. How can we help people better navigate visualizations on the web?


Inspiration from the Purple Links

Fortunately, data visualization isn’t the only online domain in which people explore information-rich environments. Consider the following example: when sifting through hundreds of apartment listings on Craigslist, how do people know which links they have yet to visit?

Data visualizations aren’t the only information-rich environments where we interact with content. Links turn purple on websites like Craigslist when people click on them. Unclicked links remain blue.

It’s pretty obvious, right? The visited links are purple. The apartments we haven’t seen yet are blue. It’s something we take for granted, but this simple design fills an important role — it not only shows us where we’ve been, but more importantly, it highlights where we have yet to explore.

The key here is that our interaction history is represented directly on the data itself (direct encoding) instead of somewhere on the side (indirect encoding). Direct encoding lets us leverage our powerful perceptual system to understand not only relationships in the data but also our own interaction with it. The purple links also let us tackle a potentially overwhelming set of information iteratively, without having to recall all the places we’ve already been. The design is straightforward and simple to understand. We rarely have to think about which links are purple. The encoding practically becomes invisible to us.

If these relatively simple designs benefit people in other information-rich environments, could they also work in data visualization? We think so. From here on out, we’ll use HindSight as a term to describe the design space of representing interaction history directly in existing data visualizations.

What does HindSight look like?

To explore HindSight’s core principle of encoding visualization interaction directly onto the data, we start with a well-known design: How the Recession Reshaped the Economy, in 255 Charts (The New York Times). It’s a brilliant visualization by Jeremy Ashkenas and Alicia Parlapiano that communicates how the number of jobs in 255 industries changed from 2004 to 2014. But it is also dense with information. Visit the link and try it out. What insights did you come away with? Was it easy to explore?

Instead of turning links purple to help people find their way, we created a modified version of 255 charts (try it yourself!) in which we slightly changed the darkness and width of visited industries. See HindSight in action in the gif below:

Although the overall change in saliency is small, the visited lines pop out to the eye. Similar to the Craigslist example, it is easy to see what data has already been explored and what data is yet to be explored.
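A minimal sketch of this kind of encoding, assuming a set of visited industry ids and two line styles. The names `visitedIndustries`, `markVisited`, and `styleFor`, as well as the specific colors and widths, are illustrative assumptions and are not taken from the modified 255 Charts code:

```typescript
// Hypothetical style values: the post only says darkness and width change "slightly".
interface LineStyle {
  stroke: string;
  strokeWidth: number;
}

const UNVISITED_STYLE: LineStyle = { stroke: "#999999", strokeWidth: 1 };
const VISITED_STYLE: LineStyle = { stroke: "#555555", strokeWidth: 2 };

// Interaction history: ids of the industries the reader has already hovered over.
const visitedIndustries = new Set<string>();

// Record that the reader has interacted with an industry line.
function markVisited(industryId: string): void {
  visitedIndustries.add(industryId);
}

// Direct encoding: the style of a line depends on whether it has been visited.
function styleFor(industryId: string): LineStyle {
  return visitedIndustries.has(industryId) ? VISITED_STYLE : UNVISITED_STYLE;
}
```

Because the history lives in one small set, the rest of the chart code only needs to call `styleFor` when it draws or redraws a line.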

A second example applies HindSight to The Rise and Decline of Ask MetaFilter. This small multiples demo by Jim Vallandingham presents users with area charts representing the frequency of topics over time at Ask MetaFilter. Users can hover over locations in each chart and also reorder them according to Name or Count. In our HindSight-modified version of the visualization (again, try it yourself!), we change the opacity of each area graph as users hover over it. See our gif of the interaction below:

Not only is it easy to see previously visited charts, but HindSight encodings preserve context when data is reordered. The change in opacity maintains our understanding of what has been visited even as spatial reference points are destroyed by the shifting locations of each chart.
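One way to see why the encoding survives reordering: if the history is keyed by the chart's identity (here, its topic name) rather than its position in the grid, sorting does not disturb it. The sketch below assumes that visited charts become more opaque; the post does not state the direction of the change, and the opacity values, the `TopicChart` type, and the function names are all hypothetical:

```typescript
interface TopicChart {
  name: string;
  count: number;
}

// History is keyed by topic name, not by the chart's position in the grid.
const hoveredTopics = new Set<string>();

// Assumed encoding: visited charts are drawn fully opaque, others are faded.
function opacityFor(chart: TopicChart): number {
  return hoveredTopics.has(chart.name) ? 1.0 : 0.5;
}

// Return a reordered copy of the charts, sorted by name (A to Z) or count (largest first).
function reorder(charts: TopicChart[], by: "name" | "count"): TopicChart[] {
  const sorted = charts.slice();
  sorted.sort((a, b) =>
    by === "name" ? a.name.localeCompare(b.name) : b.count - a.count
  );
  return sorted;
}
```

After a call to `reorder`, each chart lands in a new position, but `opacityFor` still returns the same value for it because the lookup never depended on position.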

Our hope is that HindSight not only makes it easier to see previously visited information, but actively encourages engagement by exposing what is yet to be explored.

Does HindSight actually help people?

For each of the previous examples, we used Amazon Mechanical Turk to enlist groups of people to interact with a version of the visualization with HindSight or a version without HindSight (the control condition). We recorded their behavior, and also asked participants to tell us 3–5 of their findings after seeing the visualization. For more details, see our paper.

Looking at behavior during interaction, we found that people generally visited more data using HindSight than without it. But more importantly, HindSight appeared to change which data participants visited. Below, we show heat maps that compare the results of the two conditions. The thumbnails represent the number of times each graph would be visited by 100 hypothetical visitors, using normalized data from the experimental conditions. The larger charts contrast the two conditions, showing how many more people would visit each graph when HindSight was applied vs. the original visualization.

The results on 255 Charts (on the right) are a prime example of how HindSight can encourage interesting shifts in behavior. Without HindSight, exploration remained largely on the periphery of the chart, investigating the industries that were outliers of income or growth. With HindSight, people were more likely to dive into the dense center of the graph, perhaps attaining a more global perspective of the data.
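The scaling to 100 hypothetical visitors described for the heat maps can be sketched roughly as follows. This is an assumed reading of the normalization (visit counts divided by the number of participants in a condition, times 100); the exact procedure is in the paper, and the type and function names here are hypothetical:

```typescript
// chart id -> number of recorded visits to that chart in one condition
type VisitCounts = Record<string, number>;

// Scale raw counts to "visits per 100 hypothetical visitors".
function visitsPer100(counts: VisitCounts, participants: number): VisitCounts {
  if (participants <= 0) {
    throw new Error("participants must be positive");
  }
  const scaled: VisitCounts = {};
  for (const id of Object.keys(counts)) {
    scaled[id] = ((counts[id] ?? 0) * 100) / participants;
  }
  return scaled;
}

// Per-chart difference between conditions; positive means more visits with HindSight.
function hindsightMinusControl(hindsight: VisitCounts, control: VisitCounts): VisitCounts {
  const diff: VisitCounts = {};
  const ids = Object.keys(hindsight).concat(Object.keys(control));
  for (const id of ids) {
    diff[id] = (hindsight[id] ?? 0) - (control[id] ?? 0);
  }
  return diff;
}
```

Scaling each condition separately before subtracting keeps the comparison fair when the two groups have different numbers of participants.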

We also analyzed the number of times each individual graph was mentioned in participants’ insights about the visualization. Below, we use a similar heat map to show our results:

Again, HindSight impacted which data people reported in their insights. In Ask MetaFilter, people were more likely to explore and retain information about the bottom of the chart with HindSight, a region that may not normally get attention due to top-to-bottom reading processes. Similarly, in 255 Charts, people were more likely to remember and identify industries in the rarely visited center of the chart. As one person commented:

“…it was relatively easy to find the chart that I wanted to see again because it had been changed to a bolder and darker line which is a great feature seeing as how there are a whole bunch of lines mixed up together”

Finally: Building Tools for Everyday People

There is a lot to be learned about the effect of HindSight (When is HindSight ineffective?* Are these nudges always good?), but it’s clear from our study that representing interaction history directly on a visualization impacts how people explore data. Visualizations that use HindSight often nudged users to explore different data during interaction and report more diverse findings after interaction. While interaction history is far from a new topic in data visualization, HindSight shifts the traditional notion of history from “How did I get here?” to “Where have I been before?” and “What is left to explore?” Perhaps these simplified definitions are more effective at capturing the engagement of more diverse audiences.

To keep this post (relatively) short, we’ve omitted a lot of our paper — design space definitions, design recommendations, use-cases, and a third experimental visualization. But there is at least one more point worth mentioning here. When researchers invent new interaction mechanisms, we rarely talk about the cost of implementation. It should come as no surprise that designs with a high technical cost are less likely to be created, no matter how elegant they might be. Applying HindSight to existing interactive visualizations (especially those built with d3) is often very simple, requiring only a few lines of code to trigger a visual change based on mouse events. We’re excited by the prospect of HindSight being prototyped quickly and easily today, with almost no cost to the designer.
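To give a sense of how small the change can be, here is a minimal sketch of wiring a visited marker to mouse events. It is written against a tiny interface instead of d3 so that it stands alone; `HoverTarget`, `applyHindSight`, and the `data-visited` attribute are hypothetical names, not part of HindSight's released code:

```typescript
// The two element methods this sketch relies on. Browser DOM elements
// provide methods with these names, so real elements can be passed in.
interface HoverTarget {
  addEventListener(type: string, listener: () => void): void;
  setAttribute(name: string, value: string): void;
}

// Mark each mark (line, area, bar, ...) as visited the first time it is hovered.
function applyHindSight(targets: HoverTarget[]): void {
  for (const target of targets) {
    target.addEventListener("mouseover", () => {
      // A stylesheet rule such as `[data-visited] { stroke-width: 2px; }`
      // can then turn the marker into the visual change.
      target.setAttribute("data-visited", "true");
    });
  }
}
```

In a d3-based chart, the same idea would typically be attached through the selection's existing `mouseover` handler, leaving the rest of the visualization untouched.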

To return to where we began: everyday people increasingly encounter visualizations on an everyday basis. While we hope that the spread of data visualization broadly facilitates reasoning about the world around us, there are still barriers that impede understanding. Research should focus on developing a design space of low-barrier interaction strategies that benefit people without expertise or training. HindSight is just one of many possible contributions in this space.

To support further research in this space, we are releasing all experiment materials, data, and analysis scripts on GitHub.

* Note: We also tested a third visualization that exhibited different behavioral patterns from those written about here. We omit it for the sake of brevity, but discuss it in depth in the paper. We encourage you to read it for a more nuanced perspective on when HindSight might not create noticeable nudges.