How can we tell when a responsive visualization retains the message of a source view?

--

This post summarizes the paper “An Automated Approach to Reasoning About Task-Oriented Insights in Responsive Visualization,” by Hyeok Kim, Ryan Rossi, Abhraneel Sarma, Dominik Moritz, and Jessica Hullman, to be presented at VIS 2021 (preprint).

TL;DR

Responsive visualization design is hard because authors have to manually create different possible responsive designs and evaluate how well each preserves the insights or takeaways of the original visualization. To enable automated recommendation of responsive designs that best preserve the ‘message’ of a source visualization, we devised a set of task-oriented insight preservation measures to approximate changes in a viewer’s ability to identify data points, compare pairs of data points, and recognize implied trends in a visualization.

Responsive Visualization

An increasing number of viewers access data visualizations on mobile devices. Because viewers on the Web reach the same content (i.e., the same URLs) from devices with different screen sizes (e.g., desktops, tablets, smartphones), responsive design is necessary to keep web-based visualizations readable. Responsive web design refers to designing content for different types of devices with varying screen sizes (e.g., a large screen with a keyboard and mouse versus a small touch screen). Fortunately, designers do not have to handle every possible screen size: instead, they define a set of breakpoints, and the design changes at each breakpoint. A common set of breakpoints includes desktop (large), tablet (medium), and smartphone (small). By analogy with responsive web design, responsive visualization design means creating multiple versions of a visualization for different screen-size breakpoints. However, few tools currently exist to make this process easier for visualization authors.

There is one desktop version of a scatterplot and three mobile version alternatives.

Mobile version alternatives given a source desktop version. Which of the mobile views do you think best preserves the message of the desktop view, and why?

Density-Message Trade-offs in Responsive Visualization

While there are useful responsive design techniques for some forms of web content (e.g., modern UI libraries like Bootstrap and Material UI), they are of limited use for addressing visualization-specific challenges. For example, dynamically resizing a chart may produce an unreadable chart on smaller screens because visualization elements (e.g., marks, lines, text) cannot be perceived below a certain size, yet maintaining a minimum mark size can result in overplotting on a smaller display. CSS rules also may not work to control some visualization elements associated with data points due to underdetermined CSS selectors.

This means that while modifying a visualization design for different screen types, visualization authors need to think carefully about how the takeaways or insights of their visualization may change. Authors don’t want to make their work too dense or too sparse by simply shrinking or enlarging it. At the same time, they want to minimize changes to the messages, insights, or takeaways of their visualization, so that its intended point or role in an article is consistent across screen sizes. As shown in Figure 1, for example, proportionately reducing the chart size of a desktop version to fit a mobile screen can make it difficult to recognize the data marks. To make better use of screen space and ensure data marks are all identifiable, the authors of the original article transformed it into a different chart type, as shown in Figure 2. Our prior work characterized this challenge as the “density-message trade-off” in responsive visualization, where authors want to strike a balance between adjusting graphical density and preserving aspects of a visualization’s “message.”

There are two one-dimensional dot plots where dots are horizontally positioned using the same data. The first dot plot is a desktop version, and the second is a mobile version. The desktop version has a width of 800 pixels and a height of 170 pixels. The mobile version has a width of 345 pixels and a height of 75 pixels, proportionately rescaled from the desktop version. This example demonstrates that simple proportionate rescaling can result in unrecognizable marks.
Figure 1. Simple rescaling of a desktop version of a dot plot for a mobile screen can result in overplotting. The image is created with mock data. Text elements and references are dropped for demonstration purposes. Original article.
There is one bar graph, where bars are horizontally oriented and vertically positioned.
Figure 2. The original mobile version for the desktop version in Figure 1. The image is created with mock data. Text elements and references are dropped for demonstration purposes. Original article.

An Automated Approach to Responsive Visualization Recommendation

One way to help authors navigate the density-message trade-off is through intelligent authoring tools for responsive visualization, such as design recommenders. A common practice in designing responsive visualizations is to (semi-)finalize a visualization and then create other responsive versions of it by applying transformations to the original. Authors often start by creating a desktop view because they work on desktop devices. Based on this practice, a pipeline for responsive visualization recommendation might look like the one shown in Figure 3. Our proposed recommender takes an input visualization (the source view) and first generates a search space of small-screen alternatives. When generating this search space, a recommender should consider strategies that adjust graphical density (e.g., fewer marks for small-screen views) and maintain well-formedness (e.g., preventing inexpressive combinations of strategies). The recommender then evaluates how well the alternatives preserve important information or “insights” captured in the original, and produces a ranked set of small-screen views (target views).
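
To make this pipeline concrete, here is a minimal sketch of its control flow in Python. Everything here is illustrative: the function names and parameters are stand-ins, not the paper's implementation, and the generation, well-formedness, and scoring steps are passed in as black boxes.

```python
from typing import Any, Callable, Iterable, List

def recommend_targets(
    source: Any,
    generate: Callable[[Any], Iterable[Any]],    # enumerate small-screen candidates
    well_formed: Callable[[Any], bool],          # filter inexpressive strategy combos
    combined_loss: Callable[[Any, Any], float],  # aggregate insight-loss score
    k: int = 5,
) -> List[Any]:
    """Generate, prune, score, and rank target views for a source view."""
    candidates = [t for t in generate(source) if well_formed(t)]
    # Lower combined loss means the target better preserves the source's message.
    return sorted(candidates, key=lambda t: combined_loss(source, t))[:k]
```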

This figure summarizes our proposed pipeline for a responsive visualization recommender. The figure is also described in the text.
Figure 3. A pipeline for a responsive visualization recommender.

Insight Preservation Measures

In the above pipeline, our work focuses on insight preservation measures (or insight loss measures) that approximate changes in support for task-oriented visualization insights between two views. At a high level, we define task-oriented insights as insights or information that viewers can gain by performing basic visualization tasks. Motivated by visual analytic task taxonomies (Brehmer and Munzner, 2013; Amar et al., 2005), our work considers identification (of each data point), comparison (between pairs of data points), and trend recognition (across two or more variables) as distinct tasks.

Prior work on insight-based visualization recommendation has estimated visualization insights by computing statistics on data values (e.g., mean, K-means clustering, correlation coefficient). However, responsive transformations do not necessarily change the underlying dataset (or the subset of data shown), so those measures mostly remain the same under such transformations. To calculate task-oriented insight preservation between a source visualization and a target, we instead need to work with the “rendered values,” which are defined in the space implied by the visual variable (e.g., pixel space for position or size, color space for color).
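
As a toy illustration of what “rendered values” means for a position channel, the sketch below maps data values to pixel coordinates with a linear scale. The scale function and chart widths are hypothetical, but they show why the same data yields different rendered values on different screens.

```python
import numpy as np

def rendered_positions(values, domain, range_px):
    """Linearly map data values (in `domain`) to pixel positions (in `range_px`),
    i.e., the values a position encoding actually renders on screen."""
    lo, hi = domain
    r0, r1 = range_px
    return r0 + (np.asarray(values, dtype=float) - lo) / (hi - lo) * (r1 - r0)

data = [2.0, 3.5, 7.0, 9.0]
desktop_px = rendered_positions(data, domain=(0, 10), range_px=(0, 800))  # wide chart
mobile_px = rendered_positions(data, domain=(0, 10), range_px=(0, 300))   # narrow chart
# Same data, different rendered values: the loss measures operate on these.
```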

One key form of information gained from a visualization is simply recognizing the data values. Motivated by transformations like aggregating and binning (Figure 4), we use identification loss to refer to changes in the identifiability of rendered values between a source view and a target. Treating a visualization as a signal and encoded attributes as bits, we approximate identification loss using Shannon entropy.
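
Here is a minimal sketch of this idea for a single position channel, assuming we histogram rendered pixel positions and take the drop in Shannon entropy as the loss. The bin count and the restriction to one channel are our simplifications, not the paper's exact formulation.

```python
import numpy as np

def entropy_bits(rendered_px, n_bins=20):
    """Shannon entropy (in bits) of rendered positions, histogrammed in pixel space."""
    counts, _ = np.histogram(rendered_px, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def identification_loss(source_px, target_px, n_bins=20):
    """How much entropy (identifiable signal) the target loses relative to the source.
    Aggregation and binning collapse distinct rendered values, lowering entropy."""
    return entropy_bits(source_px, n_bins) - entropy_bits(target_px, n_bins)
```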

There are three pairs of a large screen view and a small screen view, with the specific responsive transformation noted. The first case shows a transformation that increases bin size, decreasing the number of marks from 20 to 7. The second case illustrates data aggregation, which reduces the number of marks from 23 to 7. The third and last case shows filtering data, which reduces the number of data points from 19 to 13.
Figure 4. Responsive transformations that may cause identification loss.

Responsive strategies like resizing and rescaling (Figure 5) might not change which values a viewer can recognize, but they can change the visual comparisons a viewer tries to make between points. We define comparison loss as changes in discriminability between pairs of points in a target view compared to its source view. We estimate it by measuring the difference between the source and target distributions of pairwise distances. The distances are measured with metrics specific to the encoding channel: for position, simply pixel space; for color, CIELAB 2002; for size, pixel space adjusted by a Stevens’ exponent of 0.7; and for shape, a perceptual kernel.
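
The sketch below illustrates the position-channel case, comparing the two distributions of pairwise pixel distances with the Wasserstein (earth mover's) distance. That choice of distribution distance, like the restriction to position, is an illustrative simplification rather than the paper's exact formulation.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def pairwise_px_distances(rendered_px):
    """Absolute pairwise distances between rendered positions, in pixel space."""
    v = np.asarray(rendered_px, dtype=float)
    i, j = np.triu_indices(len(v), k=1)
    return np.abs(v[i] - v[j])

def comparison_loss(source_px, target_px):
    """Difference between the source and target distributions of pairwise distances.
    (For size, the paper adjusts pixel distances by a Stevens' exponent of 0.7;
    color and shape use CIELAB and perceptual-kernel distances instead.)"""
    return wasserstein_distance(
        pairwise_px_distances(source_px),
        pairwise_px_distances(target_px),
    )
```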

There are three cases of responsive transformations from large to small screens. In the first case, resizing reduces the difference between two bar heights from 50 pixels to 15 pixels, changing the magnitude of the difference. In the second case, aggregation makes two bars no longer comparable on the small screen, changing the number of possible comparisons. In the last case, a color scale change reduces a CIELAB distance of 112 to 103, changing the magnitude of the difference.
Figure 5. Responsive transformations that may cause comparison loss.

Responsive transformations like disproportionate rescaling and changes to the number of bins in an aggregated visualization (Figure 6) affect another common visualization task: estimating a trend in the relationship between multiple variables. We use trend loss to refer to changes in the implied relationship or trend between two or more variables from the source to a target. We approximate trend loss by computing and comparing LOESS-estimated trends on the rendered values of the source and target views.
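
A rough sketch of this computation, assuming statsmodels' LOWESS smoother: fit a trend to each view's rendered (x, y) values, evaluate both fits on a shared normalized grid, and summarize the gap as a mean absolute difference. The normalization and the mean-absolute summary are our illustrative choices, not necessarily the paper's.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def trend_loss(src_x, src_y, tgt_x, tgt_y, n_samples=50, frac=0.5):
    """Compare LOESS-estimated trends of two views on a common [0, 1] grid."""
    grid = np.linspace(0, 1, n_samples)

    def fitted_trend(x, y):
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        # Normalize rendered values so views of different pixel sizes are comparable.
        xn = (x - x.min()) / (x.max() - x.min())
        yn = (y - y.min()) / (y.max() - y.min())
        return lowess(yn, xn, frac=frac, xvals=grid)

    return np.mean(np.abs(fitted_trend(src_x, src_y) - fitted_trend(tgt_x, tgt_y)))
```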

There are a large screen view and six responsive transformation cases: three versions of aspect ratio change, two versions of changing bin buckets, and a final small screen view that combines aspect ratio change and bin bucket change.
Figure 6. Responsive transformations motivating trend loss.

You can find complete definitions of these measures in our paper.

Prototype Responsive Visualization Recommender

We tested our loss measures by developing a prototype recommender system for responsive visualization. Our recommender generates a set of well-formed target views, and then evaluates and ranks those target views using our loss measures. To combine the three loss measures into a ranking of target views given a source visualization, we used a machine learning (ML)-based ranking model that predicts pairwise rankings from a set of features. We use our loss measures as features in a variety of ML model types, including SVMs, logistic regression, decision trees, multilayer perceptrons, and random forests. To collect ground truth rankings against which to compare the recommender’s predictions, we had nine visualization experts rank sets of responsive alternatives for six source visualizations, ordering them by how well they retained the important messages or insights of the source visualization.
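
As a sketch of the pairwise-ranking setup, the snippet below trains a random forest to predict which of two target views ranks higher from the difference of their loss-measure features. The data here is synthetic and the "true" scoring weights are made up; the paper trains on expert-provided rankings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_views = 200
X = rng.random((n_views, 3))          # (identification, comparison, trend) loss per view
true_badness = X @ [0.4, 0.25, 0.35]  # hypothetical ground-truth score (lower = better)

# Pairwise examples: label 1 if view a should rank above (be better than) view b.
a, b = rng.integers(0, n_views, size=(2, 1000))
features = X[a] - X[b]                # difference of loss features
labels = (true_badness[a] < true_badness[b]).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(features, labels)
# model.predict(x_a - x_b) == 1 predicts that view a ranks above view b.
```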

Our ranking models achieved up to 84% accuracy in predicting the pairwise rankings. For perspective, we compared our models with two simple heuristic-based baselines, one based on changes to chart size and one on transposing axes; the models using our loss measures all outperformed those baselines (< 65%).

This is a bar chart with two groups, where bar height means accuracy. The first group has four bars for models with our loss measures: a random forest with 100 estimators, 84.07%; a random forest with 50 estimators, 82.38%; a gradient boosting model, 82.94%; and a logistic regression model, 76.38%. The second group has two bars for linear support vector machines with baseline features: chart size change, 59.04%; and axes transpose, 63.17%.
Figure 7. Selected results for the prediction accuracy of our models (unit: %). “e” stands for the number of estimators in a random forest model. The full results are shown in our paper.

What’s Next?

Compared to black-box models that one might train to predict rankings of responsive views given a source visualization, the measures we devised are easy to reason about and interpret. For example, Figure 8 shows a prediction made by our best performing model (a random forest with 100 estimators), given a simple scatterplot as a source view. Rank 1 appears higher in the ranking than Rank 2 because Rank 2 has higher trend loss, while Rank 1 only slightly sacrifices comparison loss. Rank 4 is ranked above Rank 5 because Rank 5 has higher comparison and trend loss. Because the measures are interpretable, we think a useful next step is to test how authors use them in mixed-initiative authoring tools, where, given a source visualization, authors can tune the rankings to support different communication goals.

This figure is composed of an upper and a lower area. The upper area shows the source view, a simple scatterplot. The lower area shows five target views with their loss values, listed as ranking, changes, identification loss, comparison loss, and trend loss: Rank 1, resizing: 0, 0.73, 1.33. Rank 2, axes transpose: 0, 0, 2.39. Rank 3, resizing, bin, aggregation: 4.42, 0.63, 1.52. Rank 4, resizing, bin, aggregation: 5.21, 0.97, 0.69. Rank 5, resizing, bin, aggregation, mark type change: 4.42, 0.99, 1.28.
Figure 8. Prediction example case made by our best performing model (random forest with 100 estimators).

Naturally, our measures are not an exhaustive set, and future work might extend them with loss measures for different insight types or with different formulations. One might also refine the loss measures using perceptual experiments to capture hard-to-predict perceptual biases.
