Something is Rank in the State of Data

Niklas Elmqvist

Published in

Sparks of Innovation: Stories from the HCIL

4 min read · Apr 20, 2019

Determining the best way to visualize ranked lists.

Exports and imports of Scotland from Christmas 1780 to Christmas 1781. From “The Commercial and Political Atlas”, 1786 (3rd edition, 1801) by William Playfair.

In 1786, William Playfair (1759–1823) invented the bar chart, which conveys values using the length of a rectangle. He did this to help members of the British parliament, many of them illiterate, understand complex data without the need for actual numbers. The bar chart has since become one of the most widespread and familiar types of statistical data graphics, and is a staple of many infographics. Bar charts are commonly used to visualize many items side by side, such as the gross domestic product of countries, the unemployment rate in U.S. states, or the enrollment in different academic units at a university. Such lists are often sorted, and we thus refer to them as “ranked lists” and their visualization as “ranked-list visualization.”

Horizontal bar chart showing a sorted list of 150 countries (scroll bars are used to see the full visualization).

However, while bar charts remain the dominant form of ranked-list visualization, they have an important drawback. As the picture above suggests, showing a long ranked list typically requires scrolling, as the whole list won’t easily fit on screen at the same time. For this reason, visualization experts have in recent years proposed several alternatives to bar charts.

Treemap showing a list of 150 countries, where the surface area of each rectangle corresponds to its value. Blue values are positive, red are negative.

For example, treemaps were originally designed for hierarchies, but are often used for ranked lists. In fact, their popularity is somewhat surprising, since judging the area of a rectangle is known to be more difficult than judging its length.

Packed bubbles, where the area of each circle (bubble) represents the value of the corresponding item. Color shows group membership. (Source: https://bl.ocks.org/mbostock/4063269)

Packed bubble charts use tightly packed circles, with the area of each circle conveying the value of the corresponding item. However, their layout is entirely random.
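Because the value is encoded as area rather than radius, each radius has to grow with the square root of the value. The following is a minimal sketch of that scaling only, not of the packing layout itself; the function name `bubble_radii` and the `max_radius` parameter are made up for illustration, and non-negative values are assumed.

```python
import math


def bubble_radii(values, max_radius=40.0):
    """Return one radius per value so that circle AREA is proportional to value.

    Hypothetical helper: assumes non-negative values with at least one
    positive entry. The largest value gets a radius of max_radius.
    """
    largest = max(values)
    if largest <= 0:
        raise ValueError("need at least one positive value")
    # area ~ r^2, so r must scale with sqrt(value)
    return [max_radius * math.sqrt(v / largest) for v in values]


radii = bubble_radii([100, 25, 0])
```

In this sketch an item with a quarter of the largest value gets half its radius, which is one reason differences between bubbles can look smaller than they are.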

Wrapped bars of 150 items. Positive values are blue, negative red.

The wrapped bars technique was proposed by Stephen Few, and uses bars just like a typical bar chart. However, instead of forcing the user to scroll to see the full list, wrapped bars split the width of the chart into several columns so that all of the bars fit on the screen without the need for scrolling.
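The column splitting described above can be sketched in a few lines. This is an illustrative sketch, not Stephen Few's or Chubuk.js's implementation; the function name `wrap_ranked_list` and the `(label, value)` tuple format are assumptions.

```python
def wrap_ranked_list(items, rows_per_column):
    """Sort (label, value) pairs by value, largest first, and split them
    into consecutive columns of at most rows_per_column bars each.

    Hypothetical helper for illustration only.
    """
    if rows_per_column < 1:
        raise ValueError("rows_per_column must be at least 1")
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    return [
        ranked[start:start + rows_per_column]
        for start in range(0, len(ranked), rows_per_column)
    ]


columns = wrap_ranked_list(
    [("a", 5), ("b", 9), ("c", 1), ("d", 7), ("e", 3)],
    rows_per_column=2,
)
```

With 150 items and 50 rows per column, this yields three columns, each of which would be drawn as its own small bar chart.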

Packed bars that only show the largest (blue) and smallest (red) values in the center around the origin. Additional bars (gray) are packed into the same rows to the left and right of the original two columns.

Packed bars, proposed by Xan Gregg, also use horizontal bars across the width of the chart, but in a more space-efficient way. Instead of using columns, however, packed bars only draw the largest (and smallest if including negative values) bars in the list, and then “pack” the remaining bars into any available space on the same rows as the original bars. For this reason, packed bars are suited for skewed data distributions dominated by a few large items.
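One simple way to approximate this packing is a greedy pass: give the largest values their own rows, then drop each remaining value into whichever row currently has the least packed width. The sketch below makes that assumption and handles positive values only; it is not Xan Gregg's published algorithm, and the name `pack_bars` is hypothetical.

```python
def pack_bars(values, num_rows):
    """Greedy sketch of a packed-bars layout for positive values.

    Returns one list per row. The first entry of each row is its primary
    bar; the rest are secondary bars packed next to it, in placement order.
    """
    if num_rows < 1:
        raise ValueError("num_rows must be at least 1")
    ranked = sorted(values, reverse=True)
    rows = [[v] for v in ranked[:num_rows]]
    packed_width = [0.0] * len(rows)  # width used by secondary bars per row
    for v in ranked[num_rows:]:
        # pick the row with the least secondary width so far (first on ties)
        target = min(range(len(rows)), key=lambda r: packed_width[r])
        rows[target].append(v)
        packed_width[target] += v
    return rows


layout = pack_bars([10, 8, 3, 2, 2, 1], num_rows=2)
```

Because the secondary bars sit side by side rather than on a common baseline, only the primary bars can be compared precisely by length, which fits the note above about skewed distributions.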

Piled bars for 150 values. Red hues are used for negative values and blue for positive. All bars use a common baseline, but are arranged so that larger bars are drawn behind smaller bars.

Piled bars, proposed by Adil Yalcin in earlier work, are a hybrid of wrapped and packed bars in that they reuse the same row for multiple bars. However, instead of using separate columns, they use the same origin for all bars. Larger bars are arranged behind smaller bars to avoid overlap.
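To make the row reuse concrete, the sketch below assigns ranked values to rows in round-robin order and lists each row longest bar first, which is the order in which the bars would be drawn so that shorter bars end up on top. The round-robin assignment is an assumption made for illustration and handles positive values only; `pile_bars` is a hypothetical name, not the Chubuk.js API.

```python
def pile_bars(values, num_rows):
    """Sketch of a piled-bars layout for positive values.

    Each row shares a single baseline. Values are returned per row in draw
    order: longest first (drawn at the back), shortest last (drawn on top).
    """
    if num_rows < 1:
        raise ValueError("num_rows must be at least 1")
    ranked = sorted(values, reverse=True)
    rows = [[] for _ in range(num_rows)]
    for rank, value in enumerate(ranked):
        # rank 0..num_rows-1 fill the rows once, then the next "pile" starts
        rows[rank % num_rows].append(value)
    return rows


piles = pile_bars([9, 1, 7, 3, 5, 2], num_rows=3)
```

Drawing each row in the returned order keeps the end of every bar visible, since a shorter bar never hides the tip of a longer one behind it.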

Zvinca plot for 150 values. Blue dots are positive, red are negative.

Finally, Zvinca plots — presented by Stephen Few from an idea by Daniel Zvinca — are similar to piled bars, but use dots showing each value instead of the length of horizontal bars. This eliminates overlapping bars.

To once and for all determine the best ranked-list visualization, we ran a crowdsourced user study comparing all of these techniques for three tasks: (1) determining the rank of a single item, (2) comparing the difference between two items, and (3) assessing the average value of the entire dataset. Our findings are presented in a paper at the ACM CHI 2019 conference (May 4–8 in Glasgow, UK), and can be summarized as follows:

Simple bar charts with scrolling were the slowest of all techniques, but were also the most accurate. This is likely due to their familiarity.
Treemaps, while less familiar to most, performed surprisingly well. We speculate that this is because the treemap layout helps viewers see the ordering.
With the exception of wrapped bars, advanced ranked-list visualization techniques performed poorly. Again, familiarity may be a concern.
On the whole, wrapped bars have comparable accuracy to simple bar charts, yet are faster to use in most situations.

In summary, we recommend that designers pick simple bar charts if accuracy is more important than speed, consider treemaps, which can be surprisingly effective (especially for large datasets), and use wrapped bars for the best middle ground between accuracy and completion time.

More Information

Implementation: Chubuk.js (also https://packedbars.com/)
Experimental data and platform: Chubuk.exp
Paper: PDF

Publication:

Pranathi Mylavarapu, Adil Yalcin, Xan Gregg, Niklas Elmqvist. Ranked-List Visualization: A Graphical Perception Study. In Proceedings of the ACM Conference on Human Factors in Computing Systems, 2019.
