Taking Word Clouds Apart

Alternative Designs for Word Clouds and Some Research-Based Guidelines


What do you think of word clouds? Good? … Bad? … Don’t care? Would you use them in your projects if needed, or would you seek alternative designs?

Word clouds are a go-to visual representation when the goal is to show the frequency or relevance of a group of words that summarize some text. The image below shows a typical design. Common variations include tilting the words, using different layout algorithms, and coloring the words to make the word cloud more “pleasing” or to convey information more accurately.

Traditional style used in word clouds.

When you look at the landscape of word cloud designs and projects, however, two things stand out. First, there is not much description of possible alternatives to the standard design solution. How else could you visualize words with associated frequencies? Second, there is little empirical evidence to guide designers in choosing among the possible designs.

In our paper [1], Taking Word Clouds Apart: An Empirical Investigation of the Design Space for Keyword Summaries, just presented at IEEE VIS’17, we set out to take the first steps toward improving this situation. Practitioners will likely be most interested in two particular contributions: (a) a systematic description of the possible design choices and (b) an empirical study of the alternatives proposed in our framework.

The Design Space of Keyword Summaries

The first problem we set out to solve was defining a visual design space that covers the relevant design features. The two main features we chose are:

  • Layout: The strategy used to position the words;
  • Value Encoding: The visual channel used to encode magnitude.

The design space includes three main layouts:

  • Spatial: Words are arranged without any particular alignment;
  • Column: Words are arranged aligned in multiple columns;
  • Row: Words are arranged aligned in multiple rows.

The design space also includes five value encoding strategies:

  • Font Size: Values encoded using font size.
  • Color Intensity: Values encoded using color intensity.
  • Bar Length: Values encoded using the length of an associated bar.
  • Circle Size: Values encoded using the area of an associated circle.
  • No Encoding: Values not encoded (see below for why this was important).

Combining these two parameters, we can build a total of 15 different designs:

The 15 different designs one can build combining the 3 layouts and 5 value encoding strategies.
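To make the combinatorics concrete, here is a minimal TypeScript sketch of the design space. It is purely illustrative (none of the names or scale choices below come from our study code): it enumerates the 15 layout/encoding combinations and shows one plausible way to map a normalized value onto each visual channel.

    // Illustrative sketch only: the layout and encoding names mirror the design
    // space described above, but the scale functions below are hypothetical
    // choices, not the ones used in the study.
    type Layout = "spatial" | "column" | "row";
    type Encoding = "fontSize" | "colorIntensity" | "barLength" | "circleSize" | "none";

    const layouts: Layout[] = ["spatial", "column", "row"];
    const encodings: Encoding[] = ["fontSize", "colorIntensity", "barLength", "circleSize", "none"];

    // Enumerate the full 3 x 5 design space.
    const designs = layouts.flatMap(layout =>
      encodings.map(encoding => ({ layout, encoding }))
    );
    console.log(designs.length); // 15

    // One plausible mapping from a value normalized to [0, 1] onto each channel.
    function encodeValue(encoding: Encoding, t: number): string {
      switch (encoding) {
        case "fontSize":
          return `font-size: ${(12 + t * 24).toFixed(1)}px`;           // 12 to 36 px
        case "colorIntensity":
          return `opacity: ${(0.3 + t * 0.7).toFixed(2)}`;             // light to dark
        case "barLength":
          return `bar-length: ${(t * 100).toFixed(0)}px`;              // length proportional to value
        case "circleSize":
          return `circle-radius: ${(Math.sqrt(t) * 20).toFixed(1)}px`; // area proportional to value
        case "none":
          return "";                                                   // plain list, no value encoded
      }
    }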
Which one works best?

Before answering this question, we have to answer another question:

“… best for what?”

Tasks

We decided to test two main sets of tasks to evaluate the 15 designs:

  • Low-level tasks, involving finding individual values or words;
  • High-level tasks, requiring summarization across many words.

More precisely, we used these four tasks:

  1. Compare Values: Compare magnitudes of two selected words;
  2. Search for Words: Look for a specific word contained in the cloud;
  3. Identify Topics: Choose which topic, from a list, a word cloud corresponds to;
  4. Build Topics: Describe which topics are included in a word cloud.

Experiments

In the following, I summarize only the most interesting findings. You can find many more details in the paper.

1. Aligned bars are best for reading values accurately. When we asked our participants to compare values associated with the words in the keyword summary, we found that bar and circle lead to more accurate estimates (in line with many previous studies on graphical perception). Conversely, when quantities are mapped to font properties, people do not derive accurate estimates from them.

The amount of error we found in different designs when participants are asked to compare the magnitudes associated with two words. Bar and circle score better than font size and color intensity.

2. Font size and color intensity work best for word search. When we asked our participants to search for specific words, the results flipped: font size and color intensity made it faster to provide an answer.

The amount of time it took participants to find specific words with different designs. Font size and color intensity score better than bar and circle.

3. Font properties provide an advantage only when the target is large. It’s important to realize that the advantage of using font properties in search vanishes as soon as the target word is no longer associated with a large or medium magnitude. This is what you can see in the chart below. Differences in time vanish as the target gets smaller.

Time it takes to find a word when it is associated with different magnitudes. The advantage of font properties vanishes as the target gets smaller.

4. All these advantages do not seem to carry over to complex tasks. In the study we tried to see whether the differences captured in the value estimation and search tasks would be reflected in more complex, real-world tasks such as identifying which topic a word cloud is about. Our results, however, are largely inconclusive. We could not find any major effect of visual encoding or layout on these complex tasks.

5. Simple lists seem to work surprisingly well! A surprising result is that, when we asked people to solve complex topic extraction and identification problems, simple lists with no magnitude encoded in the representation performed quite well. They certainly did not perform worse.

The image below shows some of the results. The black item labeled “control” is the simple list, and it led to slightly better coverage of topics when participants were asked to identify the topics in the word cloud.

In turn, this raises the important question of whether it makes sense to map values to words as additional marks at all. Of course, this is highly dependent on the application, but it seems useful to know that simple lists work so well for making sense of topics; they free up opportunities for many other designs. So, in short, lists may not be fancy, but they do their job pretty well, or at a minimum, no worse.

Coverage and accuracy of topics detected when asked to identify topics in the word cloud. Performance is similar across all conditions. Simple lists (control) work pretty well.

Guidelines for Practitioners

What should you do then? These are some guidelines from our side:

  1. Experiment with different designs. People have not systematically experimented with alternative representations for keyword summaries. Don’t assume word clouds are the only way to go! Try many alternatives and see what you like.
  2. Give column bars a chance. Column-aligned bars may not be particularly catchy, but they seem to do a decent job across the spectrum of tasks we tried. And they have the advantage of being easy to sort and intuitive to read. It’s OK to arrange the bars in multiple columns to obtain a more even aspect ratio (see the sketch after this list). Give them a try and see how you like them.
  3. Give other layouts a chance. You can still use font size or color intensity, but experiment with different layouts! The column layout seems to perform well regardless of the strategy used to encode the values.
  4. Simple lists may be OK. They may not be fashionable, or even suitable for your specific purposes, but our experiments suggest you will do just fine with simple lists.
  5. Think sorting. If you are working on an interactive project or application, think about providing sorting capabilities. Being able to sort the words alphabetically or by value may be of great help to answer different questions in a data analysis environment.
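To illustrate guidelines 2 and 5, here is a small TypeScript sketch of a column-aligned bar layout with sorting. It is a hedged example rather than the code we used: the Keyword interface, the columnBarLayout function, and the pixel numbers are all made up for illustration.

    // Illustrative sketch: laying out a keyword summary as column-aligned bars.
    // All names and numbers are hypothetical, not taken from the paper's code.
    interface Keyword { word: string; value: number; }

    function columnBarLayout(keywords: Keyword[], numColumns: number) {
      // Sort by value (guideline 5); sorting alphabetically works the same way.
      const sorted = [...keywords].sort((a, b) => b.value - a.value);
      const rowsPerColumn = Math.ceil(sorted.length / numColumns);
      const maxValue = Math.max(...sorted.map(k => k.value));

      return sorted.map((k, i) => ({
        word: k.word,
        column: Math.floor(i / rowsPerColumn),   // which column the word lands in
        row: i % rowsPerColumn,                  // position within that column
        barLength: (k.value / maxValue) * 100,   // bar length in px, proportional to value
      }));
    }

    // Splitting eight keywords over two columns keeps the aspect ratio more even
    // than a single tall list, while the bars stay easy to compare and to sort.
    const layout = columnBarLayout(
      [
        { word: "visualization", value: 42 }, { word: "cloud", value: 35 },
        { word: "design", value: 28 },        { word: "layout", value: 21 },
        { word: "encoding", value: 17 },      { word: "task", value: 12 },
        { word: "study", value: 9 },          { word: "topic", value: 5 },
      ],
      2
    );
    console.log(layout);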

Contact Us!

We are looking for people who need help designing word clouds, who are developing systems that use them, or who have developed such systems in the past. We want to hear from you. Your experience with past projects can teach us how to make our research more useful, so that we can assist practitioners with new projects.

Credits

The bulk of this work has been done by my student Cristian Felix, who did an amazing job with absolutely everything. I just acted as a supervisor for the work. Steven Franconeri, wearing his vision scientist hat, made sure we did not do anything tragically wrong. Hopefully we did not! Steven and Cristian also proofread this article and provided me with some really good advice to improve its readability.

[1] Taking Word Clouds Apart: An Empirical Investigation of the Design Space for Keyword Summaries. Cristian Felix, Steven Franconeri, Enrico Bertini. IEEE Transactions on Visualization and Computer Graphics (Proc. of InfoVis), 2017.
