Six Important Lessons from Storytelling with Data by Nussbaumer Knaflic: Part 1

Mohammad Mohtashim
Published in Analytics Vidhya
10 min read, Jul 7, 2020


In this series of articles, I will share a short summary of the book Storytelling with Data by Nussbaumer Knaflic. All of the content is taken from her book, but I have tried to condense it into the six important lessons the book teaches. I hope these six lessons go a long way in improving your visualisation skills and, with them, your ability to tell better stories from data. Each article in the series summarises one or two lessons; this part covers the first two.

Image-0-Courtesy of Storytelling with Data by Nussbaumer Knaflic

1. Understand the context

Knaflic explains that it is fundamental to understand the context in which you are going to use the data and generate value. Her tip is to ask three important questions to establish that context and then use it to shape your visualisation.

a) Who?

She mentions repeatedly throughout the book that you must understand the audience for your visualisation. This way you can adapt the visualisation to the needs of your audience and avoid generic graphs. Moreover, knowing what the audience needs lets you be specific in how you use your data and hence generate more value.

b) What?

First, you identify the action you want your audience to take by asking: what do you need your audience to know or do? Next, you ask which mechanism you will use to communicate with your audience, so that you pick the most effective medium for reaching them. The illustration below elaborates on this point:

Image 1-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Finally, you ask yourself, ‘what tone do you want your communication to set?’[1]

c) How?

This is the last question you ask, the innermost layer of the series. Here you work out what data you need in order to make your main point, a point already sharpened by asking ‘Who’ and ‘What’.

Just to summarise, Nussbaumer Knaflic has given the following example to reflect the importance of understanding the context by asking the three mentioned questions:

Image 2-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Finally, she talks about the 3-minute story and the Big Idea. According to her, it is imperative that you can condense everything you want to communicate into a 3-minute story, which she frames as, ‘if you had only three minutes to tell your audience what they need to know, what would you say?’[1]. The Big Idea, in turn, is the single main point around which everything revolves when you use the data to generate value for your audience. It is summarised well in the following illustration:

Image 3-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Therefore, the first lesson is to understand the context and only then embark on your journey of using the data in line with that context.

2. Choosing an Effective Visualisation

Next, Knaflic teaches the reader how to choose among different types of graphs given the context. She believes that picking the graph to suit the situation and the data is fundamental to effective communication. Below I restate the main graphs she mentions and the situation she prescribes for each.

1-Simple Text:

To my surprise, Knaflic warns that people often overdo it by going to the trouble of a visualisation when the job is simple enough to be done with plain text. As she puts it, ‘When you have just a number or two to share, a simple text can be a great way to communicate’[1]. To illustrate this, she gives the example of how, in April 2014, Pew Research Center used the following bar plot to compare the percentage of married stay-at-home mothers with a working husband[1]:

Image-4-Courtesy of Storytelling with Data by Nussbaumer Knaflic

However, she shows that the graph is overkill here, as the same message can be delivered more effectively through simple text:

Image-5- Courtesy of Storytelling with Data by Nussbaumer Knaflic

2-Tables:

Tables are great when you want to communicate to a mixed audience whose members will each look for their own particular row of interest. Tables also suit detailed communication, and the data in them can be diverse, with different units of measure. However, she warns that when you use a table, the data must stand out and the design of the table must fade into the background:

Image-6-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Knaflic encourages the use of minimal borders, as this lets the audience ignore extra clutter (more about this in later articles) and focus on the data.

2a) Heatmap

If you want to combine visual cues with a table, you can use colour to convey the range of the data, which turns the table into a heatmap. Usually there is a colour range with an associated key that reflects the magnitude of the numbers and so lets the audience focus only on the important rows. However, she suggests that the colour be used consistently and be neither too dull nor too vibrant. She shows a good example of a heatmap with proper use of the colour range:

Image-7-Courtesy of Storytelling with Data by Nussbaumer Knaflic
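
To make the idea of a colour range concrete, here is a minimal Python sketch. It assumes a simple linear scale, made-up table values (not the ones from the book) and a hypothetical helper named `colour_intensity`; each cell is scaled to a number between 0 and 1 that could then drive the saturation of a single hue in whatever tool you use.

```python
def colour_intensity(table):
    """Scale every cell to the range 0..1 so it can drive the saturation of one hue."""
    cells = [value for row in table for value in row]
    low, high = min(cells), max(cells)
    span = high - low or 1  # avoid dividing by zero when all cells are equal
    return [[(value - low) / span for value in row] for row in table]


# Hypothetical table values, for illustration only.
table = [[15, 22, 42],
         [40, 36, 20],
         [35, 17, 34]]

intensity = colour_intensity(table)
```

Using one hue with varying saturation is one way to keep the colour consistent: the smallest cell gets the palest shade and the largest cell the strongest one.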

3-Graphs:

She then states different types of graphs which can be used to deliver the value proposition of your data:

3a)- Scatterplot:

The scatterplot is mainly used to describe the relationship between two quantitative (continuous) variables. She also mentions how colour can be used in a scatterplot to highlight different aspects of the data. For example, she shows how a scatterplot of miles and cost per mile can highlight the values that are above the average cost per mile:

Image-8-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Notice how simple the scatterplot is, with no added decoration at all, and how colour contrast has been used to highlight the main message of the data.
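
The highlighting step itself is easy to sketch in Python. The data points and colour names below are my own assumptions for illustration, not values from the book; the only idea being shown is that points above the average cost per mile get a strong colour and everything else is pushed into grey.

```python
from statistics import mean

# Hypothetical (miles, cost per mile) pairs, for illustration only.
points = [(1200, 1.10), (1800, 1.45), (2500, 1.80), (3100, 1.25), (3600, 2.10)]

average_cost = mean(cost for _, cost in points)

# Emphasise above-average points with a strong colour; grey out the rest.
colours = ["orange" if cost > average_cost else "grey" for _, cost in points]
```

The resulting list has one colour per point, in the same order as the data, so it could be handed to the colour argument of most plotting tools.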

3b)- Line graph:

The line graph is mostly used when you want to visualise time series data. Moreover, the line graph can be used to visualise either single series or multi-series data:

Image-9-Courtesy of Storytelling with Data by Nussbaumer Knaflic

She emphasises that the time intervals on the x-axis must be consistent, so you should use the same spacing between every pair of points. For example, don’t use 10-year gaps for the years 1970–2000 and then switch to one-year gaps. I know this mistake seems quite obvious, but even I have fallen for it in my data science journey.
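
A quick sanity check for this can be written in a few lines of Python. This is only a sketch: the function name `has_consistent_intervals` and the year lists are my own, and the check assumes the x-axis points are plain numbers such as years.

```python
def has_consistent_intervals(x_values):
    """Return True when consecutive x-axis points are all the same distance apart."""
    gaps = {later - earlier for earlier, later in zip(x_values, x_values[1:])}
    return len(gaps) <= 1


# 10-year gaps up to 2000, then one-year gaps: the mistake described above.
uneven_years = [1970, 1980, 1990, 2000, 2001, 2002]
even_years = [1970, 1980, 1990, 2000, 2010, 2020]

uneven_ok = has_consistent_intervals(uneven_years)
even_ok = has_consistent_intervals(even_years)
```

A check like this matters most when the years are plotted as evenly spaced labels rather than on a true numeric axis, because that is when uneven gaps silently distort the slope of the line.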

3c)- Slopegraph:

According to the author, a slopegraph is useful ‘when you have two time periods or points of comparison and want to quickly show relative increases and decreases or differences across various categories between the two data points’[1]. The use case is self-explanatory; just remember that a slopegraph should be used when an at-a-glance comparison between categories matters more than the detailed pattern of each category over the years. The author shows the following example of a slopegraph:

Image-10-Courtesy of Storytelling with Data by Nussbaumer Knaflic

However, a slopegraph can become confusing if it has too many lines; in that case, it is better to use colour to highlight a particular aspect of the underlying data.

Image-11-Courtesy of Storytelling with Data by Nussbaumer Knaflic

4- Bars

Bars should be used when you want to plot categorical data across different groups. The author notes that many people underrate bar charts, dismissing them as too simple. This should be avoided, because bars are a powerful way to represent relationships among categorical (qualitative) variables. A further advantage of bars is that they are very easy to understand.

One of the major tips when constructing bars is to use a zero baseline, which makes the comparison between bars easier and unbiased. As the example below shows, a non-zero baseline can distort the relative comparison of two bars:

Image-12-Courtesy of Storytelling with Data by Nussbaumer Knaflic
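
The distortion can also be put in numbers. The Python sketch below uses two made-up bar values (not the figures from the book) and a hypothetical helper named `apparent_ratio`; it compares the ratio of the drawn bar heights when the axis starts at zero with the ratio when the axis is truncated.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of two bars when the axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Hypothetical values chosen only for illustration.
value_a, value_b = 39.0, 35.0

true_ratio = apparent_ratio(value_a, value_b)             # axis starts at 0
truncated_ratio = apparent_ratio(value_a, value_b, 34.0)  # axis starts at 34
```

With these numbers the first bar is only about 11% taller on a zero baseline, but it is drawn five times as tall once the axis starts at 34.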

Further, the author recommends that the bars be wider than the white space between them.

Image-13-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Now, I will briefly cover some variations of bars before ending this article:

4a)-Stacked vertical bar chart:

Image-14-Courtesy of Storytelling with Data by Nussbaumer Knaflic

The author warns that stacked vertical bar charts can become overwhelming, because comparing subcomponents across bars is difficult. As the picture above shows, only the blue region, which sits on the baseline, is easy to compare. Therefore, be extra cautious when using a stacked vertical bar chart.
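
One way to see why only the bottom segment is easy to compare is to compute where each segment starts. The Python sketch below uses made-up numbers and a hypothetical helper named `segment_baselines`; it is an illustration of the arithmetic behind stacking, not code from the book.

```python
def segment_baselines(stacks):
    """For each stacked bar, return the height at which every segment starts."""
    result = []
    for segments in stacks:
        starts, running = [], 0
        for height in segments:
            starts.append(running)  # this segment begins where the previous one ended
            running += height
        result.append(starts)
    return result


# Hypothetical data: three bars, each made of three stacked segments.
stacks = [[10, 20, 30], [15, 10, 25], [5, 30, 20]]
baselines = segment_baselines(stacks)
```

The first segment of every bar starts at 0, so its heights can be compared directly. The second and third segments start at a different height in each bar, which is what makes them hard to compare by eye.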

4b)-Waterfall chart:

This book was the first place I came across this variation of the bar chart.

Image-15-Courtesy of Storytelling with Data by Nussbaumer Knaflic

The example the author gives shows how employee headcount changed over one year (the x-axis shows the reasons for the respective additions and deductions). Even though a waterfall chart might seem complex at first glance, it is useful when you want to explain how a count changed over a specific period and what the reasons for the change were.
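
The logic behind a waterfall chart is a running total, and it can be sketched in a few lines of Python. The headcount figures, the labels and the helper name `waterfall_bars` below are my own assumptions for illustration. Each change becomes a floating bar that spans the running total before and after that change.

```python
def waterfall_bars(start, changes):
    """Return (label, bottom, height) for each bar of a waterfall chart.

    `changes` is a list of (label, delta) pairs; positive deltas are
    additions and negative deltas are deductions.
    """
    bars = [("Start", 0, start)]
    running = start
    for label, delta in changes:
        new_total = running + delta
        # A floating bar spans the old and the new running total.
        bars.append((label, min(running, new_total), abs(delta)))
        running = new_total
    bars.append(("End", 0, running))
    return bars


# Hypothetical headcount figures, not taken from the book.
bars = waterfall_bars(100, [("Hires", 30), ("Transfers in", 8),
                            ("Transfers out", -12), ("Exits", -10)])
```

Each tuple gives the label, the height at which the bar starts and the bar's length, which is the information a bar-drawing routine with a "bottom" or offset option would need.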

4c)-Stacked horizontal bar chart:

The advantage of the horizontal bar chart is that it makes category names easier to read, especially if they are long. Further, the author notes that most reading is done from left to right, so a horizontal bar chart lets the reader process the categories first and then look at the data, which makes the data easier to take in.

Image-16-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Moreover, the author believes that stacked horizontal bar graphs are easier to process, as comparing sub-components across bars is easier when you get a consistent baseline at both the far left and the far right:

Image-17-Courtesy of Storytelling with Data by Nussbaumer Knaflic

5-Area:

Another less common graph she mentions is the area graph. Her general advice is to avoid such graphs because “Humans’ eyes don’t do a great job of attributing quantitative value to two‐dimensional space, which can render area graphs harder to read than some of the other types of visual displays we’ve discussed”[1]. However, she does suggest that when you need to ‘compare numbers of vastly different magnitudes’[1], the second dimension of an area graph makes the comparison more compact than one-dimensional bars would.

Image-18-Courtesy of Storytelling with Data by Nussbaumer Knaflic

WHICH GRAPHS TO AVOID?

1-Pie Charts:

A pie chart makes it very difficult to compare proportions, especially if two proportions are close to each other. The difficulty is that the reader has to compare angles and areas, which the eye judges poorly. As an example, she gives the following pie chart and asks the reader to answer ‘Which proportion is the largest’:

Image-19(Can you tell me which is the larger pie?)-Courtesy of Storytelling with Data by Nussbaumer Knaflic
Image-20(No, Supplier B is not the largest)-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Labels can help, but the pie chart still carries a lot of clutter (more on this later) and is therefore not a good way of telling your data story. A better alternative to the pie chart is the horizontal bar graph:

Image-21(Same Information on Bar Chart, Isn’t comparison easier now?)-Courtesy of Storytelling with Data by Nussbaumer Knaflic
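
As a rough illustration of why the bar version is easier to read, here is a small Python sketch that prints a text-only horizontal bar chart. The supplier shares are made up (they are not the figures from the book), and sorting from largest to smallest is my own choice for the example.

```python
# Hypothetical market shares in percent; not the figures from the book.
shares = {"Supplier A": 34, "Supplier B": 31, "Supplier C": 9, "Supplier D": 26}

# Sort from largest to smallest so the ranking reads top to bottom.
ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)

for name, share in ranked:
    # One '#' per percentage point, every bar drawn from the same baseline.
    print(f"{name:<12}{'#' * share} {share}%")
```

Because every bar starts from the same left edge, a three-point difference such as 34% versus 31% shows up as a visible difference in length, which is much harder to spot as a difference between two pie slices.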

Moreover, for the same reasons, she also advises against another visual treat, the donut chart:

Image-22-Courtesy of Storytelling with Data by Nussbaumer Knaflic

2- Never use 3D as an effect:

Image-23-Courtesy of Storytelling with Data by Nussbaumer Knaflic

The 3D effect adds extra clutter and unnecessary elements such as side panels, floor panels and shadows. Moreover, reading values off the axis with the 3D effect is an extra hassle (just see the bar chart above). Therefore, the author advises against the 3D effect. The one legitimate use of 3D is when you are actually plotting a third dimension.

3-Secondary y‐axis: generally not a good idea:

Image-24-Courtesy of Storytelling with Data by Nussbaumer Knaflic

A secondary y-axis again makes the data more time-consuming to read and harder to understand. The author proposes two alternatives:

Image-25-Courtesy of Storytelling with Data by Nussbaumer Knaflic

Conclusion:

In conclusion, this article summarises the first two key lessons of the book Storytelling with Data by Nussbaumer Knaflic. The first is to appreciate the importance of context and understand it fully before embarking on your visualisation. The second is that, given a wide choice of graphs, you must choose the one that best suits your context, and then refine it so that the communication becomes clearer and easier to follow.

In part two of this series of articles, I will discuss the third key lesson of the book.

References:

[1] Knaflic, Cole Nussbaumer. Storytelling with Data: A Data Visualization Guide for Business Professionals. Wiley, 2015.
