Visualising Our World Today: Storytelling With Data
The ‘Why’s and ‘How’s of Creating a Visual Narrative
What does it mean to visualise our world? There is a constant abundance of information available to us, and sometimes it is hard to comprehend the deeper story without the use of graphical or visual aids. With the accumulation of mass amounts of data comes the desire for us to make sense of it all.
According to Cole Knaflic, author of the book Storytelling with Data, an effective data visualisation can be the difference between success and failure in any situation: presenting findings to board members, getting key points across to your team, or even fundraising for charities. The tool does not matter as much as being able to convey a concise, focused data story.
The ability to create visual stories from massive amounts of data is central to turning information into decision-making. In fact, it is deemed so important that Forbes reported it as being one of the most essential skills for data scientists. But this is not a skill we naturally possess as humans. Tom Davenport, an independent adviser for Deloitte Analytics, believes that most quantitative analysts are not very good at creating or telling stories with data, meaning analytical initiatives will not have the same impact on decision-making processes. Data storytelling is important because:
- Stories are how we make sense of our complex world and share human experiences
- The goal of analytics is to change how someone makes decisions, and stories are ways to compel change
- Stories that use data and analytics are more convincing than stories based on anecdotes or personal experience
- Stories are a way to present the results of time-consuming, complicated data analytics in a brief, engaging manner
- There are only a few main types of data story; if an organisation is clear about which stories can be told with data, analysts are more likely to explore and use a variety of them in the long term
THE GOOD
What are some examples of good data storytelling? Below are two examples from two different newspapers.
The Financial Times has created a detailed, interactive visualisation to live-track COVID-19 cases and deaths around the world. Using tools like JavaScript (you can watch a video about the graph here), the data can be plotted in real time, with the added ability to choose specific areas of interest. A common problem with COVID graphics is that they do not account for population size. What the FT team have done is allow you to change the measures to your liking: you can observe the raw numbers, or cases/deaths per million; you can look at counts for new cases/deaths, or the cumulative number; or even adjust the scale to be linear or logarithmic. These options help create a clearer, more accurate narrative, because they make it easier to compare nations than raw numbers alone.
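To see why the choice of measure matters, here is a minimal Python sketch of the per-million and new-versus-cumulative options described above. The countries, populations and case counts are invented purely for illustration, and the function names are our own; this is not how the FT builds its chart.

```python
from typing import List


def per_million(count: int, population: int) -> float:
    """Scale a raw count to a rate per one million people."""
    return count * 1_000_000 / population


def daily_new(cumulative: List[int]) -> List[int]:
    """Convert a cumulative series into day-by-day new counts."""
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


# Hypothetical figures, invented for this example only.
countries = {
    "Country A": {"population": 50_000_000, "cumulative_cases": [100, 250, 600, 1_000]},
    "Country B": {"population": 5_000_000, "cumulative_cases": [40, 90, 200, 400]},
}

for name, info in countries.items():
    total = info["cumulative_cases"][-1]
    rate = per_million(total, info["population"])
    new_cases = daily_new(info["cumulative_cases"])
    print(f"{name}: total={total}, per million={rate:.1f}, new per day={new_cases}")
```

In this made-up data, Country A has more cases in raw numbers, but Country B has four times as many cases per million people, so the two measures tell different stories.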
Similarly, their map visualisation allows users to easily see daily changes to national stringency measures (how strict each government's lockdown and reopening policies are) during the pandemic. With these graphs, you can see the severity of the pandemic across different nations, and see how national-level easing of lockdowns can change daily. There are two options: you can see the global map change day by day for more global comparisons, or look at their heatmap below to get an idea of how a single nation is responding over the same time span. (It is worth noting that this Stringency Index was created by Oxford University in order to make comparisons across countries more viable, and it is based only on publicly available data about each nation's government response.) Though there may be issues with the measure itself, the visualisations make it easier to understand the global response to the pandemic, and actually see how certain nations' responses changed over time.
A very different example is from the New York Times, which published an article last winter about the use of geolocation data to track people's movement over a period of months. These geographic data points, called "pings", are obtained from mobile phones, and are collected, stored, and used by many private corporations in unregulated and unethical ways (for more on the ethics of this, please see last issue's article about data ethics). The visuals created for this article portray the vastness of data collected across major US cities, even in the vicinity of secure buildings like the Pentagon and the White House. When weeks' or months' worth of pings from a single device are compiled into a map, you can more easily comprehend how this data can be used to recreate a geographical history of a single person, and how this could have major implications for violating our rights to privacy.
Although these are considered good examples of data storytelling, do you think anything else could have been done to make them clearer? What would you have changed?
THE BAD
Telling data stories requires time, effort, and creative thinking. When data storytelling goes wrong, it can have consequences (intended or unintended) as to how we perceive the information we are given, and can impact our ability to make accurate, data-driven decisions. What does it look like to tell a “bad” story?
The image below depicts the number of COVID-19 cases in the state of Georgia, in the southeast United States. These maps were taken from the interactive maps created by the Georgia Department of Public Health (which have since been removed for updates). Once people looked into these visualisations, they garnered much criticism from academics and news outlets alike. At first glance, the maps do not look too different, each reporting cases per 100,000 people. But if you look closely at how the maps define the ranges for the number of cases, there is one glaring difference: the ranges behind the heatmap colours are different for any two days. This has since been reported as being intentional, with the claim that it was for the benefit of all people living in Georgia and that the maps were not intended to be used to track changes over time. Moreover, this is not the first time the Georgia Department of Public Health has had major issues with reporting COVID-19 cases.
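The effect of shifting ranges can be sketched in a few lines of Python. The county names and rates below are invented, and recalculating the ranges from each day's data with simple quantile-style cut points is only an assumption made for illustration; it is not a reconstruction of how the Georgia maps were actually built.

```python
from bisect import bisect_right
from typing import Dict, List


def quantile_edges(rates: Dict[str, float], n_bins: int = 4) -> List[float]:
    """Derive bin edges from one day's data (simple quantile-style cut points)."""
    ordered = sorted(rates.values())
    return [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]


def classify(rates: Dict[str, float], edges: List[float]) -> Dict[str, int]:
    """Assign each county a colour class (0 = lightest) using the given edges."""
    return {county: bisect_right(edges, rate) for county, rate in rates.items()}


# Hypothetical cases per 100,000 people for eight counties on two days.
day1 = {"A": 10, "B": 20, "C": 30, "D": 40, "E": 50, "F": 60, "G": 70, "H": 80}
day2 = {"A": 20, "B": 40, "C": 60, "D": 80, "E": 90, "F": 120, "G": 140, "H": 160}

day1_edges = quantile_edges(day1)
day2_edges = quantile_edges(day2)

baseline = classify(day1, day1_edges)  # day 1 drawn with its own ranges
per_day = classify(day2, day2_edges)   # day 2 drawn with recalculated ranges
fixed = classify(day2, day1_edges)     # day 2 drawn with day 1's ranges

print("Day 1 classes:            ", baseline)
print("Day 2, recalculated edges:", per_day)
print("Day 2, fixed edges:       ", fixed)
```

With recalculated ranges, every county keeps the same colour class on both days, even though every rate has risen. Holding the ranges fixed is what makes the increase visible.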
Visualisations can be misleading for a number of reasons, including difficulty in their interpretation. The big picture is that the average person is not a data analyst, nor do they understand the nuanced ways in which visualisations can or cannot be used. With respect to Georgia's government, many argue it is just one more way in which the state is trying to misrepresent the extent of the pandemic in order to resume normal life. The Atlanta Journal-Constitution (the only major newspaper in the Atlanta metro area) has actually used the Department of Public Health data to create its own dashboard to more accurately reflect the true spread of the virus.
This is a prime example of how you can use data to mislead an audience, and create an inaccurate narrative. Even though they are presenting true values, the visual cues used can trick our brains into processing this information differently than how it is presented. Most people will spend a few seconds at most looking at a graph, so visual cues and accuracy are extremely important. By intentionally changing characteristics of the graph counter to how our brains perceive things at first glance, a visual can be used to mislead and create a new, false narrative.
THE WHY
These examples, while different, all have one thing in common: they tell a story. These narratives might be accurate or inaccurate, but they all convey a message nonetheless. This is the power of storytelling with data: the ability to reduce complicated information into a simple narrative, or what Knaflic calls the "so what" statement. If you are able to condense all your findings into a single sentence, you can focus your design and visualisation to convey this statement.
After seeing all of these examples, one question might come to mind: why are we so bad at data storytelling? There are a variety of reasons, but they can be condensed into two main factors: many analysts do not have the creative design skills or ambition to create impactful, meaningful, or clear visuals; and quantitatively minded people believe simple graphs devalue their technical capabilities and undermine the quality of their analyses. However, as we mentioned at the beginning of this article, the inability to clearly and concisely use data to make a statement can have negative impacts on how businesses utilise this information.
THE TOOLS
In many of the examples shown, storytelling with data requires tools that are a bit more advanced than what you may have been introduced to during the apprenticeship. By now, you should be (or will soon be) familiar with the basics of data visualisation, and be able to apply these skills throughout the course, and more broadly in the workplace. Yet there might be instances where you are not able to convey the story you want to tell with your standard toolset. What else is available to help enhance your storytelling abilities without requiring overly advanced programming or digital design skills? As it turns out, there are many open-source, web-based alternatives where you can incorporate these very same ideas and aesthetics into your own visuals, without needing to learn a multitude of programs or languages.
Though visualisations are only part of data storytelling (quality data, engaging with your audience, and establishing a narrative are all just as important), below are some additional tools that can help expand your skill set.
Open-source tools like D3.js set the standard for impressive, interactive visuals, but their usage is limited by the need to know how to code in JavaScript. In recent years, a variety of open-source tools like Observable have made it easier to access the power of D3 visuals with only basic knowledge of JavaScript and how programming languages work. You can upload your own data, and change the code bit by bit until it looks the way you would like it to appear. If the thought of editing existing code (even to a small extent) is still daunting, another open-source tool built on top of D3.js is RAWGraphs, an easy-to-use online app. Similar to Observable, you can upload your own data, and choose from a variety of graphics to display and export for later use. RAWGraphs also has the option to download the package from GitHub and code your own custom visualisations. When uploading data, remember to respect your organisation's data sharing policies.
Another tool, called Palladio, was developed by the Humanities + Design lab at Stanford University, and makes it easy to upload and investigate geographic data. As it does not require an account to use, it does not store the user’s data, making it a secure option for quick geographic visualisations. It is useful for showing relationships between different variables as a network, or creating different types of maps. While you cannot export the final product, it is a great tool for gaining insights into any historical/geographic data.
An additional resource you might have overlooked is Google's Data Studio. If you have a Google account, you can make clean, fully customisable dashboards and visualisations to present analytical overviews; you can even embed text data from Google spreadsheets. Moreover, the dashboards you create will automatically update whenever the source data is updated. Data Studio's reports are also trackable via Google Analytics, and integrate with other Google products like Google Sheets and Google Cloud Storage.
You may have heard of this last platform before (or even dabbled with it on occasion) during the apprenticeship. Plotly Chart Studio is the online platform for Plotly, a JavaScript library that can also be used from languages like Python and R. With the online platform, you can build interactive visualisations quickly and easily; simply upload your data, and choose the type of graph you want to make. You can even make dashboards that can be downloaded, embedded into an existing webpage, and shared with various privacy settings. It allows for some customisation and has a range of tutorials for assistance, but keep in mind that the amount of data you can upload on the web platform is limited.
In summary, there has been an increase both in the tools available for effective visualisation and in examples of how to tell stories well (and not so well) using data. How will the examples shared in this article change the way you visualise data and tell your own stories?