How to Implement Good Data Visualizations

Elizabeth Widjaja
Published in tiket.com
8 min read · Feb 24, 2023

As data scientists, we should be familiar with data visualization. But do we actually know how to build a good visualization, or how to tell a story with one? Data visualization is fundamental because it lets us portray data in ways that allow us to see it in a new light: to visually observe patterns, trends, exceptions, and the possible stories that sit behind its raw state. As John W. Tukey put it:

“The greatest value of a picture is when it forces us to notice what we never expected to see” — John W Tukey

Here is a simple example of data transformed into a visualization:

We take our raw data and transform it into appealing visual graphs.

When we look at the raw data, what can we see in these four sets? Are there any patterns or trends that jump out?

The purpose of data visualization is to discover patterns and trends within our data.

After transforming the data into a visual graphical display, we can immediately see the prominent patterns created by the relationships between the X and Y values across the four sets of data:

  • the general tendency about a trend line in X1, Y1
  • the curvature pattern of X2, Y2
  • the strong linear pattern with a single outlier in X3, Y3
  • the similarly strong linear pattern with an outlier for X4, Y4

As we can see from the example above, data visualization is about a discovery process, enabling the reader to move from just looking at data to actually seeing it. Rather than just describing a dataset based on a selection of some of its key statistical properties, we need to also employ visualization techniques in order to avoid forming false conclusions.
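The four X/Y sets described above match the well-known Anscombe's quartet. Assuming that is the dataset behind the original figure, the following sketch shows why a few summary statistics can hide very different shapes. The helper names (`pearson`, `summarize`, `plot_quartet`) are illustrative, and the plotting function assumes matplotlib is installed.

```python
from statistics import mean

# Anscombe's quartet (Anscombe, 1973). Assumed here to be the four
# X/Y sets discussed above; sets 1-3 share the same X values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "X1, Y1": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                      7.24, 4.26, 10.84, 4.82, 5.68]),
    "X2, Y2": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                      6.13, 3.10, 9.13, 7.26, 4.74]),
    "X3, Y3": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                      6.08, 5.39, 8.15, 6.42, 5.73]),
    "X4, Y4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
               [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
                5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def summarize(quartet=QUARTET):
    """Return mean of X, mean of Y, and correlation for each set."""
    return {
        name: {"mean_x": mean(xs), "mean_y": mean(ys), "r": pearson(xs, ys)}
        for name, (xs, ys) in quartet.items()
    }


def plot_quartet(quartet=QUARTET):
    """Draw one scatter plot per set on a shared 2x2 grid."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(8, 6))
    for ax, (name, (xs, ys)) in zip(axes.flat, quartet.items()):
        ax.scatter(xs, ys)
        ax.set_title(name)
    fig.tight_layout()
    plt.show()


if __name__ == "__main__":
    for name, stats in summarize().items():
        print(f"{name}: mean_x={stats['mean_x']:.2f} "
              f"mean_y={stats['mean_y']:.2f} r={stats['r']:.3f}")
```

The printed statistics come out nearly identical for all four sets, while calling `plot_quartet()` reveals the trend, the curve, and the outliers that the numbers hide.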

To implement a good data visualization, we first have to keep our audience in mind. Our task as the designer is to put ourselves in the shoes of the reader: to imagine, anticipate, and determine what the reader is seeking from our message. The important point is to ensure that our message is conveyed in the most effective and efficient form, one that will serve the requirements of the receiver. We need to design (or “encode”) our message in a way that actively exploits how the receiver will most effectively interpret it through their visual perception capabilities.

Data Visualization Methods

To visualize our data, we first need to assess the data we have so that we can determine the right chart to use. Choosing the right chart for the data is an essential step, as it helps us communicate our insights and message effectively through our visualization. The key steps are as follows:

  • identify the type of data we have (e.g. numerical or categorical) to narrow down the choice of graphs to those that suit the variable types present in our data
  • think about the purpose of what we want to present through our chart and make sure that the chart can convey our message to the audience
  • focus on the number of variables in our data and how many of them we want to display in our chart (e.g. to show the relationship between two or more variables)
  • take our target audience and their level of technical knowledge into consideration to determine the chart best suited to them
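The first and third steps above can be sketched as a small lookup. This is only an illustration under simple assumptions: the function names `classify_column` and `suggest_chart` are hypothetical, and the chart suggestions are common defaults rather than rules from the source. Purpose and audience still require human judgment.

```python
def classify_column(values):
    """Label a column 'numerical' if every value is an int or float,
    otherwise 'categorical'. Booleans are treated as categorical."""
    def is_number(v):
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    return "numerical" if all(is_number(v) for v in values) else "categorical"


def suggest_chart(columns):
    """Suggest a default chart from the variable types of the given columns.

    `columns` maps a column name to its list of values. The mapping below
    is an illustrative assumption, not an exhaustive guide.
    """
    kinds = tuple(sorted(classify_column(v) for v in columns.values()))
    defaults = {
        ("numerical",): "histogram",
        ("categorical",): "bar chart of counts",
        ("categorical", "numerical"): "bar chart",
        ("numerical", "numerical"): "scatter plot",
        ("categorical", "categorical"): "heatmap of counts",
    }
    return defaults.get(
        kinds, "no simple default; consider the purpose and audience"
    )


if __name__ == "__main__":
    # Made-up placeholder values, used only to exercise the function.
    sample = {"region": ["North", "South", "East"], "orders": [12, 7, 9]}
    print(suggest_chart(sample))
```

A lookup like this only narrows the options; the unusual charts listed in the next section usually come into play once the default choice fails to serve the message.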

Types of Charts

We will now look at a few chart types that are unique and less common. I took these examples from Andy Kirk’s book Data Visualization: A Successful Design Process.

Categorical values

  • Slopegraph:

Data variables: 1 categorical + 2 quantitative

Visual variables: Position, connection, color-hue

  • Radial chart:

Data variables: Multiple categorical + 1 categorical ordinal

Visual variables: Position, texture, color-hue, color-saturation/lightness

  • Glyph chart:

Data variables: Multiple categorical + multiple quantitative

Visual variables: Position, size, color-hue, shape

Hierarchies & Part-to-whole relationship

  • Bubble hierarchy:

Data variables: Multiple categorical + 1 quantitative-ratio

Visual variables: Position, area, color-hue

  • Tree hierarchy:

Data variables: 2 categorical + 1 quantitative-ratio

Visual variables: Position, angle/area, color-hue

  • Horizon chart:

Data variables: 1 categorical + 2 quantitative-ratio + 1 quantitative-interval

Visual variables: Height, slope, area, color-hue, color-saturation/lightness

  • Barcode chart:

Data variables: 3 categorical + 1 quantitative-interval

Visual variables: Position, symbol, color-hue

Plotting connections & relationships

  • Parallel sets:

Data variables: Multiple categorical + multiple quantitative-ratio

Visual variables: Position, width, link, color-hue

  • Radial network:

Data variables: Multiple categorical + 2 quantitative-ratio

Visual variables: Position, connection, width, symbol, size, color-lightness, color-hue

  • Network diagram:

Data variables: Multiple categorical-nominal + 1 quantitative-ratio

Visual variables: Position, area, color-hue, connection

Mapping geo-spatial data

  • Choropleth map:

Data variables: 2 quantitative-ratio + 2 quantitative-interval

Visual variables: Position, color saturation/lightness

  • Bubble plot map:

Data variables: Multiple categorical + 1 quantitative-ratio

Visual variables: Position, area, color-hue

  • Isarithmic map:

Data variables: Multiple categorical + multiple quantitative

Visual variables: Position, color-hue, color-saturation, color-darkness

  • Dorling cartogram:

Data variables: 2 categorical + 1 quantitative-ratio

Visual variables: Position, size, color-hue

Evaluating our Design

Evaluating our data visualization allows us to assess how effectively it communicates the message and insights we intended for the reader.

Good data visualization should be clear, accurate, relevant, impactful, and well-received by the audience, while a bad data visualization can be misleading, unclear, or ineffective in communicating its message.

Paying attention to the finer details of our work will safeguard the project’s integrity.

When evaluating our visualizations, we should be mindful of a few things:

  • Visual inference: Make sure that our visualization helps the audience quickly understand the context and interpret the meaning of what they see in our data.
  • Clarity: The primary purpose of data visualization is to effectively communicate insights and information. Evaluation helps us determine whether the visualization is clear, easy to understand, and effective at conveying our message.
  • Accuracy: Evaluation helps us determine whether the visualization accurately represents the data. This includes checking for errors, such as incorrect labels or scales, and verifying that the data is represented in a meaningful way.
  • Formatting accuracy: Check the consistency of our typography in terms of font type, style, and size. Make sure our color usage is consistent down to the RGB or CMYK code level.
  • Annotation accuracy: We should read through all our titles, labels, introductory text, credits, and captions, and check any units that we have included. It’s not just about spelling or grammatical errors but also about checking that things make sense and are correctly expressed.
  • Relevance: Evaluation helps us determine whether the visualization is relevant to the audience and their needs. This includes considering whether the data and insights presented matter to the audience and whether the visualization is engaging and informative.
  • Impact: Evaluation helps us determine whether the visualization is having the desired impact. This includes considering whether it communicates our insights effectively and whether it leads to informed decisions or actions.
  • Feedback: Evaluation gives us an opportunity to gather feedback from the audience so we can make improvements and ensure that the visualization communicates our message and insights effectively.
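The formatting-accuracy check on color codes is one item in the list above that can be partly automated. The following is a minimal sketch, assuming colors are recorded as hex RGB strings; the function names `normalize_hex` and `off_palette` are hypothetical, and CMYK values are not handled.

```python
import re

# Accepts 3- or 6-digit hex colors, with or without a leading '#'.
HEX_COLOR = re.compile(r"^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")


def normalize_hex(color):
    """Return a color as lowercase '#rrggbb', expanding 3-digit shorthand."""
    match = HEX_COLOR.match(color.strip())
    if not match:
        raise ValueError(f"not a hex color: {color!r}")
    digits = match.group(1).lower()
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits


def off_palette(used_colors, palette):
    """List the colors used in a design that are not in the agreed palette."""
    allowed = {normalize_hex(c) for c in palette}
    used = {normalize_hex(c) for c in used_colors}
    return sorted(used - allowed)


if __name__ == "__main__":
    palette = ["#1f77b4", "#ffffff"]
    used = ["#1F77B4", "#1f77b5", "fff"]
    # '#1f77b5' differs from the palette blue by a single digit.
    print(off_palette(used, palette))
```

Near-miss colors like the one in the example are hard to spot by eye, which is why checking at the code level is worthwhile.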

Using Visual Analysis to find Stories

After visualizing our data, we turn to editorial focus, which is the bridge that connects data work and design work. This is when we prepare to publish our findings. Editorial focus is about how we can tell a story through our visualizations. Fundamentally, this is about using visual analysis to find stories. The journalistic capability for unearthing the most relevant stories from data is a talent that any designer should aspire to develop, and there are numerous ways of telling data stories.

In the following example, we see a visualization project that was developed to enlighten people about education around the world, presenting some striking facts and figures:

The strength of this particular project comes from the scoping and definition of the chosen narrative and slices of analysis. Rather than bombarding the reader with endless pages of facts and figures, or offering seemingly infinite combinations of interactive variable selections, the subject is framed for us around a small number of interesting angles about education: literacy by region, literacy rates by country/gender, enrollment ratios, and expenditure on education versus military. As we navigate through each story panel, we are presented with a series of explanatory visualizations. They don’t just show data; they present and explain it.

This is a scatter plot of education spend versus military spend for all countries. The designer takes responsibility for telling the story, providing effective written annotation (labeling and captions) and visual annotation (reference lines and background shading) to help maximize the potential insights. The inclusion of filtering features to highlight particular countries and regions introduces an exploratory dimension that enables the discovery of further layers of understanding. This is a strong demonstration of editorial focus and storytelling with data. What we see with this project is a visualization that answers “data questions”. Data questions are the lines of interrogation and the dimensions of interpretation users will likely seek to pursue when reading a visualization design.

Editorial focus is more than just framing a story; it is about the specific insights we are making accessible. We want our visualization to be able to respond to the most likely and relevant questions a user will raise about the data and the subject matter.

With that being said, as data scientists, let us implement good data visualizations to gain valuable and comprehensible insights from our data.

References

Kirk, A. (2012). Data Visualization: A Successful Design Process. Packt Publishing Ltd.

Data Visualization Color Pitfalls

Bad Data Visualization Examples

Misleading Data Visualization Examples

Worst Data Visualizations Ever Created
