Data Noir

“With my data and your visuals, we could go places.”

Stephen White
4 min read · Jul 27, 2018

Understanding the different data types will help you make better decisions about how data should be visualized.

Pairing a visualization with the wrong data type can hurt the effectiveness of your visual. Not understanding your data may cause you to slice and dice it the wrong way. Good news—it’s fairly intuitive to identify when you’ve chosen a complete mismatch of data type + visual… or perhaps applied a weird operation to your data.

Does it make sense to visualize ratio data in a 🥧 chart? Before we decide, let’s get familiar with some data types!

Nominal

Nominal data can also be referred to as categorical or qualitative—there’s no numerical value or apparent order. Think of music genres… Hip-Hop, R&B, Pop, etc. In isolation, they can’t be mapped to a number or ordered in any meaningful way.

When nominal data is visualized, it’s usually as a chart that highlights the frequency of certain values.

Let’s collect some data:

We surveyed 100 people and asked them what their favorite genre of music is.

Now there’s enough data to build a bar chart depicting the results. We can also give the genres a meaningful order: most frequent to least frequent. A pie/donut chart can be useful to highlight the percentages or parts of a whole (e.g. 44% of respondents prefer hip-hop). Sidenote: some people despise pie charts… but I think they’re okay when used appropriately.
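Here’s a rough sketch of that tallying step in Python. Only the 44 hip-hop votes come from the example above; the remaining counts (and the "Other" bucket) are made up so the survey adds up to 100.

```python
from collections import Counter

# Hypothetical survey responses: only the 44 hip-hop votes come from the
# example above; the other counts are invented so the total is 100.
responses = ["Hip-Hop"] * 44 + ["R&B"] * 25 + ["Pop"] * 20 + ["Other"] * 11

counts = Counter(responses)

# Nominal values have no inherent order, so we sort by frequency instead.
ordered = counts.most_common()

# Share of the whole for each genre, which is what a pie/donut chart shows.
total = sum(counts.values())
percentages = {genre: 100 * n / total for genre, n in ordered}

for genre, n in ordered:
    print(f"{genre:8} {n:3d}  {percentages[genre]:.0f}%")
```

The sorted counts are what a bar chart would plot, and the percentages are the slices of a pie/donut chart.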

Ordinal

In contrast to nominal data, ordinal data does have a meaningful order. A classic example is the Likert scale: Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree. However, assigning a number to one of the values still lacks meaning—you can arbitrarily decide to number those options 1–5 or 5–1. With ordinal data, you are usually looking to track the frequency/distribution of a given value.

Let’s collect some more data…

If you asked people who chose Hip-Hop as their favorite genre to name their favorite artist, most people would say, “Pusha T”. Using the Likert scale, do you agree or disagree?

After collecting answers, there would be sufficient data to create even more bar and pie charts. An important note about ordinal data is that the distance between each category may not be equal. Thus, it’s not recommended to calculate the mean of ordinal data; the median or mode is a safer summary. For example, if we were to use 1–5 (5 being Strongly Agree) and take an average of the responses… it’d be possible to get a result of 4.5. In that case, what would 4.5 really mean? Kinda-strongly agree 🤷‍♂️?
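To make that concrete, here’s a small Python sketch comparing the median and mode with the mean. The answers are invented for illustration, and the 1–5 coding (5 being Strongly Agree) is just one of the arbitrary numberings mentioned earlier.

```python
from collections import Counter
from statistics import mean, median_low

# One arbitrary numeric coding of the Likert scale (an assumption of this sketch).
CODES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}
LABELS = {code: label for label, code in CODES.items()}

# Hypothetical answers, invented for illustration.
answers = ["Strongly Agree", "Strongly Agree", "Agree", "Strongly Agree",
           "Neutral", "Agree", "Strongly Agree", "Disagree"]

ranks = [CODES[a] for a in answers]

# median_low always returns an actual data point, so it maps back to a label.
median_label = LABELS[median_low(ranks)]

# The mode is simply the most frequent answer; no numbers needed.
mode_label = Counter(answers).most_common(1)[0][0]

# The mean is computable, but it lands between labels and assumes equal gaps.
mean_rank = mean(ranks)

print(median_label, mode_label, mean_rank)
```

The median and mode both come back as real labels on the scale, while the mean falls somewhere between two of them.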

Anywho, visualizations that work for nominal data will usually be suitable for ordinal data.

Interval

Building upon Ordinal data… Interval data also has a meaningful order. Unlike Ordinal data, though, the distance between each value is known (hence the “interval”). An important distinction of Interval data is that it lacks a meaningful zero (or empty state). A common example is time. 1PM is one hour (distance) before (order) 2PM… but is there a real zero? Nope. Even midnight (00:00) is just another point on the clock, not an absence of time… unless you wanna get weird. Because of the numerical nature of Interval data, you can start busting out mathematical operations in order to visualize/highlight certain properties.

Let’s do the thing…

Let’s imagine we own a music streaming platform like Tidal or Apple Music, and we track the time of day that people listen to certain genres of music.

With this kind of data we can do a few cool things like:

  1. For a given day of the week, show a single metric that represents the average # of people listening to each available genre of music (as well as the Standard Deviation).
  2. For a given date range, visualize trends in popularity of different genres.
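The first idea could be sketched in Python like this. The listener counts are invented for illustration (a handful of samples taken across one day), and `listeners_by_hour` is a hypothetical name, not something from a real streaming API.

```python
from statistics import mean, stdev

# Hypothetical listener counts for one day, sampled at a few times of day.
# All numbers are invented for illustration.
listeners_by_hour = {
    "Hip-Hop": [120, 150, 180, 210],
    "R&B": [80, 90, 100, 110],
}

# One (average, standard deviation) pair per genre.
summary = {
    genre: (mean(counts), stdev(counts))  # stdev is the sample standard deviation
    for genre, counts in listeners_by_hour.items()
}

for genre, (avg, sd) in summary.items():
    print(f"{genre}: {avg:.0f} listeners on average (SD {sd:.1f})")
```

Each genre collapses into a single headline number plus a measure of spread, which is the kind of single metric a dashboard tile would show.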

You’ll often see Interval data as line and/or histogram charts due to their ability to identify trends or visualize distributions over time. Interval data is also well-suited for display as single metrics because of its flexibility when dealing with mathematical operations.

Ratio

Last but surely not least is Ratio data. It continues the trend of building on the characteristics of the previous types. Ratio is like Interval, but with an important quality—it has a meaningful zero. Length or duration is a good example. With Ratio data, you can use a few extra mathematical operations such as multiplication and division on the values.

If we want to collect Ratio data we could:

Track the amount of time spent listening to a particular artist or genre of music.

Remember I mentioned that time is Interval data? That still stands. The difference in this case is that I’m referring to the duration of time, which has a true zero point. For example, I spent 0 minutes listening to Hair Metal 🤘. With that in mind, we can see how operations like multiplication work for Ratio data but not for Interval. Three minutes is 3 times as long as one minute. But what’s 4pm x 3?
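Python’s standard library happens to model this split nicely: `datetime` is a point in time and `timedelta` is a duration. A quick sketch (the date is arbitrary, picked only so there’s a 4pm to work with):

```python
from datetime import datetime, timedelta

# A point in time (Interval): 4pm on an arbitrary date.
four_pm = datetime(2018, 7, 27, 16, 0)

# A duration (Ratio): it has a true zero, timedelta(0).
one_minute = timedelta(minutes=1)

# Durations can be scaled and compared as ratios.
three_minutes = one_minute * 3
ratio = three_minutes / one_minute  # a plain float

# Points in time support differences...
two_pm = datetime(2018, 7, 27, 14, 0)
gap = four_pm - two_pm  # a timedelta

# ...but multiplying one is not defined, so Python raises TypeError.
try:
    four_pm * 3
    multiplied = True
except TypeError:
    multiplied = False

print(three_minutes, ratio, gap, multiplied)
```

Subtracting two points in time gives you a duration, and durations are where multiplication and division start to make sense.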

Congrats! After reading this post, you now know your data types… and you won’t make the costly mistake of multiplying 4pm x 3 🎉, or visualizing Ratio data in a pie chart. But like I mentioned, operating on and visualizing various types of data is a fairly intuitive endeavor. Most importantly, it’s good to know why certain visualizations should be used (or avoided). Once you understand your data types, you can begin to make informed decisions about when it’s OK to do things like sorting a column chart by its Y-axis value (this works when the X-axis is Nominal). No matter how trivial, a good design makes sense and has logic to support its structure. Data visualization is no exception.

Thanks for reading! If you like this article, follow me on Twitter and keep in touch!

https://twitter.com/stevoscript
