Visualization Tips for Data Story-Telling

Meredith Wang
10 min read · Oct 10, 2022

To Do & Not To Do, Good & Bad Charts

Image by Author

As someone with a background in fine art and photography who later pivoted to data science, I find myself ‘aesthetic-driven’ when it comes to delivering dashboards/presentations to stakeholders.

“Maybe stories are just data with a soul.”
— Brené Brown

Visualizations, along with story-telling, play a big part in the entire data science pipeline. These two components are what give data a ‘soul’. Every data analyst has the analytical skills to do the analysis, and every data scientist has the ability to call scikit-learn functions to build models. Those skills are key to solving business problems, but no matter how deep your analysis is or how fancy your models are, everything weighs less than a speck of dust without strong visuals and a compelling storyline to back it up.

I don’t know who else can relate to this, but I often find myself struggling to focus on the project itself when the visuals in someone’s presentation aren’t aligned with its purpose, or when simply “something is a little off.” When someone has stellar visuals, on the other hand, they quickly pull me into the story they are trying to tell.

In this article, I would like to share some visualization tips that have been working for me, and some lessons I learned from my mistakes. I will walk you through examples from my past projects and visuals/dashboards I’ve seen on Tableau Public, covering 4 main aspects: Color, Text, Size, and Emphasis.

(NOTE: The examples listed in this article are included solely for technical purposes. Please do not take any judgment personally if your work is included.)

COLOR

Imagine you are going out and meeting new people. How does someone’s clothing color choice play a part in your perception of them? Color can invite interpretation of someone’s personality and emotion. That, along with other visual elements, would be the very first thing you notice before getting to know the person.

The same thing goes for data presentations.

When you are presenting your visuals, the first thing that will grab the audience’s attention is the color. Therefore, having a consistent and appropriate color scheme/theme throughout is crucial.

Keep it Simple

One of the most important fashion rules is “wear no more than 3 different hues”. For dashboards and data presentations, a limited number of color hues is also preferred. Each color that exists on your dashboard/visual should have a justified reason to be there. Before you decide to add another color, ask yourself this question: does it help communicate something to the audience that they otherwise could not comprehend? If the answer is no, or you can’t explain it to yourself, then DO NOT add the color.

Here’s an example I saw recently:

Source: http://citydashboard.org/london/

Notice there are at least 6 color hues screaming at you, along with many large, emphasized KPI figures. The diverse usage of color ruins the overall visual harmony, and it’s very distracting to the human eye.

Compare that to this example by Katie Kilroy:

Source: Tableau Public — Katie Kilroy

We see that the background color plus the main color of the visuals account for only 2 color hues throughout this dashboard. The way she cleverly combines the two colors makes the important information stand out and allows viewers to navigate easily.

Appropriate Color Scheme

Now that you know how many color hues to use, let’s talk about which colors to choose. Here we will discuss the most common types of color scheme and when to use them.

  • Monochromatic Color Scheme

A monochromatic color scheme has only ONE color hue, with a variety of tints, shades, or tones. Once you have a base color in mind, you can simply type the hex color code into a search engine, and you will see a variety of color palettes related to that color. For example, if I want to use the color #947ec3, I would search for this color code to see different shades and tints of the color and compose a monochromatic color palette based on my preferences. Color Hex (image below) is one of my go-to websites.

Monochromatic Color Palette Example — Source: Color Hex
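
If you would rather stay in your notebook than use a website, here is a minimal sketch of one way to build a similar palette with Seaborn. It assumes Seaborn is installed and uses the #947ec3 base color from above; the choice of 5 steps is arbitrary.

```python
import seaborn as sns

BASE_HEX = "#947ec3"  # base color mentioned in the text

# light_palette blends from a very light tint up to the base color
tints = sns.light_palette(BASE_HEX, n_colors=5)

# dark_palette blends from a very dark shade up to the base color
shades = sns.dark_palette(BASE_HEX, n_colors=5)

# Convert to hex strings so they can be reused in other tools
tint_hex = tints.as_hex()
shade_hex = shades.as_hex()

print(tint_hex)
print(shade_hex)
```

From there you can pick the tints and shades you like and combine them into your own monochromatic palette.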

The most common use cases for a monochromatic color palette are visualizing a numerical value vs. categorical features, and the distribution of categorical features.

The bar chart below is one of the visuals from my recent collaborative NLP project. It shows the length (numerical) of README.md files for programming languages (categorical) related to the Metaverse on GitHub. By using a monochromatic color palette and assigning the darkest shade to the variable with the highest count and vice versa, viewers can easily compare the bars.

Bar Chart — Image by Author

Make sure the order of color tones/shades corresponds with the order of the numerical values. Below is a bad example I saw on Tableau Public:

Source: Tableau Public — 전라남도 공공보건의료지원단

In the lower graph, we can see the darker shade is correctly used to show the bar with the largest value. But in the upper graph, the color shades do not follow the order of the quantitative values. A small mistake like this might mislead or confuse the audience: Which exactly is more? What’s your emphasis?
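
One way to avoid this mistake in code is to sort the values first and then assign the shades in the same order. The sketch below is only an illustration: the category names and numbers are made up, and it assumes Pandas, Matplotlib, and Seaborn are available.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical values for illustration only, not data from the project
lengths = pd.Series(
    {"Language A": 120, "Language B": 450, "Language C": 300, "Language D": 80}
)

# Sort so the bars run from the largest value to the smallest
lengths = lengths.sort_values(ascending=False)

# reverse=True puts the darkest shade first; asking for two extra colors
# and dropping them keeps the lightest bar from fading into the background
palette = sns.light_palette("#947ec3", n_colors=len(lengths) + 2, reverse=True)
palette = palette[: len(lengths)]

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(lengths.index, lengths.values, color=palette)
ax.set_title("Hypothetical README Length by Language")
ax.set_xlabel("Language")
ax.set_ylabel("README length")
```

Because the sorting and the palette share the same order, the darkest bar is always the largest one.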

  • Complementary Color Scheme

A complementary color scheme has two color hues that sit opposite each other on the color wheel. The image below is a great illustration of complementary colors, including some commonly used color pairs.

Complementary Colors — Source: .Too

As the name implies, this color scheme is great for visualizing comparisons between categorical features. The animated bar chart below is from my Customer Churn Prediction project. It shows the difference between customers who churn vs. those who do not.

Bar Chart — Image by Author

It’s also a great option to show the range of numerical features. The example below is from Data Viz for Nonprofits. We can see how the author uses two complementary colors to represent the two opposite groups, and different shades of the two hues to represent the quantitative value.

Source: Tableau Public — Data Viz for Nonprofits

We also frequently see a complementary color scheme used in heatmaps, with two colors representing the two extremes (positive and negative). The example below is from my Home Value Prediction project, visualizing correlations between features.

Image by Author

Keep in mind that when you use a complementary color scale in a heatmap, the two color hues should be associated with the right side of the scale: warmer colors (red, orange, etc.) as positive and cooler colors (blue, green, etc.) as negative. Darker hues (higher saturation) represent a stronger relationship, whereas lighter hues (lower saturation) represent a weaker relationship.

In the example below, notice that the color scale of the “Coverage Power” is backwards. That’s the opposite of how our brains associate colors with certain descriptive words.
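
As a rough sketch of that rule in Seaborn, the snippet below draws a correlation heatmap with the built-in "coolwarm" colormap, which runs from blue (negative) to red (positive). The data is randomly generated and the feature names are placeholders, so treat this as an illustration rather than the code behind my chart.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Random placeholder data, only to have something to correlate
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(100, 4)),
    columns=["feature_a", "feature_b", "feature_c", "feature_d"],
)
corr = df.corr()

fig, ax = plt.subplots(figsize=(7, 6))
sns.heatmap(
    corr,
    cmap="coolwarm",   # cool blue for negative, warm red for positive
    vmin=-1, vmax=1,   # symmetric limits keep 0 at the neutral midpoint
    annot=True, fmt=".2f",
    ax=ax,
)
ax.set_title("Correlation Between Features (placeholder data)")
```

Fixing the limits at -1 and 1 matters: without them, the neutral color can drift away from zero and the two hues no longer line up with negative and positive.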

Source: Tableau Public

Here is another classic use case for a complementary color palette: the geo map.

Similarly, we can see how the two color hues are used to visualize the two sides of the scale, and the color shades are associated with the numerical value on each side.

Source: Tableau Public — Elias

Be Mindful

It’s very common to see people leave their visuals’ colors at the default. You have probably already seen these colors a million times, if not more. I’m not saying there’s anything wrong with these colors, but it only takes one line of code to make your graph prettier and more unique.

Image by Author

There are tons of color palettes out there for you to explore, and you can also customize your own palette. If you want to know which built-in palettes Seaborn has available, please refer to this article where I listed all the color palette values.

All in all, be mindful of your color choices. Color is one of the most important components that gives your work a personality and makes your visuals more “alive”.
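
For reference, here is a small sketch of what that one line can look like in Seaborn. "Set2" is just one built-in palette picked for illustration, and the custom hex codes (apart from #947ec3) are arbitrary placeholders.

```python
import seaborn as sns

# One line: switch every following plot to a built-in palette
sns.set_palette("Set2")
n_builtin = len(sns.color_palette())  # number of colors now in the cycle

# Or pass your own list of hex codes to use a custom palette
custom_colors = ["#947ec3", "#7ec394", "#c3947e"]
sns.set_palette(custom_colors)
n_custom = len(sns.color_palette())

print(n_builtin, n_custom)
```

Calling sns.color_palette() with no arguments returns the current color cycle, which is a handy way to check what your plots will use.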

TEXT

Your dashboard/presentation is NOT a white paper. One of the biggest taboos is to have a screen full of text. Other than important key points and annotations, everything else is only going to be a distraction. If you have already eliminated the extra text and what remains is still text-heavy, consider incorporating white space. Below is a great example.

Source: SlidesCarnival

Label

When you are creating the graph yourself, nobody is more familiar with the dataset than you are. You probably know every single detail of your graphs (which you should) without any annotations, but the audience does not. Even when you have provided verbal/written context, the best practice is still to have a proper title and a label for each axis on every chart in your dashboard/presentation.

Here is an example from my recent Social Media Engagement Forecasting project, where the highlighted portions are the “must haves”.

Image by Author
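
In Matplotlib, those “must haves” are only a few calls. The sketch below uses made-up numbers and label text purely for illustration; it is not the data or code from the project above.

```python
import matplotlib.pyplot as plt

# Hypothetical weekly numbers, for illustration only
weeks = [1, 2, 3, 4, 5, 6]
actual = [120, 135, 150, 160, 155, 170]
forecast = [118, 138, 148, 158, 162, 168]

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(weeks, actual, label="Actual")
ax.plot(weeks, forecast, linestyle="--", label="Forecast")

# The must-haves: a title, a label for each axis, and a legend
ax.set_title("Weekly Engagement: Actual vs. Forecast (hypothetical)")
ax.set_xlabel("Week")
ax.set_ylabel("Engagement count")
ax.legend()
```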

SIZE

This sounds self-explanatory, but I can’t begin to tell you how many times I have struggled to see the graph during presentations. A lot of people leave the size of the chart and text at the default. This might not bother you much when the chart is solely a reference for yourself, but make sure the size is properly adjusted when you’re presenting visuals to others, especially on a big screen.

If you are creating the visuals with Seaborn or Matplotlib in a Jupyter notebook, you can easily adjust the size in a few lines of code.

Here’s a side-by-side comparison of a small chart and a decently sized one:
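
A minimal sketch of those few lines is below. The specific sizes (14 by 8 inches, 14 to 20 point text) are my own assumptions for a presentation screen, and the plotted numbers are placeholders, so adjust them to your setup.

```python
import matplotlib.pyplot as plt

# Option 1: change the defaults for every chart in the notebook
plt.rcParams["figure.figsize"] = (12, 7)
plt.rcParams["font.size"] = 14

# Option 2: size one specific chart explicitly
fig, ax = plt.subplots(figsize=(14, 8))
ax.plot([1, 2, 3, 4], [10, 20, 15, 30], label="Series A")  # placeholder data
ax.plot([1, 2, 3, 4], [12, 18, 22, 25], label="Series B")  # placeholder data

# Make the text readable from the back of the room
ax.set_title("Placeholder Title", fontsize=20)
ax.set_xlabel("X label", fontsize=16)
ax.set_ylabel("Y label", fontsize=16)
ax.tick_params(labelsize=14)

# Put the legend outside the plot area so it never covers the data
ax.legend(loc="upper left", bbox_to_anchor=(1.02, 1), fontsize=14)
fig.tight_layout()
```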

Image by Author

In the small chart, the legend is squeezed into the middle of the graph, and that’s a NO-NO. You want everything, including the title, legend, and axis labels, to be clearly visible.

EMPHASIS

What are the one or two most important things/KPIs you want your audience to take away in 5 seconds? Show them loud and proud on your dashboard/presentation.

Source: Tableau Public — LA NACIÓN

In this dashboard by LA NACION Data, we can clearly see the two big numbers next to the profile pictures. By enlarging the key numbers, you’re directing your audience by saying “hey, look here!”

Otherwise, when there are multiple graphs and no strong emphasis on one or two things, it’s hard for the audience to recognize what’s important.

Source: Tableau Public

Some immediate questions that came to mind when I saw this dashboard were:

  1. Which graph should I look at first?
  2. Which color represents high score (red vs. blue)?
  3. Do all the red blocks represent the same scale (top left chart)?
  4. What’s the bottom left chart trying to tell me?

Besides the multiple color hues and inappropriate use of color scale, the fact that the 4 charts are equally sized implies there’s no hierarchy in this dashboard. The author could easily add a couple of large figures to indicate, for example, the user counts for the top games (e.g. “Cory in the House”).

The example below is the longest bar chart I have ever seen. Due to image size limitations, I was only able to show you 1/4 of the original chart. In cases like this, when you have tons of variables on your y or x axis, displaying everything on your chart is never the best idea.

Source: Tableau Public

You can limit the noise by focusing on the top 10 or the bottom 10 and getting rid of the rest. Your audience will have neither the time nor the energy to zoom in on a lengthy chart and digest all the information. By strategically putting an emphasis on a small portion of your data, you will be able to make better use of the color scheme as well.
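
With Pandas, trimming a long chart down to the top 10 takes a single method call. The 40 categories below are synthetic placeholders I made up for the sketch; with real data you would start from your own Series or DataFrame column.

```python
import matplotlib.pyplot as plt
import pandas as pd

# 40 synthetic categories with values 1..40, for illustration only
counts = pd.Series(
    range(1, 41),
    index=[f"category_{i}" for i in range(1, 41)],
)

# Keep only the 10 largest values (use nsmallest for the bottom 10)
top10 = counts.nlargest(10)

# Sort ascending before plotting so the largest bar ends up on top
fig, ax = plt.subplots(figsize=(10, 6))
top10.sort_values().plot.barh(ax=ax, color="#947ec3")
ax.set_title("Top 10 Categories by Count (synthetic data)")
ax.set_xlabel("Count")
ax.set_ylabel("Category")
```

Ten bars fit comfortably on one screen, which also leaves room to apply a monochromatic palette if you want the ranking to stand out even more.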

BONUS

AVOID using pie charts!!! You have probably been told repeatedly that a pie chart is great for representing the distribution/proportion of the whole. But in the data science field, there are a lot of negative comments about this particular chart. Below is an example I included in one of my presentations. Knowing that people dislike pie charts, I chose it intentionally and thought it was an appropriate use in this situation.

Most Common Languages in Metaverse — Image by Author

And of course, not surprisingly, I was asked why I used this chart.

A pie chart can be misleading when there are too many elements included, or when the sizes of two proportions are extremely similar to each other. In this case, if I didn’t use a darker shade for “text”, you might struggle to tell the difference between “Java” and “text”: which one is more?

Most Common Languages in Metaverse — Image by Author

In the bar chart above, by comparison, the quantitative differences are much clearer and you can immediately recognize the “most” and the “least”.

Many people, including myself in the past, tend to use “fancy charts” to show off their skills. But the truth is, a simple bar chart or scatter plot conveys the message most effectively in many situations. Oftentimes less is more: strong and effective story-telling starts with simple and clear charts.

I hope you enjoyed this article, and the tips are helpful for you. Please feel free to connect or reach out to me on LinkedIn. Talk soon :)

Meredith Wang

Data Scientist | Entrepreneur | Photographer | Visual Storyteller