Democratizing Data through Visualisation

Looking for stories beyond numbers.

Aldo Pradana
Tokopedia Data
Nov 5, 2019

“Data! Data! Data! I cannot make bricks without clay!”

The famous detective Sherlock Holmes once cried out these words impatiently. The lack of detail in a case he was trying to solve was, unusually, holding back his highly inquisitive mind.

Written in one of Sir Arthur Conan Doyle’s stories about Mr. Holmes in 1892, these words could not be more relevant today. This is especially true here at Tokopedia, where every analysis to identify issues, every approach to solving problems, and every initiative to make a better impact needs to be evidence-based: as data-driven as possible.

That is why we, twelve Newkamas in the first batch of the Advanced Analytics Academy, are here. The program, shortened to AAA, A3, or A-Cube as we often call it, is a three-month development program for fresh university graduates who come from various disciplines yet share one interest: data.

A-Cube and Data

In A-Cube, we were exposed to projects of many different scopes in the Tokopedia Data Office. Mentored by expert Nakamas within the Data Office structure, we experienced their roles as Data Analysts, Business Intelligence, Data Engineers, and Data Scientists. Here, we witnessed firsthand how Tokopedia’s businesses rely, to a significant extent, on data and analytics.

Tokopedia’s 2019 Advanced Analytics Academy Trainees

Besides this on-the-job experience, we also had a classroom-style experience where we learned end-to-end data work: from converting and preparing raw data into a more structured form on which analysis and modeling can be performed, to transforming the processed data into visualisations with insightful stories that drive more informed decision-making for data users.

In this article, we would like to share our experience of the latter: applying what we learned in one of our classes, Storytelling with Data, in a humble effort to design an infographic covering the current state of education in our beloved country, Indonesia.

First, though, what is Storytelling with Data all about?

Essentially, data is just a collection of numbers until it is turned into a story. Showing data in its ‘basic’ and unprocessed form is often not the best way to tell the story in it. Without adding a narrative to the data, it can be too overwhelming for the audience to observe.

For example, the table above and the line chart on the left contain the same data: the percentage of formal education participation for all educational-level age groups in Indonesia from 1994 to 2018.

Compared to displaying the data in tabular form, plotting the same data as line charts helps viewers get more information beyond the numbers; in this case, we can also see the upward trend over the years.

This reflects the importance of wisely choosing the type of visualisation so that it can tell a better story of the data.

Now, to tell stories with data effectively, several aspects need attention.

Knowing the Audience

First, we need to identify our target audience: the people we are going to tell our story to. Understanding what they seek in the data, their background, and their capacity helps us determine how deep we need to dive into the data when we present it, that is, how detailed and technical our presentation should be. This, in turn, tells us whether we have ‘enough’ information in our data, or whether we need to seek more data to satisfy the information needs of our intended audience.

In the case of our infographic, our target is a general, non-expert audience. This means that we want the information in our infographic to be easy, direct and fast to consume, yet still give a relevant and informative overview of Indonesia’s state of education.

Choosing the Appropriate Visualisation

As mentioned before, we need to be wise in choosing the ‘right’ type of visualisation. Here, we decide which chart or visualisation suits our audience given the type of information. It is important to create intuitive visualisations; avoid visualisations that are hard to understand at first glance.

In creating this infographic, we came across several types of information, each with its own appropriate visualisation type.

  • Simple Textual Data

In cases where we have just a number or two to share, simple text can be a great way to communicate. The fact that we have some numbers does not mean that we need graphs to display them. A single number, supported by a few words, can be prominent and clear enough to make a point.

  • Line Graph

Line graphs are most commonly used to plot continuous data, often when the continuous data is in some unit of time. This is due to the nature of a line graph, which connects one point to the next with a line, implying a connection between the points that may not make sense for categorical data. This connection helps us identify trends and patterns in data within a certain time frame.

  • Bar Chart

While a line chart is often used to visualise continuous data over time, a bar chart is often used for categorical data. Note that, because our eyes compare the relative endpoints of the bars, it is important that bar charts always have a zero baseline to keep the visual comparison consistent. It is also helpful to sort the bars from highest to lowest (or vice versa) to ease comparison between the bars.

  • Geographical Map Data

Example of geographically represented data

Geographical maps are often used to display data with regional variables, showing how a certain measure compares from one region to another. They allow us to see location-based patterns in the data that might be overlooked without the geographical display.

When choosing a graph to visualise our data, sometimes we want to make it fancy with complex aesthetics. While aesthetics are important, we need to make sure that they are not there just for show; they are there to help the audience understand the main point we want to share more easily and quickly.
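As a rough illustration of the bar chart guidance above, the sketch below sorts categories from highest to lowest and draws each bar from a zero baseline, using plain text so it runs without any plotting library. The function name `prepare_bars`, the `width` parameter, and the sample numbers are all assumptions made for this example; they are not part of our infographic.

```python
def prepare_bars(values, width=40):
    """Sort categories from highest to lowest and scale bars from a zero baseline.

    `values` maps a category label to a non-negative number.
    Returns a list of (label, value, bar) tuples, longest bar first.
    """
    if not values:
        return []
    if any(v < 0 for v in values.values()):
        raise ValueError("this sketch assumes non-negative values")

    # Sorting makes neighbouring bars easy to compare.
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    top = ordered[0][1]

    rows = []
    for label, value in ordered:
        # Length is proportional to the value itself (not value minus the
        # minimum), so every bar starts at zero and lengths compare honestly.
        length = round(width * value / top) if top else 0
        rows.append((label, value, "#" * length))
    return rows


# Placeholder numbers for illustration only, not figures from the infographic.
sample = {"Region A": 30, "Region B": 120, "Region C": 60}
for label, value, bar in prepare_bars(sample):
    print(f"{label:<10} {bar} {value}")
```

The same sorted values could be handed to a plotting library instead of being printed; the point is only that ordering and a zero baseline are decided before anything is drawn.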

Focusing Audiences’ Attention

Another way to help the audience understand the main points more easily and quickly is to emphasize the important information, making it stand out from the rest.

People have something called pre-attentive processing: a subconscious process in which the brain gathers information the moment we observe something new. It is quick and effortless, and it happens before the brain moves to a higher level of processing to select and focus on what is important in what it observes. We can use pre-attentive attributes, the visual properties picked up at this stage, to our advantage by highlighting important information in the visualisation with distinct colors, sizes, and thicknesses.

Using such visual cues enables our audience to see what we want them to see before they even know they’re looking at it, helping them capture important information in the blink of an eye, before higher-level processing begins.

We can apply these pre-attentive attributes to both textual and graphical information.

Example of visual cues in textual information
Example of visual cues in graphical information
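One simple way to apply color as a pre-attentive attribute is to give the single item we want noticed an accent color and mute everything else. The sketch below shows that idea in plain Python. The helper name `highlight_colors`, the hex values, and the education-level labels are assumptions chosen for illustration.

```python
ACCENT = "#42b549"  # a green accent; the exact hex value is an assumption
MUTED = "#c8c8c8"   # neutral grey for everything that is only context


def highlight_colors(labels, focus):
    """Return one color per label: accent for the focus label, grey for the rest."""
    if focus not in labels:
        raise ValueError(f"{focus!r} is not one of the labels")
    return [ACCENT if label == focus else MUTED for label in labels]


# Hypothetical category names, used only to show the call.
levels = ["Primary", "Junior secondary", "Senior secondary", "Tertiary"]
colors = highlight_colors(levels, focus="Tertiary")
for level, color in zip(levels, colors):
    print(f"{level:<18} {color}")
```

The resulting list can be passed as the per-bar or per-line colors of whatever charting tool is in use. Restricting the accent to one item is a deliberate choice: if everything is highlighted, nothing stands out.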

Declutter Graphs

Sometimes, we want to make our visualisation as informative as possible. We add another dimension from our data in the hope that it will make the visualisation more informative. However, we need to ask ourselves: how much is too much information?

Adding too many dimensions or unnecessary cues to our visualisation can make the overall visual feel disorganised and uncomfortable to look at, burdening the audience with the work of processing it.

Left: the original chart, Right: the decluttered version

For example, in our case above, we have the original chart on the left and a decluttered version of the same chart on the right. By removing some minor ‘unnecessary’ parts, namely the grid lines and the excess axis labels, we can improve the clarity of the visualisation without making the chart any less informative. The result is a clean, intuitive, clearly marked visualisation that is easy on the eyes.
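Cleaning up axis labels often comes down to showing fewer of them. As a minimal sketch, assuming a time axis with one tick per year, the helper below keeps every n-th label plus the final one and blanks the rest. The name `thin_tick_labels` and the chosen step are hypothetical, and this covers only the axis-label part of decluttering, not the removal of grid lines.

```python
def thin_tick_labels(labels, step):
    """Keep every `step`-th label plus the last one; blank out the rest.

    The returned list has the same length as `labels`, so tick positions
    stay where they are and only the text is removed.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    last = len(labels) - 1
    return [
        str(label) if i % step == 0 or i == last else ""
        for i, label in enumerate(labels)
    ]


years = list(range(1994, 2019))  # the 1994 to 2018 range mentioned earlier
visible = [label for label in thin_tick_labels(years, step=6) if label]
print(visible)
```

Keeping the last label ensures the reader can always see where the series ends, even when the step does not divide the range evenly.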

Think Like Designers

Be artsy: provide stimulating, catchy images to beautify the infographic and attract audiences’ attention.

Other dos and don’ts:

  • Use similar color palettes; do not put too many colors in one display. In our infographic, we mainly chose the green-ish Tokopedia palette.
  • Try to arrange the infographic in a logical order. Let the story flow.
  • Avoid using pie charts; they make it hard to judge the relative sizes of the slices.
  • Avoid using 3D charts; they may be fancy, but the extra dimension rarely adds any informational value.

The tips and tricks we learned and applied are based on the book Storytelling with Data: A Data Visualization Guide for Business Professionals by Cole Nussbaumer Knaflic. The ones mentioned in this article are only a select subset of the material that we applied in designing our infographic. The book covers more in-depth explanations and further examples of visualisations for anyone interested in learning more.

Our final infographic:

Data Sources

https://www.bps.go.id/subject/28/pendidikan.html

https://data.go.id/dataset?q=pendidikan&sort=score+desc%2C+metadata_modified+desc

https://data.worldbank.org/country/indonesia

https://www.theglobaleconomy.com/rankings/Student_teacher_ratio_primary_school/

http://visual.kemenkeu.go.id/anggaran-pendidikan-apbn-2019/

https://www.kemenkeu.go.id/media/11213/buku-informasi-apbn-2019.pdf

https://www.antaranews.com/berita/806586/kemendikbud-dana-bos-naik-rp800-miliar-pada-2019

http://www.dpr.go.id/doksetjen/dokumen/biro-apbn-apbn-Pembangunan-Bidang-Pendidikan-Perencanaan-Yang-Lebih-Fokus-dan-Berorientasi-Ke-Timur-Indonesia-Merupakan-Solusi-Atasi-Kesenjangan-dan-Percepat-Pencapaian-Target-Nasional-1434364286.pdf

With tutelage and feedback from Storytelling with Data facilitators: Kak Mellisa, Kak Qlea, Kak Chyntia, Kak Alviani, Kak Shahnaz and Kak Arie; joyfully designed and written by Aldous, Dian, Amanda, Aldo.
