How to choose a suitable diagram?

Zhong Xiaoxue
4 min read · Mar 16, 2022


When faced with a data set, it is easy to be unsure which diagram to use. The key point is that the chart is chosen not by the data alone, but by the message we want to convey.

The visualization you create depends on:

• The questions you are trying to ask

• The properties of your data

• How you want to present and communicate your insights to others

So how do we choose the right graph?

A useful framework for chart selection starts from five kinds of relationships in the data: composition, comparison, trend, distribution, and relation. Next, consider the number of categories and dimensions, and whether the data is static or changes over time. The framework then points to a suitable chart type, which can be built in a visualization tool such as Python, R, or Tableau. Finally, to highlight the point we want to make (for example, a significant drop in sales in one region, a significant increase in a product's market share, or an indicator exceeding its warning threshold), we can adjust the display details of the chart in the tool.
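The framework above can be sketched as a small lookup table. This is only an illustration in Python: the names `CHART_GUIDE` and `suggest_charts` are hypothetical, and the table simply restates the chart types discussed in this article.

```python
# Hypothetical lookup table summarizing the framework in this article.
CHART_GUIDE = {
    "composition": {
        "static": ["pie chart", "bar chart"],
        "over_time": ["stacked column chart", "area chart"],
    },
    "comparison": {"static": ["table", "column chart"]},
    "trend": {"over_time": ["line chart", "bar chart"]},
    "distribution": {
        "static": ["histogram", "scatter plot", "bubble plot", "map"],
    },
    "relation": {"static": ["scatter plot", "bubble chart", "radar chart"]},
}


def suggest_charts(relationship, over_time=False):
    """Return candidate chart types for one of the five relationships."""
    key = relationship.lower()
    if key not in CHART_GUIDE:
        raise ValueError(f"unknown relationship: {relationship!r}")
    options = CHART_GUIDE[key]
    status = "over_time" if over_time else "static"
    # If the guide has no entry for this status, use the one it does list.
    if status not in options:
        status = next(iter(options))
    return options[status]


print(suggest_charts("composition", over_time=True))
print(suggest_charts("trend"))
```

The function only narrows the candidates. Picking among them still depends on the message you want the chart to carry.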

Composition focuses on each part's percentage of the whole. For static data, if the message involves a “share”, a “percentage”, or “what percentage is expected to be reached”, a pie chart is a natural choice. Note, however, that a pie chart encodes values as the areas of its slices. When the proportions are close to one another, it is hard to judge by eye which slice is larger. In that case, a bar chart shows the differences in size more clearly. To show how a composition changes over time, use a stacked column chart or an area chart.
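As a rough sketch of the pie-versus-bar decision, the snippet below converts raw values to shares and suggests a bar chart when any two neighboring shares are nearly equal. The function names, the sample figures, and the 5-percentage-point threshold (`min_gap`) are assumptions chosen for illustration, not an established rule.

```python
def shares(values):
    """Convert raw values into fractions of the total."""
    total = sum(values.values())
    return {name: v / total for name, v in values.items()}


def pie_or_bar(values, min_gap=0.05):
    """Suggest a bar chart when two neighboring shares differ by less than min_gap."""
    ordered = sorted(shares(values).values(), reverse=True)
    gaps = [a - b for a, b in zip(ordered, ordered[1:])]
    return "bar chart" if any(g < min_gap for g in gaps) else "pie chart"


# Hypothetical sales figures by region.
distinct = {"North": 50, "South": 30, "East": 20}
close = {"North": 26, "South": 25, "East": 25, "West": 24}

print(pie_or_bar(distinct))  # shares are well separated
print(pie_or_bar(close))     # shares are nearly equal
```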

Comparison shows how items rank against one another: are they about the same, or is one larger or smaller than another? When the amount of data is large and there are many types, a table can be used. To compare many categories, you can choose among the various types of column charts.
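To make the idea of ranking concrete, here is a minimal sketch that sorts categories by value and draws them as text bars, a stand-in for a real bar or column chart. The function name `text_bars` and the product figures are hypothetical.

```python
def text_bars(values, width=20):
    """Return one text bar per category, largest value first."""
    largest = max(values.values())
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    lines = []
    for name, value in ranked:
        # Scale each bar relative to the largest value.
        bar = "#" * round(value / largest * width)
        lines.append(f"{name:<10} {bar} {value}")
    return lines


# Hypothetical units sold per product.
units = {"Product A": 10, "Product B": 40, "Product C": 20}
for line in text_bars(units):
    print(line)
```

Sorting before drawing is what makes the order readable at a glance. The same step applies before passing data to any plotting tool.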

Trend is the most common time-series relationship. It is concerned with how data changes over time: whether a weekly, monthly, or yearly series is increasing, decreasing, fluctuating, or essentially unchanged. A line chart is usually the best way to show an indicator's trend over time. If there are only a few dates, a bar chart can also show the change.
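The four trend labels and the line-versus-bar choice can be sketched as follows. Both function names are hypothetical, and the cutoff of six dates for "a few" is an assumption for illustration only.

```python
def describe_trend(values, tolerance=0.0):
    """Label a series as increasing, decreasing, unchanged, or fluctuating."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(abs(d) <= tolerance for d in diffs):
        return "unchanged"
    if all(d >= -tolerance for d in diffs):
        return "increasing"
    if all(d <= tolerance for d in diffs):
        return "decreasing"
    return "fluctuating"


def trend_chart(n_dates, few=6):
    """Suggest a bar chart for a handful of dates, a line chart otherwise."""
    return "bar chart" if n_dates <= few else "line chart"


# Hypothetical monthly values.
monthly = [120, 125, 131, 140, 152, 160, 171, 180]
print(describe_trend(monthly), "->", trend_chart(len(monthly)))
```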

Distribution is concerned with how many items fall into each value range. Typical messages mention “concentration”, “frequency”, or “distribution”. Here, a histogram, a scatter plot, and a bubble plot can be used for data with one, two, and three variables respectively. For data with a geographic location, a map can also show how the distribution differs by place.
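Counting how many items fall into each value range is exactly what a histogram does. The sketch below uses equal-width bins. The function name, the bin count, and the sample values are assumptions for illustration, and real plotting libraries offer more refined binning rules.

```python
def histogram(values, n_bins=5):
    """Count values in n_bins equal-width ranges; return (edges, counts)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1 when all values are identical.
    width = (hi - lo) / n_bins or 1
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical order sizes.
orders = [1, 2, 2, 3, 3, 3, 4, 4, 5, 10]
edges, counts = histogram(orders, n_bins=3)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:>4.1f} to {right:>4.1f}: {'#' * count}")
```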

Relation is about whether two or more variables follow the pattern we expect to demonstrate. For example, we may expect sales to increase as the discount rate increases. A scatter plot can show this; it suits messages such as “related to…”, “increases with…”, or “varies with…”. When more variables are involved, bubble charts and radar charts can be used.
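A scatter plot shows the pattern visually, and a correlation coefficient can back it up with a number. The sketch below computes the Pearson correlation by hand for the discount example. The function name and the data are hypothetical, and the coefficient only measures linear association.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical discount rates (percent) and resulting sales.
discount = [0, 5, 10, 15, 20]
sales = [100, 110, 125, 135, 150]

r = pearson(discount, sales)
print(f"correlation between discount and sales: {r:.3f}")
```

A value close to 1 supports the message "sales increase with the discount rate". A value near 0 suggests the scatter plot will not show a clear linear pattern.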

In conclusion, delivering data visualization work in industry requires experience in choosing the chart that best expresses the intended message. Knowing the characteristics of different chart types makes the visualizations in your reports more logical and effective.

Reference:

https://help.tableau.com/current/pro/desktop/en-us/what_chart_example.html
