GETTING STARTED | DATA VISUALIZATION | KNIME ANALYTICS PLATFORM

How to create a Sunburst chart in KNIME?

An easy how-to tutorial with just 3 nodes

Robin von Malottki
Low Code for Data Science

Designed by GarryKillian from Freepik.

In this article, I show how to create a sunburst chart using the open-source tool KNIME Analytics Platform. For this, I will use the Superstore dataset, which you can download from Kaggle.

What is KNIME?

KNIME (short for Konstanz Information Miner) is a free and open-source platform for data analytics, reporting, and integration. It provides a graphical user interface for designing and executing data workflows, and it is used for data preparation, data mining, text mining, predictive analytics, and machine learning. KNIME is designed to be user-friendly and flexible, so that different tools and data sources can be integrated into a single environment: you drag and drop nodes to build workflows and analyze data. Thanks to its modular architecture, users can also add their own custom nodes or extensions to extend its functionality.

What is a Sunburst Chart?

A Sunburst chart, also known as a radial treemap or multi-level pie chart, is a data visualization that is used to represent hierarchical data structures in the form of nested rings. Each ring in a Sunburst chart represents a level in the hierarchy and each segment within a ring represents a subcategory or a child node of the parent category. The size of each segment is proportional to the data value associated with it. The center of the chart represents the root node of the hierarchy, and the outermost ring represents the leaf nodes.

Sunburst charts are particularly useful when you have hierarchical data that you want to represent in a compact and visually appealing way. They provide a way to show the proportion of data values at each level in the hierarchy and also allow you to compare the sizes of different segments and subcategories within a hierarchy. Additionally, they provide an intuitive way to navigate the hierarchy, as you can zoom in on a specific segment by clicking on it to reveal its child nodes, or zoom out by clicking on the parent node.
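To make the proportionality rule described above concrete, here is a minimal Python sketch of how segment sizes could be derived from values. It is not how KNIME draws the chart internally; the function name `segment_angles` and the numbers are made up purely for illustration.

```python
def segment_angles(values):
    """Return each label's angular sweep in degrees, proportional to its value."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return {label: 360.0 * value / total for label, value in values.items()}


# Made-up leaf values keyed by (parent, child); not taken from any dataset.
leaves = {("A", "a1"): 30.0, ("A", "a2"): 20.0, ("B", "b1"): 50.0}

# Inner ring: each parent's value is the sum of its children's values.
parents = {}
for (parent, _child), value in leaves.items():
    parents[parent] = parents.get(parent, 0.0) + value

inner = segment_angles(parents)  # one segment per parent category
outer = segment_angles(leaves)   # one segment per leaf node

print(inner)
print(outer)
```

Because both rings are scaled against the same total, the child segments of a parent together span exactly the same angle as the parent itself, which is what makes the rings nest.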

Superstore Dataset

The Superstore dataset is a widely used sample dataset in the data analysis and visualization community that provides a realistic scenario for practicing and demonstrating data analysis and visualization skills.

An excerpt of the dataset looks like this:

Image by author.

Among other things, the data shows how much revenue was generated in each category and sub-category. Since a Sunburst chart needs hierarchical dimensions, these two columns are particularly suitable. As the metric, we simply choose the Sales column.

Before we can create a Sunburst chart, we need to prepare the data so that all dimensions are in a clear hierarchical structure.

Creating the workflow in KNIME

The workflow we want to create looks like this:

Image by author.

Before we can prepare the data, we must first load the Superstore data into the workflow. For this, I use the CSV Reader node. After loading the data, we aggregate the values: we sum up the sales so that we get one total per combination of category and sub-category. For aggregating rows like this, you usually use the GroupBy node in KNIME.
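For readers who think in code, the aggregation step can be sketched in plain Python. This is only an illustration of the logic, not what the GroupBy node runs internally. I assume the columns are named `Category`, `Sub-Category`, and `Sales`; the helper `sum_sales` and the sample rows are hypothetical.

```python
from collections import defaultdict


def sum_sales(rows):
    """Sum the Sales value for each (Category, Sub-Category) pair."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["Category"], row["Sub-Category"])
        totals[key] += float(row["Sales"])
    return dict(totals)


# Hypothetical sample rows, shaped like what csv.DictReader would yield.
rows = [
    {"Category": "Technology", "Sub-Category": "Phones", "Sales": "100.0"},
    {"Category": "Technology", "Sub-Category": "Phones", "Sales": "50.0"},
    {"Category": "Furniture", "Sub-Category": "Chairs", "Sales": "80.0"},
]

totals = sum_sales(rows)
for (category, sub_category), sales in sorted(totals.items()):
    print(category, sub_category, sales)
```

The result has one entry per category and sub-category pair, which is the shape the chart needs.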

We then receive the data in the following form:

Image by author.

Then, we use the Sunburst Chart node to create the chart. For the configuration, we only need to move the dimensions by which we want to split the metric, here the category and the sub-category, into the green window on the right. Our metric of interest is Sales, which we select below.

Image by author.

If we run the workflow, we get the following as the first view:

Image by author.

The visualization is interactive in KNIME. If you select a segment, you can see either the absolute amount or the percentage of total sales, depending on the selection. If we want to know whether one of the three top-level categories (Furniture, Office Supplies, Technology) generates a disproportionately large share of sales, we simply look at the inner ring. Here we can see that all three categories account for roughly the same share of total sales. We can also see directly that Phones accounts for a comparatively large share of sales within the Technology category.
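The two readings discussed above, a segment's share of total sales and its share within its parent category, can be sketched as follows. The function `sales_shares` is a hypothetical helper and the totals are made-up numbers, not the actual Superstore figures.

```python
def sales_shares(totals):
    """For each (category, sub-category), return its share of the grand
    total and its share of its own category's total."""
    grand_total = sum(totals.values())
    by_category = {}
    for (category, _sub), sales in totals.items():
        by_category[category] = by_category.get(category, 0.0) + sales
    return {
        (category, sub): {
            "of_total": sales / grand_total,
            "of_category": sales / by_category[category],
        }
        for (category, sub), sales in totals.items()
    }


# Made-up aggregated sales, for illustration only.
totals = {
    ("Technology", "Phones"): 300.0,
    ("Technology", "Accessories"): 100.0,
    ("Furniture", "Chairs"): 200.0,
    ("Office Supplies", "Binders"): 200.0,
}

shares = sales_shares(totals)
phones = shares[("Technology", "Phones")]
print(f"Phones: {phones['of_total']:.1%} of total, "
      f"{phones['of_category']:.1%} of Technology")
```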

KNIME is not primarily a visualization tool like Tableau or Power BI. Nevertheless, you can quickly create visualizations with its prebuilt functionality, and for more specialized visualizations you can export the prepared data tables to Tableau or Power BI.

Robin von Malottki
Low Code for Data Science

Chess player, data enthusiast, and lifelong learner with a passion for solving complex problems.