GETTING STARTED | DATA VISUALIZATION | KNIME ANALYTICS PLATFORM

KNIME — Getting our hands dirty with data visualisation, Part 1

Build interactive visualizations in a fully codeless fashion

Stack Errors
Low Code for Data Science


There is a buzz around Matplotlib, Plotly and Seaborn. Non-coders, however, tend to steer well clear of these libraries and the long, code-intensive sessions they demand, which can be demotivating. But what if you could still create all these polished visuals in a short span of time, with zero coding experience?

Welcome to the third article in the KNIME series, in which “StackErrors” give an overview of a few visualization options in KNIME. You can browse our previous articles using the links below; they will help you understand this one better.

KNIME — A key lime pie for data scientists.

KNIME — Data Preprocessing, undusting the data for modelling.

In this article, we cover some of KNIME’s visualization nodes, which are available under the JavaScript section of the Views nodes (you need to install the KNIME JavaScript Views extension). But before getting our hands dirty with visualizations, let’s take a quick look at the analysis techniques.

  • Download the Titanic dataset from Kaggle here to follow along!

There are multiple ways of analysing a dataset. Univariate analysis is the most basic: as the name signifies, it describes a single variable only. In contrast, multivariate analysis handles several variables at once, so the relationships between them can be analysed and viewed.

Generally, pie charts and bar charts are good choices for plotting categorical data, whereas histograms, box plots and scatter plots are more helpful for numeric data. Let’s dig deeper into KNIME and understand the functionality of each node using these visuals.

Pie Chart

The Pie/Donut Chart node is probably the most basic view available for plotting categorical data. You can find it under Views > JavaScript > Pie/Donut Chart. In the node configuration window, select the category column from the Options tab. Label type is an interesting feature that lets you show each slice’s label as the key, the value or the percentage. The General Plot Options tab also allows you to change the title and subtitle as required.

Have a look at the Pie Chart created for the split of males and females in the Titanic Survival Dataset.
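KNIME does all of this without code, but for the curious, the three label types can be sketched in a few lines of Python. This is only an illustration of the idea, not how the node is implemented: the `pie_labels` helper and the toy `sex_column` sample are our own hypothetical names and values, not the real Titanic data.

```python
from collections import Counter


def pie_labels(categories):
    """Return {key: (value, percentage)} for a list of category values."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {
        key: (value, round(100 * value / total, 1))
        for key, value in counts.items()
    }


# Hypothetical toy sample, NOT the real Titanic data.
sex_column = ["male", "female", "male", "male", "female"]

labels = pie_labels(sex_column)
for key, (value, percent) in labels.items():
    # Each slice can be labelled by its key, its value or its percentage.
    print(f"key={key}  value={value}  percentage={percent}%")
```

Each slice corresponds to one category (the key), the number of rows in it (the value), and that number as a share of all rows (the percentage).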

Bar Chart

When it comes to categorical data, a bar chart is the handiest choice among the graphs, and so is the Bar Chart node in KNIME. Right-click the node and choose Configure to see the options. You can select the category to be represented and the aggregation method: count, sum or average. You can also manually select the column to aggregate. Here we select the column “Survived” (values are 0 and 1). Because the value is 1 for every survivor and 0 otherwise, summing this column per category gives the number of passengers who survived.

As shown in the graph, among the passengers who survived, 233 were female and 109 were male. The General Plot Options tab lets you do the housekeeping for the chart, such as adding subtitles, labels and so on. There is also a second input port on the Bar Chart node, which is especially useful for supplying colour coding for each bar.
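To see why summing a 0/1 column counts the survivors, here is a minimal Python sketch of the three aggregation methods mentioned above. The `aggregate` function and the six-row `passengers` sample are hypothetical and only for illustration; they are not the node's internals or the real dataset.

```python
from collections import defaultdict


def aggregate(rows, category, value, method="sum"):
    """Group rows by `category` and aggregate the `value` column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[category]].append(row[value])

    if method == "count":
        return {key: len(vals) for key, vals in groups.items()}
    if method == "sum":
        return {key: sum(vals) for key, vals in groups.items()}
    if method == "average":
        return {key: sum(vals) / len(vals) for key, vals in groups.items()}
    raise ValueError(f"unknown aggregation method: {method}")


# Hypothetical toy sample, NOT the real Titanic data.
passengers = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 0},
    {"Sex": "male", "Survived": 1},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 0},
]

survivors = aggregate(passengers, "Sex", "Survived", "sum")
totals = aggregate(passengers, "Sex", "Survived", "count")
rates = aggregate(passengers, "Sex", "Survived", "average")
print(survivors, totals, rates)
```

With a 0/1 column, the sum is the number of survivors per category, the count is the number of passengers per category, and the average is the survival rate.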

Histogram

The plot created by the Histogram node is similar to that of the Bar Chart, but it has buckets/bins instead of nominal categories. The main difference between the two is that a histogram requires the numerical data to be grouped into bins first. KNIME’s Histogram node does this in the Binning tab, where bins can be created according to the user’s needs.
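Binning itself is simple to sketch in Python. The example below assumes equal-width bins, which is only one possible binning strategy and not necessarily what the Binning tab does by default; the `equal_width_bins` helper and the `ages` values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Split the range of `values` into n_bins equal-width bins and count each."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if width == 0:
            index = 0  # all values are identical, so use the first bin
        else:
            # The maximum value falls into the last bin rather than past it.
            index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical ages, NOT the real Titanic data.
ages = [0, 15, 22, 24, 30, 35, 41, 58, 62, 80]

edges, counts = equal_width_bins(ages, 4)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:g} to {right:g}: {count}")
```

Each bar of the histogram then shows how many values fall into one bin, which is exactly the step a bar chart skips because its categories already exist.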

That concludes this article. In the next one, we will cover the Scatter Plot, Line Plot and Box Plot, along with a miscellaneous section on the Table Creator and Color Manager nodes. Till then, have fun with KNIME :)

Reach out to the KNIME Community to get more information.

StackErrors is managed by Ankita91 and Sreedev. Follow Stack Errors on Kaggle to explore our data science projects.
Let’s learn together. 💙


Data scientists pursuing AI and Data Science at Loyalist College, Toronto. Ankita and Sreedev teamed up as Stack Errors.