GETTING STARTED | DATA VISUALIZATION | KNIME ANALYTICS PLATFORM

KNIME — Getting our hands dirty with data visualisation, Part 2

Build interactive visualizations in a fully codeless fashion

Stack Errors
Low Code for Data Science

Welcome to the fourth article in the KNIME series, in which “StackErrors” give the gist of a few visualization options in KNIME. It continues from our previous article. If you haven’t read that one yet, follow the link below; it will help you understand this post better.

KNIME — Getting our hands dirty with data visualisation, Part 1

In this article, we are covering the Line Plot, the Box Plot, and the Scatter Plot nodes. We will also have a look at the Color Manager and the Table Creator nodes. All the nodes above are available for free in the Node Repository of KNIME Analytics Platform or on the KNIME Hub.

  • Download the Air passengers dataset from Kaggle here to follow along!

Let’s jump to it!

Line Plot

The Line Plot node plots numeric values as lines against a chosen categorical column. It is configured much like the nodes covered in the previous article: choose a categorical column for the x-axis and one or more numeric columns for the y-axis.

As shown in the node configurations, the x-axis displays months, and the y-axis displays the number of air passengers boarding from three major cities.

Some pro-tips for better visualization of the graph:

  • Select the checkbox “Create image at the outport” to create an SVG image as output
  • Set appropriate labels for the x- and y-axes in the Axis Configuration tab
  • Add a suitable chart title and subtitle in the General Plot Options tab

Box Plot

A box plot summarizes the distribution of a numeric column and helps identify outliers. The bold lines at the top and bottom (the ends of the whiskers) mark the maximum and minimum values, not counting outliers. The top and bottom edges of the box mark the third and first quartiles, so the height of the box is the interquartile range, and the line inside the box is the median. Outliers are plotted as individual points outside the whiskers. Like the other visualization nodes, the Box Plot node can be found in the JavaScript View section of the Node Repository.

In this Box Plot, the central line puts the typical age at about 32 years (a box plot marks the median rather than the mean), with a minimum of less than a year and a maximum of 80. The X at the top, at a value of around 1000, indicates an outlier.
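For readers curious about what a box plot computes behind the scenes, here is a minimal Python sketch. It assumes the common 1.5 × IQR rule for whiskers and outliers; the Box Plot node's exact quartile method may differ. The `ages` values are made up for illustration and are not taken from the dataset.

```python
import statistics


def box_plot_summary(values):
    """Return the statistics a box plot draws, using the 1.5 * IQR rule."""
    data = sorted(values)
    # Quartiles: q1 and q3 are the box edges, the median is the line inside.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme values that are still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical ages, including one obviously wrong entry (1000).
ages = [0.5, 19, 22, 24, 28, 32, 36, 41, 55, 80, 1000]
summary = box_plot_summary(ages)
print(summary)
```

The entry of 1000 lands far above the upper fence, so it is reported as an outlier instead of stretching the whisker, which is the same effect seen in the plot.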

Scatter Plot

A scatter plot is used to visualize and explain the relationship between two variables. In KNIME Analytics Platform, we can use the Scatter Plot node. Column selection and axis labelling, including title and subtitle, are done in the same way as for the Line Plot.

Here, the relationship between fares and passenger classes can be easily visualized using the Scatter Plot node. Fares for class P1 are clearly higher than for P2 and P3, making it the elite class for travelling. In contrast, there is not much difference between the fares for classes P2 and P3.

Color Manager

The Color Manager node can be found in the View > Property section of the Node Repository and lets users assign colours to the values of a chosen column. In the Scatter Plot shown above, all the dots are black. Colouring them by a chosen feature makes the groups easier to distinguish.

  • Click on the Color Manager’s Color Settings tab
  • Select the column whose values you want to colour
  • Click OK

Tabular data with coloured rows can be seen in the “Table with Color” output of the node.
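As a rough illustration of the idea behind colouring rows by a column's values, here is a minimal Python sketch. It is not how the Color Manager node is implemented; the function name `assign_colors`, the palette, and the sample rows are all hypothetical.

```python
from itertools import cycle

# Arbitrary hex colours chosen for this example.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def assign_colors(rows, column, palette=PALETTE):
    """Map each distinct value in `column` to a colour and tag every row."""
    color_by_value = {}
    colors = cycle(palette)
    for row in rows:
        value = row[column]
        if value not in color_by_value:
            # First time this value is seen: give it the next palette colour.
            color_by_value[value] = next(colors)
    # Return new rows so the input table is left unchanged.
    return [{**row, "color": color_by_value[row[column]]} for row in rows]


# Made-up rows with the column names used in this article.
passengers = [
    {"Fare": 71.3, "Pclass": "P1", "Survived": 1},
    {"Fare": 7.9, "Pclass": "P3", "Survived": 0},
    {"Fare": 13.0, "Pclass": "P2", "Survived": 1},
]
colored = assign_colors(passengers, "Survived")
for row in colored:
    print(row)
```

Rows that share a value in the chosen column receive the same colour, which is what lets a downstream plot tell the groups apart.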

The output of this node can also serve as the input to the next node. Let’s say that the “Survived” column is coloured.

Feeding this output into the Scatter Plot node displays the same plot, now with the dots coloured to distinguish the survived and non-survived classes.

Have a look at the below image to get a better idea.

Table Creator

Sometimes you need to create static data manually or enter a table spreadsheet-style. For this, there is no need to read a file with the reader nodes. KNIME has a node that lets you do it directly in your workflow.

The Table Creator node can be found under the IO section of the Node Repository. Double-click the node to open its table, then type the data in manually or copy and paste the values.

Double-click a column header to rename it.

A dialog box will appear where you can change the column name and type, as shown below:

There is also an option to delete rows: select the required row, right-click, and delete it.

Have a look at the manually created table. A “?” in red indicates a missing value in the data.
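For comparison, a hand-typed table with missing cells can be sketched in Python as follows. This is only an analogy to what the Table Creator node offers, and the column names and values are invented for the example.

```python
# Column headers and rows typed in by hand; None marks a cell left empty.
columns = ["Name", "Age", "City"]
rows = [
    ["Asha", 29, "Toronto"],
    ["Ravi", None, "Mumbai"],
    ["Mei", 41, None],
]


def render(columns, rows, missing="?"):
    """Format the table as tab-separated text, showing `missing` for empty cells."""
    lines = ["\t".join(columns)]
    for row in rows:
        cells = (missing if cell is None else str(cell) for cell in row)
        lines.append("\t".join(cells))
    return "\n".join(lines)


table_text = render(columns, rows)
missing_count = sum(cell is None for row in rows for cell in row)
print(table_text)
print("Missing cells:", missing_count)
```

Empty cells are shown as “?” here, mirroring how missing values appear in the manually created table.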

With this article, we conclude our visualisation series with KNIME. In the next article, we will show how to build a model using KNIME Analytics Platform. Until then, stay tuned and have fun with KNIME :)

Reach out to the KNIME Community to get more information.

StackErrors is managed by Ankita91 and Sreedev. Follow Stack Errors on Kaggle to explore our data science projects.
Let’s learn together. 💙

Data scientists pursuing AI and Data Science at Loyalist College, Toronto. Ankita and Sreedev have teamed up as Stack Errors.