DATA STORIES | DATA VISUALIZATION | KNIME ANALYTICS PLATFORM

Exploring the Power of R Graphics with KNIME: A Collection of Examples

Low-code data visualizations to enrich your data storytelling

Markus Lauber
Low Code for Data Science
4 min read · Mar 23, 2024

After exploring the graphics options with Python libraries and KNIME, I would like to show you some special graphics examples where I turned to R and ggplot2. Of course, you could always just create a ‘simple’ graphic with R and export it to KNIME or save it as a PNG or JPEG file.
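To give an idea of what such a ‘simple’ graphic could look like, here is a minimal sketch with ggplot2 and the built-in mtcars data. It is not taken from any of the workflows in this article; the chosen columns and the output path are assumptions for illustration only. Inside a KNIME R node the input table would typically be available as the data frame `knime.in` instead.

```r
library(ggplot2)

# Built-in example data (assumption); in a KNIME R node you would
# typically start from the input table, e.g. df <- knime.in
df <- mtcars

# A 'simple' scatter plot of horsepower against fuel consumption
p <- ggplot(df, aes(x = hp, y = mpg)) +
  geom_point(colour = "steelblue", size = 2) +
  labs(title = "Horsepower vs. miles per gallon",
       x = "Horsepower", y = "Miles per gallon")

# Save the plot as a PNG file (placeholder path in the temp directory)
out_file <- file.path(tempdir(), "simple_plot.png")
ggsave(out_file, plot = p, width = 6, height = 4, dpi = 150)
```

Changing the file extension to `.jpeg` should make `ggsave()` write a JPEG instead.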

For details on how to set up KNIME and R, see the official guide or my Medium article “KNIME and R — installation across operating systems”.

This article does not cover your standard graphics, since KNIME is already well equipped for those with its generic nodes (see the list at the end of this article).

It started with a Data Scientist Venn Diagram

Someone on the KNIME Forum asked about Venn diagrams, so I googled a classic example depicting the various tasks of a data scientist (Stephan Kolassa's expansion of Drew Conway's original design) and put it into a workflow:

A classic “Data Science Venn Diagram” created in KNIME with R (https://hub.knime.com/-/spaces/-/~N_GgnJyIXtRgF-sL/current-state/).

Heatmaps and Dendrograms to automatically create groups

If you want to get a quick overview of your data and variables, you could use heatmaps (and dendrograms). In this example we have a database of cars and their features, such as horsepower and cylinders. You can use an R package to automatically group the cars by how similar their specs are. A dendrogram further indicates which types of cars are more closely ‘related’. And you can have all that in one nice graphic:

Automatically group different types of cars based on their features (https://hub.knime.com/-/spaces/-/~vNZwHIyUNzpizstR/current-state/).

The colours of the heatmap indicate the range and distribution of the values (you may want to normalise them first so that features on different scales are comparable). The ‘teeth’ of the dendrograms form a hierarchy you can then interpret, both for the features (x-axis) and for the items (y-axis).

KNIME allows you to organise and control your data exploration with heatmaps (https://hub.knime.com/-/spaces/-/~vNZwHIyUNzpizstR/current-state/).
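If you want to try the idea without the workflow, the following is a minimal sketch using only base R and the built-in mtcars data. It is not necessarily the package or the data used in the workflow above; the number of groups (k = 3) is an arbitrary assumption. The values are normalised per feature before clustering, so that horsepower does not dominate the distances.

```r
# Built-in cars data with features such as hp (horsepower) and cyl (cylinders)
cars_scaled <- scale(as.matrix(mtcars))  # z-score per feature (column)

# Heatmap with dendrograms for the cars (rows) and the features (columns).
# scale = "none" because the matrix has already been normalised above.
hm <- heatmap(cars_scaled,
              scale   = "none",
              margins = c(6, 8),
              main    = "Cars grouped by similarity of their specs")

# The same hierarchical clustering that heatmap() uses by default
# (Euclidean distance, complete linkage), cut into k = 3 groups
groups <- cutree(hclust(dist(cars_scaled)), k = 3)

# Which cars ended up in which group?
print(split(names(groups), groups))
```

The `cutree()` step turns the visual hierarchy into explicit group labels that you could hand back to KNIME as a new column.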

Note: another interesting technique for identifying new groups and reducing dimensionality is t-SNE.

Stacked Bars within Grouped Barplot in R

The next example presents a lot of information within just one chart. There are three levels of grouping you can use to display the numbers:

  • one overall category A, B, C … (could be regions),
  • elements forming the different columns within each category (years, maybe), and
  • values for groups within the columns (the distribution of some sales items, maybe).

It might take some effort to interpret this, but hey, it can be done:

A complex combination of bar charts with the help of KNIME and R (https://hub.knime.com/-/spaces/-/~JmHl8hI0O5XGnUEp/current-state/).

This was another question from the KNIME Forum.
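One possible way to build such a chart with ggplot2 is to facet by the overall category, put the columns on the x-axis and stack the groups via the fill colour. The sketch below uses randomly generated, purely hypothetical data (regions, years, and items are made-up names) and is not the code from the workflow.

```r
library(ggplot2)

set.seed(42)

# Hypothetical data: region = overall category, year = column, item = stack
sales <- expand.grid(
  region = c("A", "B", "C"),
  year   = c("2021", "2022", "2023"),
  item   = c("Item 1", "Item 2", "Item 3"),
  stringsAsFactors = FALSE
)
sales$value <- round(runif(nrow(sales), min = 10, max = 100))

p <- ggplot(sales, aes(x = year, y = value, fill = item)) +
  geom_col() +                                     # bars are stacked by default
  facet_wrap(~ region, nrow = 1,
             strip.position = "bottom") +          # one panel per category
  labs(title = "Stacked bars within grouped bars",
       x = NULL, y = "Value", fill = "Sales item") +
  theme_minimal()

print(p)
```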

Plotting Means with an Error Bar

In this example we plot the means of different groups as lines (y-axis) across the values on the x-axis and give the data points error bars to indicate a range of values:

A way to plot ranges at data points in line charts (https://hub.knime.com/-/spaces/-/~horTEQi1JBXt0_Fs/current-state/)

And yes, this was another question from the Forum.
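A minimal sketch of the idea with ggplot2 and the built-in ToothGrowth data could look like this. Using the mean plus or minus one standard deviation as the range is my assumption here; you might prefer a standard error or a confidence interval instead.

```r
library(ggplot2)

# Mean and standard deviation of tooth length per supplement and dose
means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
sds   <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = sd)
stats <- merge(means, sds, by = c("supp", "dose"),
               suffixes = c("_mean", "_sd"))

# One line per supplement, with error bars of +/- one standard deviation
p <- ggplot(stats, aes(x = dose, y = len_mean,
                       colour = supp, group = supp)) +
  geom_line() +
  geom_point(size = 2) +
  geom_errorbar(aes(ymin = len_mean - len_sd,
                    ymax = len_mean + len_sd),
                width = 0.1) +
  labs(title = "Mean tooth length per dose",
       x = "Dose", y = "Mean length", colour = "Supplement")

print(p)
```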

Identify a Bi-Modal Distribution

The next example, following another KNIME Forum post, uses the (now outdated?) R package “modes” to calculate a coefficient that detects a structure of two peaks and to provide a graphic for visually inspecting the results. I used the original code from GitHub and put it into an R node:

Identify a bi-modality with R and KNIME (https://hub.knime.com/-/spaces/-/~GEONBF327vz7VU3s/current-state/)
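To illustrate what such a coefficient does, here is a base R sketch of Sarle's bimodality coefficient with the usual finite-sample correction. This is my own re-implementation for illustration, not the code of the “modes” package or of the workflow, and the test data is simulated. Values above 5/9 (roughly 0.555, the value for a uniform distribution) are commonly read as a hint towards bimodality.

```r
# Sarle's bimodality coefficient (illustrative re-implementation)
bimodality_coefficient <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  stopifnot(n > 3)

  d  <- x - mean(x)
  m2 <- mean(d^2)
  m3 <- mean(d^3)
  m4 <- mean(d^4)

  # Sample skewness and sample excess kurtosis (bias-corrected versions)
  skew <- sqrt(n * (n - 1)) / (n - 2) * m3 / m2^1.5
  kurt <- (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2^2 - 3) + 6)

  (skew^2 + 1) / (kurt + 3 * (n - 1)^2 / ((n - 2) * (n - 3)))
}

set.seed(1)
unimodal <- rnorm(1000)                                     # one peak
bimodal  <- c(rnorm(500, mean = -3), rnorm(500, mean = 3))  # two peaks

bc_uni <- bimodality_coefficient(unimodal)
bc_bi  <- bimodality_coefficient(bimodal)
cat("unimodal:", round(bc_uni, 3), " bimodal:", round(bc_bi, 3), "\n")

# Visual inspection: histogram with a density curve on top
hist(bimodal, breaks = 40, freq = FALSE,
     main = "Simulated bimodal distribution", xlab = "Value")
lines(density(bimodal), lwd = 2)
```

The coefficient is only a heuristic (strongly skewed unimodal data can also exceed the threshold), which is why the visual check is worth keeping.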

These examples might encourage you to look for additional packages and solutions in R or Python when you face a problem that has no immediate solution with generic KNIME nodes. You can check out the KNIME Community Hub for more R examples.

These days, with ChatGPT and other large language models, it is even easier to put together a few lines of code and create graphics (and other stuff). See also my article “KNIME, ChatGPT and Python”.

The Beautiful Violin Plot

… and there already is a separate article about powerful Violin Plots with R:

A Violin Plot with all its statistics to interpret the differences in a numeric variable between groups (https://hub.knime.com/-/spaces/-/~oRO8gC-xpXF5bVjh/current-state/)
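As a small teaser, a basic violin plot can be sketched with plain ggplot2 and the built-in iris data. This is a simplified stand-in and not the code from the linked workflow, which shows considerably more statistics.

```r
library(ggplot2)

# Distribution of a numeric variable per group: violin + box plot + mean
p <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_violin(trim = FALSE, alpha = 0.6) +
  geom_boxplot(width = 0.12, fill = "white", outlier.shape = NA) +
  stat_summary(fun = mean, geom = "point",
               shape = 23, size = 3, fill = "red") +   # group means
  labs(title = "Sepal length by species",
       x = "Species", y = "Sepal length") +
  theme_minimal() +
  theme(legend.position = "none")

print(p)
```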

If you enjoyed this article you can follow me on Medium (https://medium.com/@mlxl) and on the KNIME forum (https://forum.knime.com/u/mlauber71/summary) and hub (https://hub.knime.com/mlauber71).

Senior Data Scientist working with KNIME, Python, R and Big Data Systems in the telco industry