DATA STORIES | DATA VISUALIZATION | KNIME ANALYTICS PLATFORM
Exploring the Power of R Graphics with KNIME: A Collection of Examples
Low-code data visualizations to enrich your data storytelling
After exploring the graphics options of Python libraries and KNIME, I would like to show you some special graphics examples where I have turned to R and ggplot2. You could always just create a ‘simple’ graphic with R and export it to KNIME or save it as a PNG or JPEG file.
If you want to read about how to set up KNIME and R you can read the official guide or my Medium blog: “KNIME and R — installation across operating systems”.
This article does not cover your standard graphics, since KNIME is already well equipped for those with its generic nodes (see the list at the end of this article).
It started with a Data Scientist Venn Diagram
Someone on the forum asked about Venn diagrams, so I googled one classic example depicting the various tasks of a Data Scientist (Stephan Kolassa expanding Drew Conway's original design) and put it in a workflow:
Heatmaps and Dendrograms to automatically create groups
If you want to get a quick overview of your data and variables, you could use heatmaps (and dendrograms). In this example we have a database of cars and their features, such as horsepower and cylinders. You can use an R package to automatically group the cars based on how similar their specs are. A dendrogram further indicates which types of cars are more closely ‘related’. And you can have all that in a nice graphic:
The colours of the heatmap indicate the range and distribution of the values (it may help to normalise them first, so that features measured on larger scales do not dominate). The ‘teeth’ of the dendrograms try to form a hierarchy you could then interpret, for the features (x-axis) as well as for the items (y-axis).
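To make the normalising and grouping idea more concrete, here is a minimal sketch of the arithmetic behind such a dendrogram. It is written in plain Python purely for illustration and is not the R code from the workflow. The car names and values are made up, and both `min_max_normalise` and `agglomerate` are hypothetical helpers. Single linkage is just one possible choice, and an R package may well default to a different one.

```python
from itertools import combinations
from math import dist

# Hypothetical car specs (made-up illustration values, not from the workflow):
# name -> (horsepower, cylinders, weight in 1000 lbs)
cars = {
    "compact_a": (65, 4, 1.8),
    "compact_b": (70, 4, 2.0),
    "sedan": (110, 6, 3.0),
    "muscle_a": (245, 8, 3.6),
    "muscle_b": (230, 8, 3.8),
}


def min_max_normalise(rows):
    """Rescale every column to the range 0..1 so no feature dominates."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]  # avoid division by zero
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]


def agglomerate(names, points):
    """Single-linkage clustering; returns the merges in the order they happen."""
    clusters = {(n,): [p] for n, p in zip(names, points)}
    merges = []
    while len(clusters) > 1:
        # pick the two clusters whose closest members are nearest to each other
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(
                dist(p, q) for p in clusters[pair[0]] for q in clusters[pair[1]]
            ),
        )
        merges.append((a, b))
        clusters[a + b] = clusters.pop(a) + clusters.pop(b)
    return merges


names = list(cars)
scaled = min_max_normalise(list(cars.values()))
merges = agglomerate(names, scaled)
for left, right in merges:
    print(left, "+", right)
```

Each printed line corresponds to one join in the dendrogram: the earlier two items are merged, the more similar they are on the normalised scale.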
Note: Another interesting technique to identify new groups and reduce dimensions is t-SNE.
Stacked Bars within Grouped Barplot in R
In the next example a lot of information is presented within just one chart. You have three levels of grouping you can use to display the numbers:
- an overall category A, B, C … (these could be regions),
- elements forming the different columns within each category (years, maybe), and
- values for groups within each column (the distribution of some sales items, maybe).
You might need some effort to interpret this, but hey, it can be done:
This was another question from the KNIME Forum.
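The three levels described in this section boil down to a small piece of layout arithmetic: the category decides the group, the column element decides the bar within the group, and the values stack on top of each other inside the bar. The following sketch shows that logic in plain Python rather than R. The records, the `layout` function, and its `bar_width` and `group_gap` parameters are all hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Hypothetical long-format records: (category, year, item, value)
records = [
    ("A", 2021, "apples", 3), ("A", 2021, "pears", 2),
    ("A", 2022, "apples", 4), ("A", 2022, "pears", 1),
    ("B", 2021, "apples", 1), ("B", 2021, "pears", 5),
    ("B", 2022, "apples", 2), ("B", 2022, "pears", 2),
]


def layout(records, bar_width=1.0, group_gap=1.0):
    """Return one (x, bottom, height, label) rectangle per record."""
    categories = sorted({r[0] for r in records})
    years = sorted({r[1] for r in records})
    group_width = len(years) * bar_width + group_gap
    tops = defaultdict(float)  # running stack height per (category, year) bar
    rects = []
    for cat, year, item, value in records:
        # the category shifts the whole group, the year shifts the bar inside it
        x = categories.index(cat) * group_width + years.index(year) * bar_width
        bottom = tops[(cat, year)]
        rects.append((x, bottom, float(value), f"{cat}/{year}/{item}"))
        tops[(cat, year)] = bottom + value
    return rects


rects = layout(records)
for rect in rects:
    print(rect)
```

A plotting library does this bookkeeping for you, but seeing it spelled out can help when deciding which variable should go on which level.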
Plotting Means with an Error Bar
In this example we plot different lines/means (y-axis) across values on an x-axis and give the data points error bars to indicate a range of values:
And yes, this was another question from the Forum.
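Before anything can be drawn, such a chart needs one mean and one error range per line and x value. Here is a minimal sketch of that summary step in plain Python, not the R code of the workflow. The observations are invented, `summarise` is a hypothetical helper, and using the standard error for the bars is an assumption: a standard deviation or a confidence interval would be equally valid choices.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements: (series, x value, observed y)
observations = [
    ("control", 1, 10.0), ("control", 1, 12.0), ("control", 1, 14.0),
    ("control", 2, 11.0), ("control", 2, 13.0), ("control", 2, 18.0),
    ("treated", 1, 20.0), ("treated", 1, 22.0), ("treated", 1, 24.0),
    ("treated", 2, 25.0), ("treated", 2, 26.0), ("treated", 2, 30.0),
]


def summarise(observations):
    """Return {(series, x): (mean, lower, upper)} using mean +/- standard error."""
    groups = defaultdict(list)
    for series, x, y in observations:
        groups[(series, x)].append(y)
    summary = {}
    for key, values in groups.items():
        m = mean(values)
        se = stdev(values) / sqrt(len(values))  # needs at least two values per group
        summary[key] = (m, m - se, m + se)
    return summary


summary = summarise(observations)
for (series, x), (m, lower, upper) in sorted(summary.items()):
    print(f"{series} x={x}: mean={m:.2f} range=[{lower:.2f}, {upper:.2f}]")
```

The resulting table has exactly the shape a plotting function expects: one row per point, with a lower and an upper bound for the error bar.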
Identify a Bi-Modal Distribution
The next example, following another KNIME Forum post, uses the (now outdated?) R package “modes” to calculate a coefficient that detects a structure of two peaks and also to provide a graphic for visually inspecting the results. I used the original code from GitHub and put it into an R node.
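To give an idea of what such a coefficient can look like, here is a sketch of one widely used definition, often called Sarle's bimodality coefficient, which combines skewness and kurtosis. This is plain Python and not the code from the package. Whether “modes” uses exactly this finite-sample correction is an assumption that has not been checked here, and the two samples are made up.

```python
from statistics import mean, stdev


def bimodality_coefficient(values):
    """Sarle's bimodality coefficient with the usual small-sample correction."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least four observations")
    m, s = mean(values), stdev(values)
    z = [(v - m) / s for v in values]
    # bias-corrected sample skewness and excess kurtosis
    skew = n / ((n - 1) * (n - 2)) * sum(t ** 3 for t in z)
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    excess_kurt = (
        n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(t ** 4 for t in z)
        - correction
    )
    return (skew ** 2 + 1) / (excess_kurt + correction)


# Hypothetical samples: one with a single peak around 5, one with peaks near 3 and 9
one_peak = [x for x, count in zip(range(1, 10), [1, 2, 3, 4, 5, 4, 3, 2, 1]) for _ in range(count)]
two_peaks = [2] * 5 + [3] * 10 + [4] * 5 + [8] * 5 + [9] * 10 + [10] * 5

print("one peak :", round(bimodality_coefficient(one_peak), 3))
print("two peaks:", round(bimodality_coefficient(two_peaks), 3))
```

Values above 5/9 (roughly 0.555, the value a uniform distribution would give) are usually read as a hint of bimodality, which is why a visual check of the distribution remains a good companion to the number.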
These examples might encourage you to seek additional packages and solutions in R or Python if you face a problem for which there is no immediate solution with generic KNIME nodes. You can check out the KNIME Community Hub for more R examples.
These days with ChatGPT and other Large Language Models it is even easier to put together a few lines of code and create graphics (and other stuff). See also my article: “KNIME, ChatGPT and Python”.
The Beautiful Violin Plot
… and there already is a separate article about powerful Violin Plots with R:
If you enjoyed this article you can follow me on Medium (https://medium.com/@mlxl) and on the KNIME forum (https://forum.knime.com/u/mlauber71/summary) and hub (https://hub.knime.com/mlauber71).