Data Storytelling: Making Sense of Complex, Multi-Dimensional Data with Parallel Coordinates Plots

Apr 1, 2022

Parallel coordinates plots can be used to visualize high-dimensional data and give insight into the properties of a dataset. For example, parallel coordinates could help visualize the distribution of a dataset and understand any clustering present in it.

In this article, we’ll explore two practical applications of parallel coordinates — identifying trends/clusters within the IRIS dataset and optimizing hyperparameters for deep neural network training.

However, before we dive into this — let’s answer the biggest question — what is a parallel coordinates plot (PCP)?

Parallel coordinates is a visualization technique used to display multi-dimensional data to find structures, clusters, and other informative patterns in the data. A point in n-dimensional space is represented as a polyline with vertices on the parallel axes; the vertex’s position on the respective dimensional axis corresponds to the data point’s value in that dimension.
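To make the definition concrete, here is a minimal sketch of how one record turns into a polyline. The function name `to_polyline` and the sample measurements are illustrative assumptions, not part of any plotting library.

```python
def to_polyline(point, axes):
    """Return the polyline vertices for one data point.

    Each vertex is (axis_index, value): axis_index is the horizontal
    position of the parallel axis, value is where the line crosses it.
    """
    return [(i, point[name]) for i, name in enumerate(axes)]


# Hypothetical flower measurements in centimetres, used only for illustration.
axes = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
flower = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}

vertices = to_polyline(flower, axes)
print(vertices)  # [(0, 5.1), (1, 3.5), (2, 1.4), (3, 0.2)]
```

A four-dimensional point therefore becomes a line with four vertices, one per axis, and a dataset becomes one such line per row.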

https://plotly.com/python/parallel-coordinates-plot/

As an example, we see the data points from the IRIS dataset plotted on a Parallel Coordinate Plot (PCP) — with the various attributes such as sepal_length and petal_width plotted on their axes. Each data point corresponds to a single line that cuts across all axes, representing its attribute values.
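A comparable plot can be drawn in a few lines with Plotly Express, the library behind the source link for the figure. This is a minimal sketch, assuming Plotly and pandas are installed and using the sample Iris table that ships with Plotly (`px.data.iris()`, which includes a numeric `species_id` column). The names `DIMENSIONS` and `make_iris_pcp` are illustrative choices.

```python
# Axis order for the plot; column names follow Plotly's bundled Iris table.
DIMENSIONS = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def make_iris_pcp():
    """Build a parallel coordinates figure with one line per flower."""
    # Imported here so the module still loads where Plotly is not installed.
    import plotly.express as px

    df = px.data.iris()
    # Coloring by the numeric species_id separates the three species visually.
    return px.parallel_coordinates(df, dimensions=DIMENSIONS, color="species_id")
```

Calling `make_iris_pcp().show()` opens the interactive figure, in which axes can be dragged to reorder them and ranges can be brushed to filter lines.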

Why do we use Parallel Coordinate Plots?

When data are high-dimensional, representing each attribute on its own (marginally) may lead to an incomplete or unclear visualization. Multidimensional graphs such as scatter plot matrices, glyphs, and parallel coordinates have been proposed to facilitate multivariate data exploration.

In the case of the IRIS dataset, we can easily see specific attribute correlations by observing the clusters within petal_width and sepal_length. For instance, SpeciesID 1 plants usually have a shorter sepal length and a smaller petal width than the other species.

How do you prepare your data before using Parallel Coordinate Plots?

Imagine you have curated a dataset to find the parameters that brew the best coffee. One of the parameters might be the temperature of the water used. However, because coffee is typically brewed with hot water, you may end up with a narrow temperature range of 90 to 100 degrees Celsius, causing the graph to look like this:

Plotted with Synthetic Data for Demonstration Purposes

This plot doesn’t tell us much, which brings us to the first point of data preparation: scaling. There are several scaling techniques. Although not the most robust, the usual practice is to normalize each axis between its minimum and maximum values. With this procedure, the lowest value is set to 0, the highest to 1 (or 100%), and all other values are placed linearly in between. Applying this method to our coffee experiments yields the following plot:

We notice that the relationship between temperature and score has become more apparent now. This is achieved by simply spreading the data out to look deeper into the trends.
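The min-max normalization described above maps each value x to (x - min) / (max - min). A minimal sketch follows; the function name `min_max_scale` and the temperature readings are made up for illustration.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Synthetic brewing temperatures in degrees Celsius (illustrative only).
temperatures = [90, 92, 95, 98, 100]
scaled = min_max_scale(temperatures)
print(scaled)  # [0.0, 0.2, 0.5, 0.8, 1.0]
```

Because the minimum and maximum define the scale, a single outlier can compress every other value into a small part of the axis, which is why this method is not the most robust.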

Many other tricks are employed to improve the representation of data. Techniques include:

Coloring: Use different line colors to visually separate categories from one another. For continuous data, a color gradient keyed to an objective value can highlight differences between data points, especially in high-dimensional data.
Reordering: Some attributes or dimensions are correlated or become meaningful when placed next to each other. Reordering your axes can improve readability. For example, placing the volume of coffee next to the amount of water used could highlight their linear correlation.
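One way to automate reordering is a simple greedy heuristic: start with the most strongly correlated pair of axes, then keep appending the remaining axis that correlates most with the axis at the end of the order. This is a sketch of one possible approach and not a standard algorithm from any library. The helper names and the coffee measurements are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(vx * vy)


def order_axes(columns):
    """Greedily order axis names so strongly correlated axes sit side by side."""
    names = list(columns)
    if len(names) < 3:
        return names
    strength = {
        frozenset(pair): abs(pearson(columns[pair[0]], columns[pair[1]]))
        for pair in combinations(names, 2)
    }
    # Seed the order with the most strongly correlated pair.
    first, second = max(combinations(names, 2), key=lambda p: strength[frozenset(p)])
    order = [first, second]
    remaining = [name for name in names if name not in order]
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda name: strength[frozenset((last, name))])
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Synthetic coffee experiments (illustrative only).
columns = {
    "coffee_g": [15, 18, 20, 22, 25],
    "temperature_c": [93, 96, 91, 98, 94],
    "water_ml": [240, 288, 320, 352, 400],
}
print(order_axes(columns))  # ['coffee_g', 'water_ml', 'temperature_c']
```

Here the coffee and water amounts move next to each other because they are proportional in the sample data, while the weakly related temperature axis goes last.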

Ultimately, it is the presenter’s responsibility to arrange the data so that a Parallel Coordinate Plot improves the audience’s intuitive understanding of the trends. The methods above can strengthen the storytelling capability of your plots.

Real-world usage of Parallel Coordinate Plots

Machine learning engineers often select and configure a wide range of hyperparameters to train deep neural networks for their experiments. Hyperparameters are parameters whose values control the learning process and determine the values of model parameters that a learning algorithm ends up learning.

Hyperparameter spaces can become very high-dimensional, as more complex experiments require between 10 and 30 hyperparameters. These can include batch size, learning rate, number of training steps, etc. Intuition plays a big part as engineers tune these values and work out which hyperparameters correlate most strongly with a lower loss value.

In the industry, many deep neural network monitoring platforms offer Parallel Coordinate Plots to give engineers an intuitive view of which parameters matter, through a combination of color and axis arrangement. Here is an example from Datature’s MLOps Platform:

Neural Insights on Datature’s MLOps Platform features PCP (https://twitter.com/i/status/1479371384413253634)
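A similar view can be sketched for your own experiments with Plotly's `go.Parcoords` trace. The example below is an assumption-laden illustration, not how Datature's platform is implemented: the training runs are made up, and the helper names `build_parcoords_spec` and `to_figure` are hypothetical. Each line is one training run, colored by its loss.

```python
def build_parcoords_spec(runs, hyperparameters, objective):
    """Describe a parallel coordinates trace as plain dicts.

    One axis is created per hyperparameter plus one for the objective,
    and every line is colored by its objective value.
    """
    columns = hyperparameters + [objective]
    dimensions = [
        {"label": name, "values": [run[name] for run in runs]}
        for name in columns
    ]
    line = {
        "color": [run[objective] for run in runs],
        "colorscale": "Viridis",
        "showscale": True,
    }
    return {"dimensions": dimensions, "line": line}


def to_figure(spec):
    """Turn the spec into a Plotly figure (requires Plotly to be installed)."""
    import plotly.graph_objects as go  # lazy import: only needed when drawing

    return go.Figure(go.Parcoords(**spec))


# Made-up training runs, for illustration only.
runs = [
    {"batch_size": 16, "learning_rate": 0.01, "training_steps": 1000, "loss": 0.92},
    {"batch_size": 32, "learning_rate": 0.001, "training_steps": 5000, "loss": 0.41},
    {"batch_size": 64, "learning_rate": 0.0001, "training_steps": 10000, "loss": 0.37},
]
spec = build_parcoords_spec(
    runs, ["batch_size", "learning_rate", "training_steps"], "loss"
)
```

Calling `to_figure(spec).show()` renders the plot. Placing the loss axis last and coloring by loss makes it easier to trace low-loss runs back to the hyperparameter values that produced them.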

Thank you for reading!

References

[1] https://archive.ics.uci.edu/ml/datasets/iris

[2] https://en.wikipedia.org/wiki/Polygonal_chain

[3] https://en.wikipedia.org/wiki/Vertex_(geometry)

[4] https://plotly.com/python/parallel-coordinates-plot/

[5] https://www.highcharts.com

[6] https://twitter.com/i/status/1479371384413253634

[7] https://towardsdatascience.com/parallel-coordinates-plots-6fcfa066dcb3

Written by Hoki Fung