The science behind the data N°3 — The advantages of Parallel Categorical Diagrams


A multimodal feature is a special case of feature computation, and it therefore calls for dedicated visualization tools.

Here we will present multimodal feature examples and compare three different visualization methods based on a use case developed by Ad Scientiam.

Features with modalities

Massive amounts of data are continuously generated all around the world these days. Most of the time, the collected data is raw and requires interpretation as well as transformation processes in order to turn it into a meaningful and goal-specific measure called a feature.

An example of such a feature would be the transformation of a GPS signal (e.g., x and y position) into a walking distance (in meters), which constitutes a clear and straightforward measure.
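As a minimal sketch of this kind of transformation (the coordinates below are illustrative and assumed to be already projected into meters):

```python
import numpy as np

# Hypothetical GPS trace, already projected to planar x/y coordinates in meters
x = np.array([0.0, 3.0, 3.0, 7.0])
y = np.array([0.0, 4.0, 8.0, 8.0])

# Walking distance: sum of the straight-line segments between consecutive samples
distance = np.hypot(np.diff(x), np.diff(y)).sum()
print(distance)  # 13.0 meters
```

In practice, raw GPS fixes are noisy and expressed in latitude/longitude, so a real pipeline would filter the signal and use a geodesic distance; this only illustrates the raw-signal-to-feature step.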

What would be the purpose of doing that? At Ad Scientiam, we design and validate digital biomarkers that continuously measure disease progression in real life. In this way, we help people with severe and disabling diseases to better understand and monitor their illness. We are paving the way for healthcare professionals to offer more individualized care more efficiently. We are accelerating the development of future treatments and reducing the societal and economic impacts of disease.

In this context, our data team oversees the creation of features that, through clinical validation, will become disease biomarkers. From the raw data provided by the mobile phone, we compute these features, which can then be fed into machine learning models to evaluate the patient’s condition, track disease progression, or predict treatment response, for instance.

However, as some data enthusiasts already know, machine learning generalization can be a delicate process if the input features are poorly defined.

Indeed, in the field of data analysis, features are often subject to various biases that influence the transformation process.

One known way to limit these biases is to introduce environment parameters called modalities. Multiple modalities can affect one feature, allowing the generation of multimodal features.

Thus, a multimodal feature represents the same archetypical measure, but within a sub-environment defined by the modalities it depends on.

As an example, let’s take one of our active tasks available in the MSCopilot® application: the Dexterity Test.

The Dexterity Test is a specific task for Persons Living with Multiple Sclerosis (PLwMS) proposed in our medical device application MSCopilot®. This test measures the patient’s upper-limb dexterity and hand-eye coordination. During this test, PLwMS are instructed to use their finger to draw different shapes displayed on their phone.

Based on the raw screen data collected during the dexterity task, it may be possible to capture patients’ performance via the computed distance between the displayed shape and the user drawing, i.e., the similarity score. However, reducing the user performance to one unique measure introduces an inherent bias due to its aggregation across multiple shapes.

In fact, the hypothesis behind data aggregation implies that all sub-environments fall within the same representational category. Aggregating the similarity scores of the spiral and line shapes is therefore inappropriate, since the level of difficulty involved in tracing these two shapes differs.

Thus, a similarity score needs to be created for each shape. The displayed shape then constitutes a modality for the similarity feature.
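To illustrate (with hypothetical scores and column names), computing the feature per modality rather than globally is a simple group-by:

```python
import pandas as pd

# Hypothetical per-trial similarity scores from the Dexterity Test
df = pd.DataFrame({
    "shape": ["line", "line", "spiral", "spiral"],
    "similarity": [0.9, 0.8, 0.5, 0.7],
})

# A single aggregated score mixes shapes of different difficulty...
overall = df["similarity"].mean()

# ...so we compute one similarity feature per shape modality instead
per_shape = df.groupby("shape")["similarity"].mean()
print(per_shape)
```

Here each shape gets its own mean similarity, so the easier line shape no longer inflates the score of the harder spiral.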

Features can depend not just on one but on multiple modalities. In the dexterity test, for instance, we also know whether the shape was drawn with the dominant or the non-dominant hand.

Thus, an archetypical feature can be instantiated for multiple modalities, and these instances can be visualized, analyzed, and compared.

In the case of features with multiple modalities, visualizing their relationships can be quite challenging. In fact, traditional visualization tools allow us to study only one or two modalities at a time (when using multiple violin plots, for example). However, studying a higher number of modalities together, along with their intricate dependencies, can be a laborious task, as we will see shortly.

To go back to the dexterity test example, if multiple modalities are available, such as the dominant/non-dominant hand and the shape, we might be interested in looking at the impact of the type of hand used on the shape, and vice versa.

The idea is to be flexible about the order of the modalities in the visualization, so that their impact on the feature can be studied more easily.

Multimodal features visualization ideas

Visualization is an important step when exploring the information contained in a feature dataset. Consequently, the visualization tools used should be adapted to the features, which in our case are the multimodal ones. Several choices are available to explore these types of data, from basic histograms to fancy mosaic views.

Using the example of a test called MyEmotion, we will compare the adaptability of these traditional visualization tools by presenting this test’s features via a histogram, a mosaic view, and lastly, a parallel categorical diagram.

The MyEmotion Test is a specific task proposed in Redress, our mobile application for patients with major depressive disorder. It is currently being evaluated in a clinical study to assess its ability to predict antidepressant response in real-life. This project is a collaboration between Ad Scientiam and Janssen Pharmaceutica NV, part of the Janssen Pharmaceutical Companies of Johnson & Johnson. It was facilitated by Johnson & Johnson Innovation.

MyEmotion consists of recognizing the emotion seen on a human face that is quickly displayed on a phone screen. Four emotions {anger, happiness, sadness, fear} performed by professional actors {woman, man} are shown with expressions of varying degrees of intensity {very low, low, moderate, high, very high}.

Here, to evaluate the pertinence of each tool, we’ll visualize the correct answers given by a pool of healthy control subjects from a total of 214 evaluations.


Histogram

Let’s start with the histogram, which reports feature-value frequencies. We will visualize the correct-answers feature across multiple subsets, where each subset corresponds to one of the following modalities:

  • emotion: {angry, happy, sad, afraid}
  • actor: {woman, man}
  • intensity: {very low, low, moderate, high, very high}
  • answer: {correct, incorrect}

What can we interpret from this figure? Not as much as one could hope. Indeed, this representation is rather difficult to read. Furthermore, the links between the modality states, which would be especially interesting to explore, are missing, making this tool less useful. Let us consider the next option.

Mosaic view

Seeing that histograms may not be the most appropriate tool for this kind of visualization, let us try the mosaic view. This tool can project the same information as above, but in the form of proportional areas: the more frequent a feature value in the dataset, the larger its area.

Is it possible to glean insightful information from this visualization? Most likely not. While the link between the correct and incorrect answers appears here, the modality link between the intensity states is still missing, which limits the information gained.

Overall, the two visualization tools described above suffer from the same weaknesses:

  • Poor readability.
  • An unclear representation of the relationships between modality states.
  • A maximum of three projected modalities.

Indeed, in the presented examples, the overlapping or repeated axes lead to numerous information elements being displayed in the same figure subspace, which makes it hard to read the charts and to have a global view.

In addition, the intra-modality relationships are difficult to interpret because the final graph is static, making it difficult to reverse the modality order in the projection.

Finally, the maximum number of projected modalities is three, which is quite a limitation in real-world applications.

Considering the above limitations, let us now look at the third visualization option, the parallel categorical diagram.

Parallel Categorical Diagram

The parallel categorical diagram may be the better-suited visualization solution we have been looking for.

It allows the dynamic representation of multiple modalities, and the modality position can be switched at any time, guaranteeing the ability to study the modality impacts on the feature in any order.

What’s more, it is very readable: the projection runs linearly from left to right, and the ribbons grow with the frequency of each combination, making it easy to spot the strongest relationships between the modalities and the results.

Here, we use the plotly implementation (go.Parcats); comparable parallel-coordinates plots can also be built with matplotlib or pandas.

Finally, the proportions of each modality state across the other modalities can be visualized simply by interacting with its representation, as shown in the figure above. There is no limit on the number of projected modalities, nor on the number of states per modality. You only need common sense to keep the figure readable!

To sum up, we consider this to be quite a powerful and useful tool depending on the type of features you are studying. We wanted to share this insight with other data aficionados.

We hope you have learned something today. Stay tuned for more tips on data science.

Here are the code snippets for the figures displayed in the article. They assume a pandas DataFrame df holding the modality and feature-value columns. Enjoy :)


Histogram:

import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Frequency of each feature value for every modality combination
freq_df = df.groupby(
    ['modality_1', 'modality_2', 'modality_3', 'feature_value']
).size().reset_index(name='Count')

fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=("Modality 1 state 1", "Modality 1 state 2")
)
# Left panel: feature value 0.0; right panel: feature value 1.0
mod11_feat1_df = freq_df.loc[(freq_df['modality_1'] == 'state_1') & (freq_df['feature_value'] == 0.0)]
x = [mod11_feat1_df['modality_2'], mod11_feat1_df['modality_3']]
fig.add_trace(go.Bar(x=x, y=mod11_feat1_df['Count'], name='name'), row=1, col=1)
mod11_feat2_df = freq_df.loc[(freq_df['modality_1'] == 'state_1') & (freq_df['feature_value'] == 1.0)]
x = [mod11_feat2_df['modality_2'], mod11_feat2_df['modality_3']]
fig.add_trace(go.Bar(x=x, y=mod11_feat2_df['Count'], name='name'), row=1, col=2)
fig.show()


Mosaic view:

from statsmodels.graphics.mosaicplot import mosaic
import matplotlib.pyplot as plt

grouped_df = df.groupby(
    ['modality_1', 'modality_2', 'feature_value']
).size()
fig, ax = plt.subplots(figsize=(20, 15))
fig, _ = mosaic(
    grouped_df, ax=ax,
    labelizer=lambda k: grouped_df.loc[k] if k in grouped_df.index else '',
    title="answers accuracy"
)

Parallel categories:

import plotly.graph_objects as go

fig = go.Figure(go.Parcats(
    dimensions=[
        {'label': 'Modality 1', 'values': df.modality_1},
        {'label': 'Modality 2', 'values': df.modality_2},
    ]
))
fig.show()



Angéline Plaud, PhD in computer science
Ad Scientiam Stories

I’m a data scientist at Ad Scientiam, building innovative smartphone solutions certified as medical devices that provide a better understanding of pathologies.