Does Artificial Intelligence Suffer from Mood Swings?

Massimo Walter Rivolta
Published in Italian AI Stories
6 min read, Oct 27, 2019

‘Italian AI Stories’ is a series, born from the collaboration of AI for People and the Italian Association for Machine Learning (IAML), that gives the stage to Italian AI researchers. Italian AI Stories allows researchers to share their work and their ideas on how it will impact the society they live in.

Are you an Italian AI researcher who wants to share your work with the public? You can contact Marta Ziosi or Simone Scardapane.

This fifth episode of our series is dedicated to exploring how AI can support psychiatrists in understanding neurological disorders. Today we learn about the research of Massimo W. Rivolta, currently a post-doctoral research fellow at the Department of Computer Science of the Università degli Studi di Milano, Milan, Italy. His main research interests are signal processing, feature extraction and machine learning applied to biomedical data. His scientific research focuses on understanding physiological phenomena and on finding relevant clinical biomarkers.

We definitely do not know whether an Artificial Intelligence (AI) might suffer from mood swings. However, we do know that Machine Learning (ML), a branch of AI, supports psychiatrists in understanding neurological disorders such as Schizophrenia and Bipolar Disorder (BD). In this article, we will focus on the latter.

BD is a severe mental illness affecting around 1–2% of the population [1], characterized by mood swings, with alternating depressive and manic episodes. In this context, Magnetic Resonance Imaging (MRI) has consistently shown that the brain structure is broadly affected in BD patients, specifically with a reduction in grey matter volume, ventricular enlargement, damage to the corpus callosum and a reduction of cortical thickness in the grey matter. In simpler words, BD changes the structure of certain regions of the brain. However, it is not clear which brain regions are most affected, how to measure such changes and how much the measurements overlap with those of healthy subjects.

In [2], my colleagues and I analyzed MRI images collected from forty-one patients affected by BD and thirty-four healthy subjects, all enrolled in a hospital in northern Italy. The main goal was to identify the brain regions most frequently affected by BD that could best differentiate patients from healthy subjects. The goal was very challenging and complex for at least three reasons. First, the brain is composed of many regions, defined according to their position, morphology and functionality. Indeed, depending on the definition of “brain region”, there exist multiple brain atlases. Which atlas to use was therefore our first problem. Second, the number of features that can be extracted from MRI images is practically unlimited. In this case, however, with a limited sample size and data collected from only one hospital, a “try ’em all” strategy was not advisable because we could easily fall into the trap of overfitting (i.e., selecting a feature that works very well in this population but very poorly in another one). Which features to extract was then our second problem. Third, the diagnosis of BD is usually performed using specific criteria from the American Psychiatric Association’s Diagnostic and Statistical Manual of Mental Disorders. Unfortunately, a diagnosis performed with the manual might suffer from inter-rater variability. In simpler words, the diagnosis might be uncertain and, consequently, the ML methodology must handle such an issue properly.

The first problem had a simple solution. Considering the limited sample size, we opted for a brain atlas that is commonly adopted in this context, which makes it easier to compare our results with those published by other researchers. We chose to segment the brain regions from the MRI images using the Desikan atlas [3], by means of the FreeSurfer software (http://surfer.nmr.mgh.harvard.edu/). In total, we obtained 58 segmented regions, 29 for each hemisphere.

Figure 1: Sketch of the cortical thickness extracted by the FreeSurfer Algorithm.

Regarding the second problem, it is well known that the average cortical thickness shrinks within certain brain regions in BD, and, as a good starting point, we extracted it from each of the 58 brain regions (Fig. 1). However, we also wanted to understand whether other characteristics of the brain cortex, beyond the average, might help differentiate between healthy subjects and BD patients. We therefore decided to extract the skewness of the cortical thickness in addition to the average. This gave us 116 features for each subject (58 regions, two statistics per region).
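To make the feature count concrete, here is a minimal sketch of how the two statistics could be computed per region. It is not the code used in [2]: the thickness values are synthetic, the region names (`region_00`, ...) and the function names are hypothetical, and the skewness is the plain population formula (third central moment divided by the 1.5 power of the variance).

```python
import random

def mean_and_skewness(values):
    """Return the mean and the population skewness of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return mean, skew

def extract_features(thickness_by_region):
    """Build one feature vector: [mean_1, skew_1, mean_2, skew_2, ...]."""
    features = []
    for region in sorted(thickness_by_region):
        mean, skew = mean_and_skewness(thickness_by_region[region])
        features.extend([mean, skew])
    return features

# Synthetic stand-in for one subject (assumption): 58 regions,
# 200 cortical thickness samples per region, in millimetres.
rng = random.Random(0)
subject = {
    f"region_{i:02d}": [rng.gauss(2.5, 0.3) for _ in range(200)]
    for i in range(58)
}

features = extract_features(subject)
print(len(features))  # 58 regions x 2 statistics = 116
```

A skewness close to zero means the thickness values within a region are spread symmetrically around their average; a markedly positive or negative value means the distribution has a longer tail on one side.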

Figure 2: Example of how close to each other the proposed ML algorithm might consider the subjects.

The third problem was very challenging since most ML algorithms validated with medical data require a precise diagnosis. In addition, many ML algorithms do not rely on the assumption that subjects with a certain diagnosis must be similar to each other in terms of the extracted features. In other words, the features of BD patients must be similar to each other and, at the same time, far from those of healthy subjects. Although this assumption might be controversial, it is a property commonly sought in the medical field. In order to account for both issues, we decided to apply a graph-based semi-supervised method [4] in combination with distance metric learning based on the large margin nearest neighbor approach [5] (Fig. 2). To put it simply, the method requires only a small portion of the samples (in this case, subjects) to be correctly labelled (diagnosed) and then propagates the labels to the other samples based on the distance measured between the features of pairs of subjects. The distance metric learning, instead, enhances the similarity within the same diagnosis group and enlarges the distance between different ones.
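The label propagation idea can be illustrated with a small sketch. It is a simplified version written in the spirit of the graph transduction game of [4], not the pipeline of [2]: each unlabelled subject holds a probability for every class and updates it with replicator dynamics, weighting the opinion of every other subject by their similarity. The metric learning step of [5] is not reproduced; as a stand-in assumption, similarity is a Gaussian kernel on the plain Euclidean distance, and the function names, the `sigma` parameter and the toy data are all hypothetical.

```python
import math

def gaussian_similarity(a, b, sigma=1.0):
    """Similarity in (0, 1]: close feature vectors score near 1."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * sigma ** 2))

def graph_transduction(features, labels, n_classes=2, n_iter=100, sigma=1.0):
    """labels[i] is a class index for labelled subjects, None otherwise."""
    n = len(features)
    # Fully connected similarity graph without self-loops.
    w = [[0.0 if i == j else gaussian_similarity(features[i], features[j], sigma)
          for j in range(n)] for i in range(n)]
    # Labelled subjects start (and stay) one-hot; unlabelled ones start uniform.
    x = [[1.0 / n_classes] * n_classes if lab is None
         else [1.0 if c == lab else 0.0 for c in range(n_classes)]
         for lab in labels]
    for _ in range(n_iter):
        new_x = []
        for i in range(n):
            if labels[i] is not None:
                new_x.append(x[i])
                continue
            # Support for each class coming from similar subjects.
            payoff = [sum(w[i][j] * x[j][c] for j in range(n))
                      for c in range(n_classes)]
            total = sum(x[i][c] * payoff[c] for c in range(n_classes))
            if total > 0:
                # Replicator dynamics update.
                new_x.append([x[i][c] * payoff[c] / total
                              for c in range(n_classes)])
            else:
                new_x.append(x[i])
        x = new_x
    return [max(range(n_classes), key=lambda c: xi[c]) for xi in x]

# Toy data (assumption): two well-separated groups, one labelled subject each.
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
            [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
labels = [0, None, None, 1, None, None]
predicted = graph_transduction(features, labels)
print(predicted)  # [0, 0, 0, 1, 1, 1]
```

In this toy example a single labelled subject per group is enough, because the groups are far apart. With real data the groups overlap, which is why a learned distance that pulls subjects with the same diagnosis together matters.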

Using the selected methodologies, we performed a feature selection (Greedy Forward approach) by maximizing the validation accuracy with a pool of at most 5 features among the 116 (we capped the number of features to further limit overfitting). We found that most of the selected brain regions were already known to be involved in BD, suggesting that the proposed ML pipeline was relevant and likely avoided overfitting on the small population. In addition, another important result was that the skewness of the cortical thickness proved to be as powerful a feature for the diagnosis of BD as the average. Although this requires further investigation, if the skewness is found to be relevant in other populations as well, it may mean that BD shrinks the cortex in a non-uniform way, as one might expect, for example, at the initial stage of the disorder.
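A greedy forward selection can be sketched in a few lines. The version below is only an illustration under stated assumptions: the score is the leave-one-out accuracy of a 1-nearest-neighbour classifier, used here as a stand-in for the validation accuracy of the semi-supervised method in [2], and the function names and the toy data are hypothetical.

```python
def loo_1nn_accuracy(X, y, cols):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the feature columns listed in `cols`."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][c] - X[j][c]) ** 2 for c in cols)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

def greedy_forward_selection(X, y, max_features=5, score=loo_1nn_accuracy):
    """Add one feature at a time, keeping the one that improves the score
    the most; stop at max_features or when nothing improves."""
    selected, best_score = [], 0.0
    remaining = set(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(score(X, y, selected + [f]), f) for f in sorted(remaining)]
        cand_score, cand = max(scored, key=lambda t: t[0])
        if cand_score <= best_score:
            break  # no candidate improves the score
        selected.append(cand)
        remaining.remove(cand)
        best_score = cand_score
    return selected, best_score

# Toy data (assumption): only the feature in column 1 separates the classes.
X = [
    [0.5, 0.1, 0.9],
    [0.1, 0.2, 0.4],
    [0.9, 0.0, 0.1],
    [0.4, 1.0, 0.8],
    [0.8, 1.1, 0.2],
    [0.2, 0.9, 0.5],
]
y = [0, 0, 0, 1, 1, 1]

selected, best_score = greedy_forward_selection(X, y, max_features=5)
print(selected, best_score)  # [1] 1.0
```

Capping `max_features` and stopping as soon as the score no longer improves both keep the selected subset small, which matters when the number of subjects is limited.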

To conclude, a proper selection of an ML algorithm in combination with knowledge of the medical domain helped to shed light on how BD affects the structure of the human brain.

Understanding what might cause BD and what BD affects is of fundamental importance to counteract the disorder, to improve the quality of life of patients and their close relatives, or, even better, to prevent it. Although there is still a long way to go before we have a full characterization of BD and its effects, research teams worldwide (including ours) have been spending time and resources to achieve such an important goal.

References:
[1] Oswald P., Souery D., Kasper S., Lecrubier Y., Montgomery S., Wyckaert S., Zohar J., Mendlewicz J., Current issues in bipolar disorder: a critical review, Eur Neuropsychopharmacol 2007;17:687–695.
[2] Squarcina L., Dagnew T. M., Rivolta M. W., Bellani M., Sassi R., Brambilla P., Automated cortical thickness and skewness feature selection in bipolar disorder using a semi-supervised learning method, J Affect Disord 2019;256:416–423.
[3] Desikan R. S., Ségonne F., Fischl B., Quinn B. T., Dickerson B. C., Blacker D., Buckner R. L., Dale A. M., Maguire R. P., Hyman B. T., Albert M. S., Killiany R. J., An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest, Neuroimage 2006;31:968–980.
[4] Erdem A., Pelillo M., Graph transduction as a noncooperative game, Neural Comput 2012;24:700–723.
[5] Weinberger K. Q., Saul L. K., Distance metric learning for large margin nearest neighbor classification, J Mach Learn Res 2009;10:207–244.
