A Bayesian approach to unsupervised segmentation

Julien Roussel
8 min read · May 25, 2022


Application in cerebral oncology

Context

Within the framework of the EpiBrainRad project, Quantmetry brought its computer vision expertise to IRSN in order to contribute to the study of the effects of radiotherapy on patients with brain tumors. More precisely, we proposed a data pipeline that quantifies the evolution of several biomarkers over time from magnetic resonance images (MRI). The 135 patients of a cohort built within a collaborative project led by IRSN are followed in a longitudinal study. These patients suffer from high-grade gliomas, which reduce their life expectancy to a few years at most and severely degrade their quality of life. Treatment consists of tumor removal followed by targeted radiation therapy, which should slow the worsening of symptoms.

The input data for this study are the radiation doses to which the patients are subjected, together with various covariates such as cognitive effects. This radiotherapy has significant side effects, in particular cerebral atrophy and leukoencephalopathy, a degradation of the white matter (the brain is mainly composed of two types of tissue, grey matter and white matter). The study should allow these effects to be quantified more precisely. We focus on measurements of brain volume and of the volume affected by leukoencephalopathy over time, which will feed a dose-response study conducted by the IRSN.

Aims

Automated brain MRI processing appears to be a requirement for such a study, in order to reduce the currently very time-consuming manual computation of such indicators on a 3D reconstruction of the brain. Manual processing of 2D MRI slices requires considerable availability from physicians (radiologists, neurologists…), and building a 3D representation by hand is practically impossible. Results obtained over the last decades on the automated treatment of brain images give hope that, in the context of this study, such a task can be achieved. This work therefore describes an applied attempt, in a purely unsupervised learning setting, to produce such results at the current state of the art. It relies heavily on Bayesian techniques to uncover the hidden brain structure.

Data

The dataset covers 168 patients with leukoencephalopathy. Across these patients and their successive exams, we observed a wide variety of MRI modalities (the main ones being T1, T2 and FLAIR) and acquisition parameters. The modalities differ in which tissues appear as a (dark) hyposignal. FLAIR is a variant of T2 in which the signal originating from cerebrospinal fluid has been suppressed. Among these modalities, we identified FLAIR as the most suitable for automatic segmentation.

Raw MRIs typically suffer from intensity bias and many other defects, mainly due to experimental heterogeneity, and to our knowledge there is no single formalized approach to correct such data. Moreover, none of our data were labelled or annotated.

Brain extraction

Before proceeding with the segmentation of brain tissues, we must isolate the region of the head we intend to study. The region of interest is composed exclusively of brain tissue; however, in an MRI sequence of the brain, other tissues are visible: muscles, fat, eyes… These elements provide no additional information for the study and may be a source of noise, so we need to discard them.

Original image: toy example (left) and real case (right)

Otsu thresholding: This operation divides the voxels into two classes using an intensity threshold that maximizes the inter-class variance (equivalently, minimizes the intra-class variance). This step requires the brain to be significantly brighter than the rest of the head, as is the case for the FLAIR sequences we considered.

Morphological opening: This standard operation removes voxels that are not in the bulk of the set. We use a ball of radius 2 mm as the footprint, which means we remove details whose characteristic length is smaller than 2 mm.

Largest connected component: This operation exploits the fact that the brain is a single connected piece. Thanks to the opening operation, the bulk of the brain should no longer be connected to other bright areas such as parts of the skull, so those voxels are dropped.

Closing: Symmetric to the opening, this morphological operation includes in the extracted set any voxel surrounded by voxels already labelled as part of the brain. The brain furrows, which contain cerebrospinal fluid, are included at this stage.

Filling holes: We conclude the extraction algorithm by filling the ventricles. To this end we detect the bulk of the brain using a Gaussian blur and include this part. The whole pipeline is sketched in the code below.
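As an illustration, here is a minimal sketch of such an extraction pipeline using scikit-image and SciPy. The function name, the voxel-size handling and the use of binary_fill_holes in place of the Gaussian-blur step are our own simplifications, not the exact implementation used in the project:

import numpy as np
from scipy.ndimage import binary_fill_holes
from skimage.filters import threshold_otsu
from skimage.measure import label
from skimage.morphology import ball, binary_closing, binary_opening

def extract_brain(volume, voxel_size_mm=1.0):
    """Rough brain mask from a 3D FLAIR volume (numpy array)."""
    # 1. Otsu thresholding: keep voxels brighter than the Otsu threshold.
    mask = volume > threshold_otsu(volume)
    # 2. Opening with a ~2 mm ball removes details thinner than 2 mm.
    radius = max(1, round(2.0 / voxel_size_mm))
    mask = binary_opening(mask, ball(radius))
    # 3. Keep only the largest connected component (the brain bulk).
    labels = label(mask)
    sizes = np.bincount(labels.ravel())[1:]  # skip background label 0
    mask = labels == (np.argmax(sizes) + 1)
    # 4. Closing re-includes voxels surrounded by brain (furrows, CSF).
    mask = binary_closing(mask, ball(radius))
    # 5. Fill the remaining interior holes (ventricles).
    return binary_fill_holes(mask)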

Final segmentation

Brain segmentation

We turned to an unsupervised Gaussian segmentation method to split the brain's voxels into 4 classes (1: cerebrospinal fluid, 2: grey matter, 3: white matter, 4: leukopathy). Bayes' formula expresses the posterior probability of a given segmentation X as the product of two terms: a likelihood measuring how well the intensity levels match the classes, and a prior measuring the spatial consistency of the segmentation.
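In symbols, writing Y for the observed intensities and X for the labeling, this decomposition reads:

π(X | Y) ∝ π(Y | X) · π(X).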

Likelihood: Gaussian mixture

We assume that the intensity of voxels of a given class k (1, 2, 3 or 4) follows a Gaussian distribution with mean μ_k and variance σ²_k, which are hidden variables. The likelihood term relies solely on the intensity distributions, without taking spatial consistency into account. Denoting by

θ = (μ_1, σ²_1, …, μ_4, σ²_4)

the unknown parameters, the likelihood takes the following form:

π(Y | X, θ) = ∏_s (2π σ²_{X_s})^(−1/2) exp( −(Y_s − μ_{X_s})² / (2σ²_{X_s}) ),

where the product runs over the voxels s, Y_s is the observed intensity of voxel s and X_s its class.

Prior: Hidden Markov random field model

The prior term models the a priori knowledge that tissues form coherent regions rather than scattered voxels. The prior probability is a decreasing function of the area of contact between the different classes; the simplest choice is to make the log-prior log π(X) proportional to minus the area of the interface.

We can also set a different weight v(k, k′) for each pair of tissues, in order to penalize certain types of interface more heavily. For example, white matter should not be in contact with cerebrospinal fluid, so a large penalty can be assigned to that pair.
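Concretely, writing s ∼ t for pairs of neighbouring voxels, such a prior can be written explicitly. The symmetric version with a single smoothing strength β > 0 is the classical Potts model (β is our notation, introduced here for illustration):

log π(X) = −β ∑_{s∼t} 1[X_s ≠ X_t] + const,

and the pairwise weights generalize it to

log π(X) = −∑_{s∼t} v(X_s, X_t) + const.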

Sampling algorithm

Once the posterior is defined, we can compute statistics of interest such as the mean volume occupied by each tissue, together with a credible interval. The mean is a high-dimensional integral and is computed with a sampling method: we draw a large number N of labelings X^(1), …, X^(N) of the brain according to the posterior distribution and estimate the mean volume of class k by counting the voxels s of class k in each sample:

V̂_k = (1/N) ∑_{n=1}^{N} ∑_s 1[X^(n)_s = k],

and we estimate the credible interval using the following variance estimator:

σ̂²_k = (1/(N−1)) ∑_{n=1}^{N} ( ∑_s 1[X^(n)_s = k] − V̂_k )².

In the studied model, the parameter set θ defining the Gaussian intensity mixture is unknown, but we can overcome this issue with a Gibbs sampler. Starting from an initial guess, it alternately samples the variables θ and X:

θ^(n+1) ∼ π(θ | X^(n), Y),    X^(n+1) ∼ π(X | θ^(n+1), Y).

The prior π(θ) on the parameters consists of Gaussian laws for the μ_k and inverse-Gamma laws for the σ²_k, encoding a priori knowledge on the intensities of the different tissues. The means and variances of these laws are called hyperparameters and are set using expert knowledge.
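To make the scheme concrete, here is a minimal single-site Gibbs sampler in Python. It is a sketch under simplifying assumptions (a flat list of voxels with a precomputed neighbor structure, conjugate Gaussian and inverse-Gamma updates, a single Potts weight β instead of the full v(k, k′) matrix); the function name and default hyperparameters are ours, not the project's implementation:

import numpy as np

rng = np.random.default_rng(0)

def gibbs_segment(y, neighbors, K=4, beta=1.0, n_iter=100,
                  mu0=None, tau0=1.0, a0=2.0, b0=1.0):
    """Single-site Gibbs sampler for a K-class Gaussian mixture
    with a Potts spatial prior. y: (S,) voxel intensities;
    neighbors: list of arrays of neighboring voxel indices."""
    S = y.size
    x = rng.integers(K, size=S)               # random initial labeling
    mu0 = np.linspace(y.min(), y.max(), K) if mu0 is None else np.asarray(mu0)
    mu, var = mu0.copy(), np.full(K, y.var())
    samples = []
    for _ in range(n_iter):
        # Sample theta | x, y with conjugate updates.
        for k in range(K):
            yk = y[x == k]
            n = yk.size
            # Gaussian posterior for mu_k (prior N(mu0_k, var_k / tau0)).
            mean = (tau0 * mu0[k] + yk.sum()) / (tau0 + n)
            mu[k] = rng.normal(mean, np.sqrt(var[k] / (tau0 + n)))
            # Inverse-Gamma posterior for var_k (prior IG(a0, b0)).
            a = a0 + 0.5 * n
            b = b0 + 0.5 * np.sum((yk - mu[k]) ** 2)
            var[k] = 1.0 / rng.gamma(a, 1.0 / b)
        # Sample x | theta, y, one voxel at a time.
        for s in range(S):
            loglik = -0.5 * np.log(var) - (y[s] - mu) ** 2 / (2 * var)
            # Potts prior: reward classes that agree with the neighbors.
            agree = np.array([(x[neighbors[s]] == k).sum() for k in range(K)])
            logp = loglik + beta * agree
            p = np.exp(logp - logp.max())
            x[s] = rng.choice(K, p=p / p.sum())
        samples.append(x.copy())
    return samples, mu, var

The volume statistics of the previous section then follow directly from the returned samples: for instance np.mean([(x == k).sum() for x in samples]) estimates the mean number of voxels of class k, and the empirical variance over samples gives the credible interval.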

In terms of qualitative results, we can see below that we obtain a reasonable segmentation (1: cerebrospinal fluid, 2: grey matter, 3: white matter, 4: leukopathy). However, we encounter robustness issues on low-quality images and a sensitivity to intensity biases (here the top of the image is darker than the bottom).

Combining T1 and FLAIR measures

The main difficulty of the task comes from the low contrast between the tissues in both the FLAIR and the T1 images.

In order to improve the separability of the different tissues, we registered the T1 MRIs onto the FLAIR ones so as to characterize each voxel with two intensities instead of one. The 2D histogram shows a clear improvement, which confirms the interest of using both channels simultaneously.
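As a sketch, assuming flair and t1 are the co-registered volumes and brain_mask is the mask from the extraction step (all argument names are ours), the joint histogram can be built as follows:

import numpy as np

def joint_histogram(flair, t1, brain_mask, bins=128):
    """2D intensity histogram over the brain voxels of two
    co-registered channels."""
    f = flair[brain_mask]   # FLAIR intensity of each brain voxel
    t = t1[brain_mask]      # matching T1 intensity after registration
    # Well-separated modes in the joint histogram indicate that the two
    # channels discriminate the tissues better than either alone.
    return np.histogram2d(f, t, bins=bins)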

This approach is however more cumbersome: first, the registration step can prove unstable and is time-consuming; second, it requires both images to be clean. In this case, using more data sources brings more performance but is less robust to variability in data quality.

Conclusion

In this article we presented a statistical method for unsupervised segmentation. We recall that two types of hyperparameters play a crucial part in the model and should be dictated by expert knowledge: the interface weights v(k, k′) of the label prior π(X), and the Gaussian and inverse-Gamma laws forming the prior π(θ).

Quantmetry

Pioneering and independent since its creation in 2011, Quantmetry is a French pure-player consulting firm in artificial intelligence, driven by the desire to offer superior data governance and state-of-the-art AI solutions.
