Catalyst.Neuro: A 3D Brain Segmentation Pipeline for MRI

Catalyst Team
Jul 2, 2021 · 6 min read

Authors: Kevin Wang, Alex Fedorov, Sergey Plis, Sergey Kolesnikov.

Catalyst and TReNDS have been working together on applications of deep learning for neuroimaging and brain dynamics. A recent product of this collaboration is Catalyst.Neuro, which reimplements the brain segmentation pipeline described in "End-to-end learning of brain tissue segmentation from imperfect labeling" and "An (almost) instant brain atlas segmentation for large-scale studies". This post describes the fundamental concepts implemented in Catalyst.Neuro and compares deep learning models on the brain segmentation task.

Note. Catalyst is a PyTorch framework for Deep Learning research and development. You get a training loop with metrics, model checkpointing, advanced logging, and distributed training support without the boilerplate. An introduction to Catalyst can be found here.

Brain Segmentation

Structural magnetic resonance imaging (sMRI) is a non-invasive technique for examining the anatomy and pathology of the brain. Segmenting a structural MRI into tissue types or functional regions is an important processing step that enables subsequent inferences about tissue changes in development, aging, and disease. Schizophrenia, multiple sclerosis, and dementia are just a few of the neurological and psychiatric conditions associated with abnormal degeneration of brain regions, which is reflected in volume changes relative to healthy control subjects of similar age and gender.

Deep Learning for Brain Segmentation

Brain segmentation has previously been accomplished with a pipeline of iterative statistical methods, including Markov Random Fields (FreeSurfer). However, this approach takes hours to process a single brain and can produce different results depending on initialization. With recent advances in deep learning, medical imaging has seen a large number of applications for classification (benign vs. malignant tumors), regression (biological age estimation), and segmentation (nuclei segmentation). The current standard medical segmentation model (U-Net) is a convolutional neural network architecture consisting of an encoder and a decoder.

Brain Segmentation with Catalyst.Neuro

Catalyst.Neuro implements a brain segmentation pipeline using the Mindboggle dataset to compare U-Net with the MeshNet (Dilated 3D CNN) architecture. With minimal preprocessing, MeshNet performs inference up to 9x faster and is >300x smaller while maintaining a competitive DICE score vs. a U-Net baseline. Interactive tutorials are available via Google Colab for training and inference and are located here.


Mindboggle MRIs and Manual Labels

Expert manual labeling is the gold standard for labeling brain segments in MRIs. Complete labeling of a single MRI scan can take an expert 2–3 days. Manual labeling is also prone to inconsistency, resulting in intra- and inter-observer variability.

Manual segmentation by multiple observers of a colorectal liver metastasis on an axial slice of a CT scan.

We use Mindboggle to demonstrate our MRI segmentation pipeline. Mindboggle is the largest and most complete set of free, publicly accessible, manually labeled human brain images.

You can download the Mindboggle MRI files and labels from the Open Science Framework using osfclient, as shown below:

osf -p 9ahyp clone Mindboggle_data
mkdir -p data
cp -r Mindboggle_data/osfstorage/Mindboggle101_volumes/ data/Mindboggle_data

The Mindboggle developers also have some impressive visualization projects, including one where you can visualize a scan in 3D and get the associated volumetric statistics.

Human Connectome Project MRIs and FreeSurfer Labels

Due to the time and expense of manual labeling mentioned above, the de facto standard for brain segmentation is automated tools. The most popular tool is FreeSurfer, which is based on a number of iterative traditional statistical methods and Markov Random Fields. FreeSurfer has ~91% DICE overlap with manual labels and takes hours to process a single brain. We use the FreeSurfer-processed data from the Human Connectome Project (HCP) to train our pretrained atlas and gray-white matter models. The HCP dataset is an open-access dataset with multi-modal brain imaging data for healthy young-adult subjects.


Only minimal preprocessing and data augmentation are required to train our brain segmentation models:

  1. For the example Mindboggle pipeline, we omitted the image normalization, but for our pretrained models, we normalize images to a 1x1x1 mm voxel size. Below is the command to do this using mri_convert from FreeSurfer:
     mri_convert *brainDir*/t1.nii *brainDir*/T1.nii.gz -c
  2. Zero padding is then applied to get the volume dimensions of 256x256x256.
  3. Finally, min-max normalization is applied to the volume.
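
As a rough illustration of steps 2 and 3, here is a minimal NumPy sketch. The function names are hypothetical, and splitting the padding evenly between both sides of each axis is an assumption, since the post does not say where the zeros are added. The demo uses a small target size to stay light; the pipeline itself pads to 256.

```python
import numpy as np

def pad_to_cube(volume, size=256):
    """Zero-pad a 3D volume so that every axis has length `size`."""
    pads = []
    for dim in volume.shape:
        if dim > size:
            raise ValueError(f"axis of length {dim} exceeds target size {size}")
        total = size - dim
        # Assumption: padding is split between both sides of the axis.
        pads.append((total // 2, total - total // 2))
    return np.pad(volume, pads, mode="constant", constant_values=0)

def min_max_normalize(volume, eps=1e-8):
    """Linearly rescale intensities to the [0, 1] range."""
    volume = volume.astype(np.float32)
    lo, hi = volume.min(), volume.max()
    # eps guards against division by zero for a constant volume.
    return (volume - lo) / max(hi - lo, eps)

# Toy volume standing in for a T1 scan (small size to keep the demo fast).
rng = np.random.default_rng(0)
toy = rng.uniform(0, 1000, size=(20, 28, 32))
prepared = min_max_normalize(pad_to_cube(toy, size=32))
print(prepared.shape, prepared.min(), prepared.max())
```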


In order to fit the model into the memory of a single GPU, we subsample 38x38x38 inputs from the 256x256x256 MRI volume. We sample subvolume locations from a Gaussian distribution with its mean at the center of the volume and a diagonal covariance of 50. This is a simple trick to avoid oversampling the background class, since the center of the volume is often the center of the brain.
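
The sampling scheme could look roughly like the sketch below. It is not the Catalyst.Neuro implementation: the function names are hypothetical, the value 50 is read literally as the per-axis variance, and clipping the coordinates so that each subvolume stays inside the volume is an added assumption.

```python
import numpy as np

def sample_subvolume_corners(n, volume_shape=(256, 256, 256),
                             sub_shape=(38, 38, 38), variance=50.0, seed=None):
    """Draw `n` subvolume start coordinates around the volume center."""
    rng = np.random.default_rng(seed)
    volume_shape = np.asarray(volume_shape)
    sub_shape = np.asarray(sub_shape)
    center = volume_shape / 2.0
    # Diagonal covariance: each axis is sampled independently.
    centers = rng.normal(loc=center, scale=np.sqrt(variance), size=(n, 3))
    corners = np.rint(centers - sub_shape / 2.0).astype(int)
    # Assumption: clip so every subvolume lies fully inside the volume.
    return np.clip(corners, 0, volume_shape - sub_shape)

def extract(volume, corner, sub_shape=(38, 38, 38)):
    """Cut one subvolume out of the full volume."""
    x, y, z = corner
    dx, dy, dz = sub_shape
    return volume[x:x + dx, y:y + dy, z:z + dz]

volume = np.zeros((256, 256, 256), dtype=np.uint8)  # placeholder MRI volume
corners = sample_subvolume_corners(4, seed=0)
patches = [extract(volume, c) for c in corners]
print(corners)
print(patches[0].shape)
```

Because the samples concentrate near the center of the volume, most patches overlap brain tissue rather than empty background.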

U-Net vs. MeshNet


U-Net is a convolutional neural network architecture consisting of an encoder and decoder designed for segmentation problems. The encoder shrinks image information to compact representations and the decoder then restores the output to the size of the input. Connections between the encoder and the decoder allow for the flow of higher resolution features. When compared to MeshNet, U-Net has drawbacks in terms of speed and size.


MeshNet is a simple convolutional neural network architecture that takes advantage of dilated kernels to extend the receptive field of the network without a corresponding increase in parameters. As a result, the model is >300x smaller vs. a U-Net baseline.
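
To make the point about dilation concrete, the short sketch below compares the receptive field of a stack of plain 3x3x3 convolutions with a dilated stack of the same depth. The dilation schedule and the channel count are hypothetical values chosen for illustration, not the actual MeshNet configuration.

```python
def conv3d_params(in_ch, out_ch, kernel=3):
    """Weights plus biases of a 3D convolution. Dilation does not appear here."""
    return in_ch * out_ch * kernel ** 3 + out_ch

def receptive_field(dilations, kernel=3):
    """Per-axis receptive field of stacked stride-1 (dilated) convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel - 1) * d
    return rf

dilated = [1, 1, 2, 4, 8, 1, 1]  # hypothetical dilation schedule
plain = [1] * len(dilated)       # same depth, no dilation

print(receptive_field(plain))    # 15
print(receptive_field(dilated))  # 37
print(conv3d_params(16, 16))     # identical for both stacks: 6928 per layer
```

Both stacks have exactly the same number of parameters per layer, yet the dilated one sees a context more than twice as wide along each axis.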


To segment a single brain volume, we perform the following inference steps, where the amount of sampling depends on the number of subvolumes requested. We recommend at least 512 subvolumes, as there is a positive relationship between the number of subvolumes and the DICE segmentation score.

  1. Extract all non-overlapping subvolumes from the full MRI volume.
  2. Sample additional subvolumes from a Gaussian distribution with a mean set at the center of the volume and a diagonal covariance of 50 until the total number of subvolumes is met.
  3. Predict for all subvolumes.
  4. Stitch the predictions together into a final predicted volume.
  5. Finally, the majority vote of the subvolume predictions determines the class for every voxel.
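
The steps above can be sketched in NumPy as follows. This is an illustrative outline under assumptions, not the library's code: `grid_corners`, `predict_volume`, and `predict_fn` are hypothetical names, and a simple threshold stands in for a trained model.

```python
import numpy as np

def grid_corners(volume_shape, sub_shape):
    """Start coordinates of all non-overlapping subvolumes that fit in the volume."""
    # Trailing voxels that do not fit a full subvolume are not covered by the grid.
    ranges = [range(0, v - s + 1, s) for v, s in zip(volume_shape, sub_shape)]
    return [(x, y, z) for x in ranges[0] for y in ranges[1] for z in ranges[2]]

def predict_volume(volume, corners, sub_shape, predict_fn, n_classes):
    """Predict every subvolume and resolve overlaps by per-voxel majority vote."""
    votes = np.zeros((n_classes,) + volume.shape, dtype=np.int32)
    dx, dy, dz = sub_shape
    for x, y, z in corners:
        patch = volume[x:x + dx, y:y + dy, z:z + dz]
        labels = predict_fn(patch)  # integer class per voxel, same shape as patch
        for c in range(n_classes):
            votes[c, x:x + dx, y:y + dy, z:z + dz] += (labels == c)
    # Voxels that received no votes fall back to class 0.
    return votes.argmax(axis=0)

# Toy example: an 8x8x8 volume with a bright cube in the middle.
toy = np.zeros((8, 8, 8))
toy[2:6, 2:6, 2:6] = 1.0
threshold = lambda patch: (patch > 0.5).astype(int)  # stand-in for a model

# Grid subvolumes plus one extra overlapping sample near the center.
corners = grid_corners(toy.shape, (4, 4, 4)) + [(2, 2, 2)]
segmentation = predict_volume(toy, corners, (4, 4, 4), threshold, n_classes=2)
print(segmentation.sum())
```

Accumulating per-class vote counts performs the stitching and the majority vote in a single pass, at the cost of holding one counter volume per class in memory.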

This inference process is illustrated below.

Quantitative Evaluation

MeshNet maintains a competitive DICE score vs. U-Net on the gray matter/white matter labeled datasets as well as on the Mindboggle datasets when trained with the same number of brains. It also performs inference up to 9x faster while being >300x smaller.
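
For reference, the DICE score of one class measures the overlap between predicted and reference voxels as 2|A∩B| / (|A| + |B|). Below is a minimal NumPy sketch. The function name is hypothetical, and returning 1.0 when a class is absent from both volumes is a convention chosen for this example.

```python
import numpy as np

def dice_score(pred, target, label):
    """DICE overlap for a single class label between two label volumes."""
    a = (pred == label)
    b = (target == label)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # class absent from both volumes (convention of this sketch)
    return 2.0 * np.logical_and(a, b).sum() / denom

pred = np.array([0, 1, 1, 1])
target = np.array([0, 0, 1, 1])
print(dice_score(pred, target, label=1))  # 2 * 2 / (3 + 2) = 0.8
```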

MeshNet vs. U-Net DICE
MeshNet vs. U-Net Performance

Brain Segmentation Examples

Below are some example MRI segmentations. If you want to submit a normalized T1 image for brain segmentation, go to brainchop.

Input Slice

U-Net Gray Matter White Matter Segmentation

MeshNet Gray Matter White Matter Segmentation

MeshNet with Dropout obtains a lower DICE score due to incorrectly segmenting the skull and the neck areas. However, it looks like the ventricles and some other structures improve. The neck can easily be removed via post-processing so, in practice, the lower DICE score would not be a problem.

MeshNet Atlas Segmentation


In this post, we described the fundamental concepts implemented in Catalyst.Neuro and compared deep learning models on the brain segmentation task. If you’re interested in exploring the models interactively, tutorials are available via Google Colab for training and inference. If you’re interested in Catalyst, feel free to join the community Slack. If you’re interested in TReNDS, feel free to write to us about collaboration. Thank you!