MONAI Explores Learning-Based Medical Image Registration

MONAI has been working closely with DeepReg on learning-based medical image registration using PyTorch. In the latest release, MONAI v0.5.0, we are delighted to provide a set of essential tools for developing registration pipelines.

Medical Image Registration

Image registration, a process that spatially aligns one image with another, is important to many medical imaging applications. The two images being registered are often referred to as the moving and fixed images. An image registration algorithm generates a spatial transformation that can transform, or “warp”, the moving image onto the fixed image coordinates.

The ability to align two or more images allows the fusion of complementary information acquired from different imaging modalities (multimodal registration, e.g., MR and ultrasound), at different time points (longitudinal registration), or from different patients (inter-patient registration). Examples include tracking tumour growth across a series of MR scans and assessing treatment efficacy by comparing two CT scans acquired before and after treatment.

We’ve been working closely with the DeepReg team to implement the basic building blocks for medical image registration, drawing on their experience of implementing these features in DeepReg with TensorFlow to help port them to MONAI with PyTorch. Below you’ll find all of the image registration components built into the latest v0.5.0 release of MONAI. These components follow the MONAI paradigm of being composable and portable, and they integrate easily into existing PyTorch workflows.

MONAI Components for Image Registration

Differentiable Warping

MONAI v0.5 supports mapping the moving image coordinates to the fixed image coordinates using a spatial transformation with the Warp module. The spatial transformation can be a constrained parametric affine transformation (permitting, for example, rotation, translation, and scaling) or a dense displacement field (DDF). A DDF is a general representation that defines a transformation by a set of vectors, one displacement at each pixel/voxel location, which can be used to warp the moving image.
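To make the idea of DDF-based warping concrete, here is a minimal 2D sketch written in plain PyTorch rather than with the MONAI module. The function name warp_2d and the channel convention (channel 0 for row displacements, channel 1 for column displacements, both in pixels) are assumptions made for this illustration and may differ from the conventions MONAI uses internally.

```python
import torch
import torch.nn.functional as F


def warp_2d(moving: torch.Tensor, ddf: torch.Tensor) -> torch.Tensor:
    """Resample `moving` (B, C, H, W) at pixel positions offset by `ddf` (B, 2, H, W).

    Assumed convention: ddf[:, 0] holds row (y) displacements and
    ddf[:, 1] holds column (x) displacements, both in pixels.
    """
    _, _, h, w = moving.shape
    rows = torch.arange(h, dtype=moving.dtype, device=moving.device).view(h, 1)
    cols = torch.arange(w, dtype=moving.dtype, device=moving.device).view(1, w)
    # Sampling positions = identity grid + displacement, rescaled to [-1, 1].
    norm_y = 2.0 * (rows + ddf[:, 0]) / (h - 1) - 1.0
    norm_x = 2.0 * (cols + ddf[:, 1]) / (w - 1) - 1.0
    grid = torch.stack((norm_x, norm_y), dim=-1)  # (B, H, W, 2), ordered (x, y)
    return F.grid_sample(
        moving, grid, mode="bilinear", padding_mode="border", align_corners=True
    )


# A horizontal ramp image: every pixel stores its own column index.
height, width = 8, 10
moving = torch.arange(width, dtype=torch.float32).repeat(height, 1).view(1, 1, height, width)

# A zero DDF leaves the image unchanged.
ddf = torch.zeros(1, 2, height, width, requires_grad=True)
warped_identity = warp_2d(moving, ddf)

# Sampling half a pixel to the right raises interior ramp values by 0.5.
ddf_shift = torch.zeros(1, 2, height, width)
ddf_shift[:, 1] = 0.5
warped_shifted = warp_2d(moving, ddf_shift)

# The operation is differentiable with respect to the displacement field.
warped_identity.sum().backward()
```

Because the resampling is built from differentiable operations, gradients of an image-based loss can flow back through the warp to whatever produced the DDF, which is what makes end-to-end training of a registration network possible.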

In addition, MONAI v0.5 provides the DVF2DDF module to numerically integrate a dense (static) velocity field (DVF) to a diffeomorphic DDF, which constrains the resulting DDF to a special class of transformation desirable for some medical applications.
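One common way to integrate a static velocity field is the scaling-and-squaring scheme: scale the velocity down by a power of two, then repeatedly compose the resulting small displacement with itself. The sketch below illustrates that scheme in 2D with plain PyTorch. It is not the MONAI module itself; the names warp_2d and integrate_velocity, the channel convention, and the default of seven steps are assumptions chosen for this example.

```python
import torch
import torch.nn.functional as F


def warp_2d(image: torch.Tensor, ddf: torch.Tensor) -> torch.Tensor:
    """Bilinearly resample `image` (B, C, H, W) at pixel positions offset by `ddf` (B, 2, H, W)."""
    _, _, h, w = image.shape
    rows = torch.arange(h, dtype=image.dtype, device=image.device).view(h, 1)
    cols = torch.arange(w, dtype=image.dtype, device=image.device).view(1, w)
    norm_y = 2.0 * (rows + ddf[:, 0]) / (h - 1) - 1.0
    norm_x = 2.0 * (cols + ddf[:, 1]) / (w - 1) - 1.0
    grid = torch.stack((norm_x, norm_y), dim=-1)
    return F.grid_sample(
        image, grid, mode="bilinear", padding_mode="border", align_corners=True
    )


def integrate_velocity(dvf: torch.Tensor, num_steps: int = 7) -> torch.Tensor:
    """Scaling and squaring: turn a static velocity field into a displacement field."""
    # Scaling: start from a very small displacement, dvf / 2**num_steps.
    ddf = dvf / (2 ** num_steps)
    # Squaring: compose the displacement with itself num_steps times.
    for _ in range(num_steps):
        ddf = ddf + warp_2d(ddf, ddf)
    return ddf


# Sanity check: a spatially constant velocity is a pure translation,
# so the integrated displacement should equal the velocity itself.
dvf = torch.zeros(1, 2, 16, 16)
dvf[:, 0] = 1.5
dvf[:, 1] = -0.5
ddf = integrate_velocity(dvf)
```

Each squaring step doubles the effective integration time, so num_steps compositions cover the full unit time interval at the cost of only num_steps warps.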

ConvNet as a Spatial Transformation Estimator

Along with the DVF2DDF and Warp modules, MONAI v0.5 has implemented three commonly adopted networks for image registration: RegUNet, LocalNet, and GlobalNet. RegUNet and LocalNet have an encoder-decoder architecture to predict the general non-rigid transformation, either in DDFs or DVFs, while GlobalNet is an encoder to predict the parameters of the affine transformation.

Training with Stochastic Gradient Descent

Adopting a similar framework as in DeepReg, MONAI v0.5 supports both unsupervised and weakly-supervised training of registration networks.

Unsupervised training is driven by image similarity losses, such as LocalNormalizedCrossCorrelationLoss and GlobalMutualInformationLoss. When segmentation labels are available for corresponding anatomical structures or other regions of interest, weak supervision can be enabled by using one of the overlap measures, such as DiceLoss or DiceCELoss. Released together with MONAI v0.5 is a demo showing how to align lung CT scans and combine these two training losses.
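As a rough illustration of how an image similarity term and a label overlap term can be combined, the sketch below adds a soft Dice loss on warped labels to a mean squared error on warped images. Mean squared error stands in for the similarity losses named above only to keep the example short. The function names and the label_weight parameter are hypothetical and are not part of MONAI.

```python
import torch


def soft_dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """1 - Dice overlap, computed per sample and channel, then averaged."""
    dims = tuple(range(2, pred.ndim))  # all spatial dimensions
    intersection = (pred * target).sum(dim=dims)
    denominator = pred.sum(dim=dims) + target.sum(dim=dims)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice.mean()


def registration_loss(
    warped_image: torch.Tensor,
    fixed_image: torch.Tensor,
    warped_label: torch.Tensor,
    fixed_label: torch.Tensor,
    label_weight: float = 1.0,  # hypothetical weighting between the two terms
) -> torch.Tensor:
    """Unsupervised image term plus weakly-supervised label term."""
    image_term = torch.mean((warped_image - fixed_image) ** 2)
    label_term = soft_dice_loss(warped_label, fixed_label)
    return image_term + label_weight * label_term


torch.manual_seed(0)
fixed_image = torch.rand(1, 1, 32, 32)
fixed_label = torch.zeros(1, 1, 32, 32)
fixed_label[..., 8:16, 8:16] = 1.0

# Perfect alignment: both terms vanish.
aligned = registration_loss(fixed_image, fixed_image, fixed_label, fixed_label)

# Labels that no longer overlap: the Dice term approaches 1.
shifted_label = torch.roll(fixed_label, shifts=16, dims=-1)
misaligned = registration_loss(fixed_image, fixed_image, shifted_label, fixed_label)
```

In a real pipeline the warped image and warped label would both come from applying the same predicted transformation, so the label term steers the network even where image intensities alone are ambiguous.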

Also introduced in MONAI v0.5 is a new loss function called BendingEnergyLoss, which helps regularize registration network training. It is one example of a deformation regularizer, a class of losses that enforce smoothness of the predicted DDFs and are often helpful in unsupervised and weakly-supervised applications.
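The idea behind a bending energy penalty can be sketched in a few lines: approximate the second-order spatial derivatives of the displacement field and penalise their squared magnitude. The 2D version below uses central finite differences in plain PyTorch. It is an illustration in the spirit of such a regularizer, not MONAI's implementation, and the helper names are made up for this example.

```python
import torch


def gradient_2d(x: torch.Tensor):
    """Central differences of a (B, C, H, W) tensor along H and W, cropped to the interior."""
    d_h = (x[:, :, 2:, 1:-1] - x[:, :, :-2, 1:-1]) / 2.0
    d_w = (x[:, :, 1:-1, 2:] - x[:, :, 1:-1, :-2]) / 2.0
    return d_h, d_w


def bending_energy_2d(ddf: torch.Tensor) -> torch.Tensor:
    """Mean of squared second derivatives of a (B, 2, H, W) displacement field."""
    d_h, d_w = gradient_2d(ddf)
    d_hh, d_hw = gradient_2d(d_h)
    _, d_ww = gradient_2d(d_w)
    # The mixed derivative appears twice in the sum over second-order terms.
    return (d_hh ** 2 + d_ww ** 2 + 2.0 * d_hw ** 2).mean()


h = w = 16
ys = torch.arange(h, dtype=torch.float32).view(1, 1, h, 1).expand(1, 1, h, w)
xs = torch.arange(w, dtype=torch.float32).view(1, 1, 1, w).expand(1, 1, h, w)

# A displacement that is linear in the coordinates has no curvature, so no penalty.
affine_ddf = torch.cat((0.1 * ys + 0.2 * xs, -0.3 * ys + 1.0), dim=1)
affine_energy = bending_energy_2d(affine_ddf)

# A displacement that is quadratic in the row coordinate is penalised.
quadratic_ddf = torch.cat((0.5 * ys ** 2, torch.zeros_like(ys)), dim=1)
quadratic_energy = bending_energy_2d(quadratic_ddf)
```

Because only second derivatives are penalised, global translations and affine deformations incur essentially no cost, while sharp local folds in the field are discouraged.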

End-to-end Registration Examples

2D affine image registration

The 2D affine image registration tutorial illustrates a workflow of estimating affine transformation parameters between a pair of moving and fixed 2D images. To keep it lightweight and focus on demonstrating the core concepts, the “X-Ray hands” images from the MedNIST dataset are sampled before being randomly transformed to form the moving and fixed image pairs for training and testing.

In this workflow, pairs of gray-scale, intensity-normalised images are stacked along the channel dimension as the input to a GlobalNet. The GlobalNet then learns to estimate affine transformation parameters by minimising the mean squared error between the fixed image and the warped moving image. This is a case of unsupervised learning for affine registration.
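The workflow above can be sketched with a single training step in plain PyTorch. TinyAffineNet below is a hypothetical, deliberately small stand-in for an affine-predicting encoder and is not GlobalNet; the random tensors simply stand in for image pairs, and the identity initialisation is one reasonable choice rather than something taken from the tutorial.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyAffineNet(nn.Module):
    """Hypothetical stand-in for an encoder that predicts 2D affine parameters."""

    def __init__(self) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 8, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(8, 6)
        # Start from the identity transform so training begins with "no warp".
        nn.init.zeros_(self.head.weight)
        with torch.no_grad():
            self.head.bias.copy_(torch.tensor([1.0, 0.0, 0.0, 0.0, 1.0, 0.0]))

    def forward(self, moving: torch.Tensor, fixed: torch.Tensor):
        stacked = torch.cat((moving, fixed), dim=1)  # stack the pair along channels
        theta = self.head(self.features(stacked)).view(-1, 2, 3)
        grid = F.affine_grid(theta, moving.shape, align_corners=False)
        warped = F.grid_sample(moving, grid, align_corners=False)
        return warped, theta


torch.manual_seed(0)
moving = torch.rand(4, 1, 64, 64)              # placeholder moving images
fixed = torch.roll(moving, shifts=3, dims=-1)  # synthetic fixed images

net = TinyAffineNet()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

# One unsupervised training step: no ground-truth transformation is used.
warped, theta = net(moving, fixed)
loss = F.mse_loss(warped, fixed)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The only supervision signal is the intensity difference between the warped moving image and the fixed image, which is why this setup counts as unsupervised.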

In real-world applications, similar affine registration workflows have been widely used to estimate an initial global alignment efficiently. The initial registration step is then followed by or combined with a more refined, local, and non-rigid registration, such as those estimating DDFs or DVFs.

3D intra-subject lung CT registration

The 3D intra-subject lung CT registration tutorial is an example of registration between 3D lung CT images acquired at inspiration and expiration from a single patient. This type of intra-subject registration is helpful for tracking anatomical features of interest, such as airways in airflow analysis, or for compensating for motion during radiotherapy.

The tutorial showcases several features described above, including unsupervised and weakly-supervised losses, a deformation regulariser, non-rigid transformation based on DDFs, and 3D volumetric registration using real clinical images. To learn more, check out the provided notebook and the original DeepReg demo, which uses the same openly accessible dataset.


Medical image registration using deep learning is an interesting and active area of research. With the basic building blocks now implemented in PyTorch using MONAI, we will continue to strengthen the relevant software capability by drawing on the latest research outcomes and on experience from other open-source projects in this area, including DeepReg and VoxelMorph. We also warmly welcome suggestions and other contributions from the wider community.

About the Contributors

Yiwen Li (University of Oxford)

Yunguang Fu (University College London; InstaDeep)

Yipeng Hu (University College London)

