Architectures for Medical Image Segmentation [Part 1: UNet]

Shambhavi Malik
Published in CodeX
Jun 15, 2021
Photo by Uriel SC on Unsplash

Medical image segmentation is an important area of medical image analysis and is necessary for diagnosis, monitoring, and treatment. Deep learning-based methods have outperformed traditional methods on medical image segmentation tasks. In this series, I'll go over UNet and several architectures derived from it that are used for medical image segmentation.

Basic UNet

Any UNet has 2 parts:

  1. Contracting Path [Encoder] (left side)
  2. Expansive Path [Decoder] (right side)

The encoder/contracting path follows the typical architecture of a convolutional network and extracts feature maps from the image. It repeatedly applies two 3x3 unpadded convolutions, each followed by a rectified linear unit (ReLU), and then a 2x2 max pooling operation with stride 2 for downsampling. At each downsampling step, the number of feature channels is doubled and the spatial dimensions are halved.
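To make the shape bookkeeping concrete, here is a minimal sketch in plain Python that traces the spatial size and channel count through the encoder. The 572x572 input, the 64 starting channels, and the 4 downsampling levels are assumptions taken from the configuration in the original UNet paper, and `trace_encoder` is just a hypothetical helper for illustration, not part of any library.

```python
def trace_encoder(size=572, base_channels=64, depth=4):
    """Trace (stage, spatial size, channels) through a UNet-style encoder.

    Assumes unpadded 3x3 convolutions and 2x2 max pooling with stride 2.
    The default values are assumptions based on the original UNet paper.
    """
    trace = []
    channels = base_channels
    for level in range(1, depth + 1):
        size -= 4  # two unpadded 3x3 convs, each trims 2 pixels per dimension
        trace.append((f"level {level} convs", size, channels))
        if size % 2 != 0:
            raise ValueError(f"size {size} is odd and cannot be pooled cleanly")
        size //= 2  # 2x2 max pooling with stride 2 halves the spatial size
        trace.append((f"level {level} pool", size, channels))
        channels *= 2  # the convs of the next level double the channels
    size -= 4  # two more unpadded 3x3 convs at the bottom of the "U"
    trace.append(("bottleneck convs", size, channels))
    return trace


for stage, size, channels in trace_encoder():
    print(f"{stage:18s} {size:4d} x {size:<4d} channels={channels}")
```

Under these assumptions, the feature map shrinks from 572x572 down to 28x28 at the bottom of the "U" while the channels grow from 64 to 1024.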

Every step in the decoder/expansive path consists of an upsampling of the feature map followed by a 2x2 convolution (“up-convolution”) that halves the number of feature channels.

UNet Architecture

Each decoder step then concatenates the upsampled feature map with the correspondingly cropped feature map from the contracting path and applies two 3x3 convolutions, each followed by a ReLU. The cropping is necessary because every unpadded convolution loses border pixels. At the final layer, a 1x1 convolution maps each 64-component feature vector to the desired number of classes. In total, the network has 23 convolutional layers.
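The decoder arithmetic and the count of 23 layers can be checked with a short sketch. The bottleneck size (28x28 with 1024 channels) and the encoder feature-map sizes used for the skip connections (64, 136, 280, 568) are assumptions that correspond to a 572x572 input in the original UNet paper. Both function names are hypothetical helpers.

```python
def trace_decoder(bottleneck_size=28, bottleneck_channels=1024,
                  skip_sizes=(64, 136, 280, 568)):
    """Trace a UNet-style decoder with unpadded convolutions.

    Returns one tuple per decoder step:
    (skip map size, pixels cropped per side, output size, output channels).
    The default values are assumptions based on the original UNet paper.
    """
    steps = []
    size, channels = bottleneck_size, bottleneck_channels
    for skip in skip_sizes:
        size *= 2        # the up-convolution doubles the spatial size
        channels //= 2   # and halves the number of feature channels
        crop = (skip - size) // 2  # border trimmed from each side of the skip map
        size -= 4        # two unpadded 3x3 convs after the concatenation
        steps.append((skip, crop, size, channels))
    return steps


def count_conv_layers(depth=4):
    """Count the convolutional layers of a UNet with `depth` down/up steps."""
    encoder = 2 * depth   # two 3x3 convs per encoder level
    bottleneck = 2        # two 3x3 convs at the bottom of the "U"
    decoder = 3 * depth   # one up-convolution plus two 3x3 convs per step
    final = 1             # the 1x1 convolution that maps to class scores
    return encoder + bottleneck + decoder + final


for skip, crop, size, channels in trace_decoder():
    print(f"skip {skip:3d} cropped by {crop:2d} per side -> "
          f"{size} x {size}, channels={channels}")
print("convolutional layers:", count_conv_layers())
```

With these assumed sizes, the last decoder step produces a 388x388 map with 64 channels, which is smaller than the 572x572 input because of the unpadded convolutions.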

3D UNet

3D UNet is an extension of the basic UNet framework that enables 3D volumetric segmentation. Its core architecture is the same as the basic UNet, except that all the 2D operations are replaced by their 3D counterparts. Basically:

  • 2D Conv to 3D Conv
  • MaxPool 2D to MaxPool 3D
  • 2D UpConv to 3D UpConv

This results in a 3-dimensional framework that can be used for 3D volumetric segmentation. It's widely used in volumetric CT and MR image segmentation, with applications including diagnosis of cardiac structures, bone structures, the vertebral column, brain tumors, liver tumors, lung nodules, nasopharyngeal cancer, multi-organ segmentation, head and neck organ-at-risk assessment, and white matter tract segmentation.
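As a rough illustration of the 2D-to-3D swap, the sketch below lists the layer substitutions and compares the parameter count of a single 3x3 convolution with its 3x3x3 counterpart. The layer names in `LAYER_SWAP` are the PyTorch `torch.nn` class names, used here only as strings and chosen by me as an example framework; `conv_params` is a hypothetical helper, and the 64-to-128 channel pair is an arbitrary example.

```python
# 2D layers and their 3D counterparts (PyTorch torch.nn class names,
# used purely as labels; the framework choice is an assumption).
LAYER_SWAP = {
    "Conv2d": "Conv3d",
    "MaxPool2d": "MaxPool3d",
    "ConvTranspose2d": "ConvTranspose3d",
}


def conv_params(in_channels, out_channels, kernel=3, dims=2):
    """Weights plus biases of one convolution with a cubic/square kernel."""
    weights = (kernel ** dims) * in_channels * out_channels
    biases = out_channels
    return weights + biases


for layer_2d, layer_3d in LAYER_SWAP.items():
    print(f"{layer_2d:16s} -> {layer_3d}")

params_2d = conv_params(64, 128, dims=2)
params_3d = conv_params(64, 128, dims=3)
print("3x3 conv, 64 -> 128 channels:  ", params_2d)
print("3x3x3 conv, 64 -> 128 channels:", params_3d)
```

Each 3x3x3 kernel holds 27 weights instead of 9, so a 3D convolution carries roughly three times the parameters of its 2D counterpart for the same channel counts, which is one reason 3D models tend to be heavier to train.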

In the next article of this series, I will go over Attention UNet.

References

Siddique, Nahian, et al. “U-Net and its variants for medical image segmentation: theory and applications.” arXiv preprint arXiv:2011.01118 (2020).

Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. “U-net: Convolutional networks for biomedical image segmentation.” International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015.
