Review: H-DenseUNet — 2D & 3D DenseUNet for Intra & Inter Slice Features (Biomedical Image Segmentation)
Outperforms U-Net, Ranked 1st in Lesion Segmentation
In this story, H-DenseUNet, by The Chinese University of Hong Kong (CUHK), is reviewed. H-DenseUNet, a hybrid densely connected U-Net,
- consists of a 2D DenseUNet for efficiently extracting intra-slice features and a 3D counterpart for hierarchically aggregating volumetric contexts under the spirit of the auto-context algorithm for liver and tumor segmentation.
- is formulated in an end-to-end manner, where the intra-slice representations and inter-slice features can be jointly optimized through a hybrid feature fusion (HFF) layer.
- ranked 1st in lesion segmentation and achieved very competitive performance on liver segmentation on the 2017 LiTS leaderboard, and also achieved state-of-the-art results on the 3DIRCADb dataset.
This is a paper in 2018 TMI (current impact factor 7.816) with more than 120 citations. (Sik-Ho Tsang @ Medium)
- 1. Network Architecture & Details
- 2. Experimental Results
1.1. 2D DenseUNet
- The above figure (a) shows the pipeline of H-DenseUNet, which is elaborated below. H-DenseUNet brings the advantages of DenseNet and U-Net together.
- Each input volume I has size 224×224×12×1, i.e. 224×224 in-plane with 12 slices and 1 channel. With batch size n, the input is n×224×224×12×1.
- The volumetric data I is transformed into stacks of three adjacent slices I2d: every three adjacent slices along the z-axis are stacked together, as shown in figure (b) above.
- Each I2d sample is thus 224×224×3; with batch size n, I2d is 12n×224×224×3.
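The slice stacking above can be sketched in numpy. This is a minimal illustration, not the paper's code; in particular, how the boundary slices are padded is an assumption (edge replication is used here), since the paper does not specify it.

```python
import numpy as np

def volume_to_adjacent_slices(vol):
    """Stack every 3 adjacent z-slices of an (n, H, W, D, 1) volume
    into 2D inputs of shape (n*D, H, W, 3).
    Boundary slices are replicated (an assumption; the padding scheme
    is not stated in the paper)."""
    n, H, W, D, _ = vol.shape
    v = vol[..., 0]                                                  # (n, H, W, D)
    padded = np.concatenate([v[..., :1], v, v[..., -1:]], axis=-1)   # pad along z
    # For each z position d, take slices d-1, d, d+1 as 3 "channels".
    stacks = np.stack([padded[..., d:d + 3] for d in range(D)], axis=1)
    return stacks.reshape(n * D, H, W, 3)                            # (n*D, H, W, 3)

I2d = volume_to_adjacent_slices(np.zeros((2, 224, 224, 12, 1), dtype=np.float32))
print(I2d.shape)  # (24, 224, 224, 3)
```

With n = 2 and D = 12 slices, the 2D network thus sees 12n = 24 three-channel images per batch.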
- I2d is the input to the 2D DenseUNet. 2D DenseUNet-167, which has 167 layers, is used here, as shown in figure (c) above.
- A dense block is a cascade of several micro-blocks in which all layers are directly connected.
- To change the size of the feature maps, a transition layer is employed, consisting of a batch normalization layer and a 1×1 convolution layer followed by an average pooling layer.
- The upsampling layer is implemented by bilinear interpolation, followed by summation with the low-level features (i.e., the UNet skip connections) and a 3×3 convolutional layer.
- Long skip connections are used.
- The output X2d is 12n×224×224×64.
- Training this 2D part of H-DenseUNet takes 21 hours.
1.2. 3D DenseUNet
- The output feature maps X2d and the score maps from the 2D DenseUNet are transformed back into the volumetric shape X2d’ of n×224×224×12×64, i.e. 224×224 with 12 slices and 64 channels.
- This volume is the input to the 3D DenseUNet; 3D DenseUNet-65 is used.
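The inverse transform, from 2D feature maps back to a volume, is just a reshape plus an axis permutation. A minimal numpy sketch (toy helper name, not from the paper):

```python
import numpy as np

def slices_to_volume(x2d, n, depth):
    """Reshape 2D feature maps (n*D, H, W, C) back into the
    volumetric form (n, H, W, D, C) expected by the 3D DenseUNet."""
    nd, H, W, C = x2d.shape
    assert nd == n * depth
    x = x2d.reshape(n, depth, H, W, C)       # recover the batch and z axes
    return np.transpose(x, (0, 2, 3, 1, 4))  # (n, H, W, D, C)

X2d = np.zeros((24, 224, 224, 64), dtype=np.float32)
print(slices_to_volume(X2d, n=2, depth=12).shape)  # (2, 224, 224, 12, 64)
```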
1.3. Hybrid Feature Fusion (HFF)
- X3d, the feature volume from the "upsampling layer 5" of 3D DenseUNet-65, and X2d’ (from Section 1.2) are added together to form Z:
- i.e. Z = X3d + X2d’
- Z denotes the hybrid feature, the sum of the intra-slice and inter-slice features from the 2D and 3D networks.
- This hybrid feature is jointly learned and optimized in the HFF layer.
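The fusion itself is an element-wise sum of two tensors of identical shape, which the HFF layer then refines with further convolutions. A sketch with toy sizes (the paper's tensors are n×224×224×12×64):

```python
import numpy as np

# HFF sketch: the hybrid feature Z is the element-wise sum of the
# inter-slice features X3d and the reshaped intra-slice features X2d'.
# Toy sizes here to keep the example small.
rng = np.random.default_rng(0)
X3d   = rng.standard_normal((1, 8, 8, 12, 64)).astype(np.float32)
X2d_p = rng.standard_normal((1, 8, 8, 12, 64)).astype(np.float32)
Z = X3d + X2d_p   # subsequently refined by convolutions in the HFF layer
print(Z.shape)    # (1, 8, 8, 12, 64)
```

Because the sum requires matching shapes, the 2D features must first be reshaped back to volumetric form, which is exactly why X2d’ is constructed.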
- This 3D counterpart of H-DenseUNet takes only 9 hours to converge, significantly faster than training a 3D network on the original data alone (63 hours).
- Detailed architecture is as follows:
- A weighted cross-entropy function is used as the loss function:
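A hedged numpy sketch of per-pixel weighted cross-entropy is below. The class weights and their values are illustrative assumptions; the paper weights classes to counter the imbalance between background, liver, and lesion voxels.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted cross-entropy over N pixels.
    probs: (N, K) softmax outputs; labels: (N,) integer class ids;
    class_weights: (K,) per-class weights (illustrative values only)."""
    eps = 1e-12                                     # numerical stability
    w = class_weights[labels]                       # per-pixel weight
    p = probs[np.arange(len(labels)), labels]       # prob of the true class
    return -np.mean(w * np.log(p + eps))

probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
print(weighted_cross_entropy(probs, labels, np.array([1.0, 2.0, 2.0])))
```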
- First, the 2D DenseUNet is optimized.
- Then, the parameters of the 2D DenseUNet are fixed, and only the 3D DenseUNet and the HFF layer are optimized.
- Finally, the whole network is jointly fine-tuned with the following combined loss:
- To avoid holes in the liver, largest-connected-component labeling is performed to refine the liver result.
- After that, the final lesion segmentation result is obtained by removing lesions outside the final liver region.
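The two post-processing steps above can be sketched with `scipy.ndimage`. This is a minimal illustration of the idea, not the authors' implementation; the function name and 2D toy masks are assumptions (the actual masks are 3D).

```python
import numpy as np
from scipy import ndimage

def refine(liver_mask, lesion_mask):
    """Keep only the largest connected component of the liver mask,
    then discard lesion voxels falling outside the refined liver."""
    labeled, num = ndimage.label(liver_mask)
    if num == 0:
        return liver_mask, lesion_mask
    sizes = ndimage.sum(liver_mask, labeled, range(1, num + 1))  # component sizes
    largest = labeled == (np.argmax(sizes) + 1)                  # biggest component
    return largest, lesion_mask & largest

# Toy example: two liver components (16 and 4 pixels) and two lesions.
liver = np.zeros((8, 8), dtype=bool); liver[:4, :4] = True; liver[6:, 6:] = True
lesion = np.zeros((8, 8), dtype=bool); lesion[1, 1] = True; lesion[7, 7] = True
liver_r, lesion_r = refine(liver, lesion)
print(liver_r.sum(), lesion_r.sum())  # 16 1
```

The small spurious liver component is dropped, and the lesion inside it is removed with it.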
- In the test phase, the total processing time for one subject depends on the number of slices, ranging from 30 to 200 seconds.
2. Experimental Results
2.1. Ablation Study
- H-DenseUNet performs the best among all ablation variants.
2.2. 2017 LiTS Challenge
- There were more than 50 submissions to the 2017 ISBI and MICCAI LiTS challenges.
- H-DenseUNet achieved 1st place among all state-of-the-art methods in lesion segmentation and a very competitive result compared with DeepX for liver segmentation.
- Note that H-DenseUNet surpassed DeepX by a significant margin in the Dice-per-case evaluation for lesions.
- Also, H-DenseUNet uses only one model, while DeepX uses a multi-model combination strategy.
2.3. 3DIRCADb Dataset
- H-DenseUNet outperforms U-Net by 14.5% in Dice for tumor segmentation.
[2018 TMI] [H-DenseUNet]
H-DenseUNet: Hybrid Densely Connected UNet for Liver and Tumor Segmentation from CT Volumes
My Previous Reviews
Image Classification [LeNet] [AlexNet] [Maxout] [NIN] [ZFNet] [VGGNet] [Highway] [SPPNet] [PReLU-Net] [STN] [DeepImage] [SqueezeNet] [GoogLeNet / Inception-v1] [BN-Inception / Inception-v2] [Inception-v3] [Inception-v4] [Xception] [MobileNetV1] [ResNet] [Pre-Activation ResNet] [RiR] [RoR] [Stochastic Depth] [WRN] [ResNet-38] [Shake-Shake] [FractalNet] [Trimps-Soushen] [PolyNet] [ResNeXt] [DenseNet] [PyramidNet] [DRN] [DPN] [Residual Attention Network] [DMRNet / DFN-MR] [IGCNet / IGCV1] [MSDNet] [ShuffleNet V1] [SENet] [NASNet] [MobileNetV2]
Object Detection [OverFeat] [R-CNN] [Fast R-CNN] [Faster R-CNN] [MR-CNN & S-CNN] [DeepID-Net] [CRAFT] [R-FCN] [ION] [MultiPathNet] [NoC] [Hikvision] [GBD-Net / GBD-v1 & GBD-v2] [G-RMI] [TDM] [SSD] [DSSD] [YOLOv1] [YOLOv2 / YOLO9000] [YOLOv3] [FPN] [RetinaNet] [DCN]
Semantic Segmentation [FCN] [DeconvNet] [DeepLabv1 & DeepLabv2] [CRF-RNN] [SegNet] [ParseNet] [DilatedNet] [DRN] [RefineNet] [GCN] [PSPNet] [DeepLabv3] [ResNet-38] [ResNet-DUC-HDC] [LC] [FC-DenseNet] [IDW-CNN] [DIS] [SDN] [DeepLabv3+]
Biomedical Image Segmentation [CUMedVision1] [CUMedVision2 / DCAN] [U-Net] [CFS-FCN] [U-Net+ResNet] [MultiChannel] [V-Net] [3D U-Net] [M²FCN] [SA] [QSA+QNT] [3D U-Net+ResNet] [Cascaded 3D U-Net] [Attention U-Net] [RU-Net & R2U-Net] [VoxResNet] [DenseVoxNet] [UNet++] [H-DenseUNet]
Generative Adversarial Network [GAN]