Overview of GAN Use in the Medical Field

Aysen Çeliktaş
9 min read · Feb 1, 2024


GANs (Generative Adversarial Networks) can form the backbone of studies that provide clinical benefit in medical imaging: data augmentation, combining different modalities, image reconstruction, segmentation, and automatic visualization. In my previous articles, “Supervised Machine Learning for Beginners”, “Unsupervised Machine Learning for Beginners”, and “Deep Learning for Beginners”, I briefly talked about machine learning. Here, I want to touch on GANs, a deep learning architecture.

[from Canva]

Medical imaging is essential for the diagnosis and detection of diseases, and artificial intelligence studies on these images can drive advances in medicine. However, artificial intelligence studies depend on large datasets. Supervised learning methods in particular need a training set containing a large number of labeled examples: the larger the training set, the more the system learns and the more accurate its decisions become [1]. GANs and GAN-based networks are growing in popularity as a way to address the shortage of large medical image datasets. Synthetic medical images produced with these networks enable data augmentation, which can increase the reliability of diagnosis and treatment processes [2].

In addition, each type of medical image offers its own advantages during diagnosis and treatment. Hard tissues are easier to observe in CT images, while soft tissues are easier to observe in MR images. In fact, for each medical imaging device, the images obtained from its different sequences have their own advantages and disadvantages. To give an example from MR imaging: Inversion Recovery (IR) sequences provide better T1-weighted contrast than Spin Echo sequences. Because different features are suppressed in images acquired with different techniques, examining multi-modal images together is a frequently used approach for assessing risky structures; the different images complement one another where each is lacking. However, acquiring every modality from every patient may not be feasible in terms of either time or cost. For this reason, GAN-based network structures are being studied that combine different modalities or transform one modality into another [3].

Good image quality in medical images is important so that subtle, high-risk structures are not overlooked, but improving image quality may require higher radiation doses on some devices (such as CT scanners), which can lead to conditions such as cataracts or cancer in patients. Additionally, regardless of radiation dose, patient motion is the primary source of noise in modalities such as MRI; the RF emissions associated with these movements contribute to noise in MRI images [4].

Generative Adversarial Networks are used for expanding image datasets, obtaining high-resolution images through image reconstruction, transferring textures and patterns from one image to another, synthesizing multiple modalities, and segmentation. Some studies conducted with GANs in the field of medical imaging are presented below.

GAN is an artificial intelligence algorithm belonging to the unsupervised machine learning class. It was introduced by Ian Goodfellow and colleagues in 2014 [5]. A GAN is fundamentally built from two structures called the generator and the discriminator. The generator creates meaningful data from meaningless input (random noise), and the images it produces are sent to the discriminator. The discriminator is trained on the real images the system is meant to produce, while also being fed the generator's outputs; it scores each incoming image with a value between 0 and 1 according to whether it looks real. This score provides feedback to the generator, so the system operates as a feedback loop between the two networks. The principle is visualized in Figure 1, and a minimal training-loop sketch follows the figure. Advantages of GANs include not requiring a Markov chain and the flexibility of the model; a disadvantage is that if the generator and discriminator are not well synchronized, training may collapse.

Fig.1. General working principle of GAN [Figure on “A Beginner’s Guide to Generative Adversarial Networks” [6]]
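To make the feedback loop concrete, here is a minimal, self-contained training sketch in PyTorch. It is an illustrative toy with fully connected networks and made-up sizes (a 64-dimensional latent, flattened 28×28 images), not a model from any of the cited papers.

```python
# Minimal GAN training loop in PyTorch. Illustrative toy, not a cited architecture.
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # e.g. flattened 28x28 grayscale images

# Generator: meaningless noise in, candidate data out.
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
# Discriminator: outputs a value between 0 and 1 ("fake" to "real").
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_batch):
    n = real_batch.size(0)
    real_labels, fake_labels = torch.ones(n, 1), torch.zeros(n, 1)

    # 1) Discriminator step: push real samples toward 1, generated toward 0.
    fake = G(torch.randn(n, latent_dim)).detach()  # detach: do not update G here
    loss_d = bce(D(real_batch), real_labels) + bce(D(fake), fake_labels)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Generator step: use D's score as feedback and try to be labeled "real".
    loss_g = bce(D(G(torch.randn(n, latent_dim))), real_labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

# Dummy batch standing in for real training images scaled to [-1, 1]:
print(train_step(torch.rand(32, data_dim) * 2 - 1))
```

If the discriminator step runs far ahead of the generator step (or vice versa), the losses stop carrying useful feedback; this is the synchronization problem mentioned above.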

In the 2014 study by Goodfellow and colleagues, a series of datasets including MNIST, the Toronto Face Database (TFD), and CIFAR-10 were used. The generator networks used rectifier linear and sigmoid activations, and a Gaussian Parzen window was fit to the generated samples, with the log-probability of the test data reported under this distribution. The discriminator network used maxout activations, and dropout was applied during training. A resulting example from this study is shown in Figure 2 [5].

Fig.2. CIFAR-10 (convolutional discriminator and “deconvolutional” generator). Visualization of samples from the model. The rightmost column shows the nearest training example to a given test example to demonstrate that the model has not memorized the training set. [Figure on “Generative adversarial nets.” [5]].
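The Gaussian Parzen-window evaluation can be sketched in a few lines of NumPy: an isotropic Gaussian kernel is centered on each generated sample, and the mean log-probability of held-out test points is reported under that density. The sample counts and the bandwidth sigma below are assumptions; in the paper, sigma was chosen by cross-validation, and the authors note the estimate is noisy in high dimensions.

```python
# Sketch of a Gaussian Parzen-window log-likelihood estimate over GAN samples.
import numpy as np
from scipy.special import logsumexp

def parzen_log_likelihood(generated, test, sigma):
    n, d = generated.shape
    # Pairwise squared distances between test points and generated samples, (m, n).
    sq = ((test ** 2).sum(1)[:, None] + (generated ** 2).sum(1)[None, :]
          - 2.0 * test @ generated.T)
    # log p(x) = logsumexp_i log N(x; g_i, sigma^2 I) - log n
    log_kernel = -sq / (2 * sigma ** 2) - 0.5 * d * np.log(2 * np.pi * sigma ** 2)
    return float((logsumexp(log_kernel, axis=1) - np.log(n)).mean())

rng = np.random.default_rng(0)
generated = rng.normal(size=(1000, 784))  # stand-in for generator outputs
test = rng.normal(size=(100, 784))        # stand-in for held-out test images
print(parzen_log_likelihood(generated, test, sigma=1.0))
```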

Increasing Data Set by Generating Synthetic Medical Images

GAN-based synthetic brain PET image generation [2]: Alzheimer's disease causes degeneration of brain cells. This study generates synthetic PET images, thereby expanding the datasets available to artificial intelligence systems that could assist in the early diagnosis of the disease.

Fig.3. “Real and synthetic brain PET images of MCI patient: (a) real (b) synthetic” [Figure on “GAN-based synthetic brain PET image generation.” [2]]
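For intuition on how such synthetic images are produced, below is a generic DCGAN-style generator that upsamples a noise vector into an image with transposed convolutions. This is a hedged sketch of the general technique, not the exact network of [2]; the latent size, channel counts, and 32×32 output are assumptions.

```python
# Generic DCGAN-style generator: noise vector -> image via transposed convolutions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, latent_dim=100, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 256, 4, 1, 0),  # 1x1 -> 4x4
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),         # 4x4 -> 8x8
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1),          # 8x8 -> 16x16
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, channels, 4, 2, 1),     # 16x16 -> 32x32
            nn.Tanh(),                                     # intensities in [-1, 1]
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

fake_pet = Generator()(torch.randn(8, 100))  # 8 synthetic single-channel images
print(fake_pet.shape)  # torch.Size([8, 1, 32, 32])
```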

A new VAE-GAN model to synthesize arterial spin labeling images from structural MRI (VAE GAN) [7]: Arterial spin labeling (ASL) requires no exogenous tracers and involves no radiation. This imaging technique is therefore well suited to the diagnosis of dementia-related diseases. In this study, a variational auto-encoder (VAE) was combined with a GAN architecture to synthesize ASL images.

Fig.4. “The visualization of examples of (a) synthesized ASL images obtained by new VAE GAN (b) synthesized ASL images obtained by CycleGAN (c) synthesized ASL images obtained by LSGAN (d) synthesized ASL images obtained by WGAN-GP” [Figure on “A new VAE-GAN model to synthesize arterial spin labeling images from structural MRI.” [7]]
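A hedged sketch of how a VAE-GAN objective is typically composed: a VAE reconstruction term, a KL regularizer on the latent code, and a GAN adversarial term on the decoded image. The loss weights and tensor shapes below are illustrative, not taken from [7].

```python
# Typical VAE-GAN loss composition: reconstruction + KL + adversarial terms.
import torch
import torch.nn.functional as F

def vae_gan_losses(x, x_recon, mu, logvar, d_real, d_fake):
    recon = F.l1_loss(x_recon, x)  # how closely the synthesized image matches
    # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian encoder
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # Non-saturating GAN terms; d_* are discriminator probabilities in (0, 1)
    adv_g = -torch.log(d_fake + 1e-8).mean()
    adv_d = -(torch.log(d_real + 1e-8) + torch.log(1 - d_fake + 1e-8)).mean()
    return recon + 0.1 * kl + adv_g, adv_d  # (encoder/generator loss, discriminator loss)

x, x_recon = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
mu, logvar = torch.zeros(4, 32), torch.zeros(4, 32)
g_loss, d_loss = vae_gan_losses(x, x_recon, mu, logvar, torch.rand(4, 1), torch.rand(4, 1))
```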

Semi-supervised GAN-based Radiomics Model for Data Augmentation in Breast Ultrasound Mass Classification (Semi-supervised GAN) [8]: The aim of this study is to develop a radiomics model based on a semi-supervised GAN to perform data augmentation on breast ultrasound images. In the architecture studied, a TGAN was first trained on the dataset; the resulting synthetic data was then combined with the real data and used to train a CNN for breast cancer classification. The results showed that the semi-supervised GAN method adopting the TGAN architecture can generate high-quality breast ultrasound mass images that augment the training dataset.

Fig.5. “(a) benign and (b) malignant breast masses obtained by this method” [Figure on “Semi-supervised GAN-based radiomics model for data augmentation in breast ultrasound mass classification.” [8]]
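The augmentation step itself is simple to express in code: synthetic images are pooled with real ones before the classifier is trained. The sizes and random tensors below are placeholders standing in for the breast ultrasound data of [8].

```python
# Pool GAN-generated images with real ones before training the CNN classifier.
import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

real_imgs, real_labels = torch.rand(200, 1, 64, 64), torch.randint(0, 2, (200,))
synth_imgs, synth_labels = torch.rand(400, 1, 64, 64), torch.randint(0, 2, (400,))

augmented = ConcatDataset([TensorDataset(real_imgs, real_labels),
                           TensorDataset(synth_imgs, synth_labels)])
loader = DataLoader(augmented, batch_size=32, shuffle=True)  # feed this to the CNN
print(len(augmented))  # 600 training examples after augmentation
```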

Studies on Different Modalities Combinations or Transformations

DiCyc: GAN-based deformation invariant cross-domain information fusion for medical image synthesis (DiCyc) [9]: Well-aligned datasets are required to map the geometric correspondence between CT and MR. The cycle-consistent generative adversarial network (CycleGAN), one of the popular GAN models in medical image synthesis, is commonly used to synthesize medical images across modalities; however, it does not handle misaligned images well. This paper investigates a deformation-invariant cycle consistency model (DiCyc) to ensure that deformations in medical images are handled properly and no information is lost when synthesizing into other modalities.

Fig.6. “By applying an arbitrary deformation to the T2-weighted images, a synthesized proton density (PD)-weighted image with source alignment was created.” [Figure on “DiCyc: GAN-based deformation invariant cross-domain information fusion for medical image synthesis.” [9]]
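The cycle-consistency idea that the CycleGAN family (including DiCyc) builds on can be stated compactly: translating an image to the other modality and back should reproduce the input. The sketch below uses trivial placeholder generators; DiCyc additionally adds a deformation-invariant term not shown here.

```python
# Cycle-consistency constraint: A -> B -> A and B -> A -> B should be identities.
import torch
import torch.nn as nn

G_ab = nn.Conv2d(1, 1, 3, padding=1)  # placeholder "CT -> MR" generator
G_ba = nn.Conv2d(1, 1, 3, padding=1)  # placeholder "MR -> CT" generator

def cycle_loss(real_a, real_b, lam=10.0):
    rec_a = G_ba(G_ab(real_a))  # forward cycle: A -> B -> A
    rec_b = G_ab(G_ba(real_b))  # backward cycle: B -> A -> B
    return lam * (nn.functional.l1_loss(rec_a, real_a)
                  + nn.functional.l1_loss(rec_b, real_b))

loss = cycle_loss(torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64))
print(loss.item())
```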

CT synthesis from MRI using multi-cycle GAN for head-and-neck radiation therapy (Multi-Cycle GAN with Z-Net) [10]: CT scans can harm patients through repeated exposure to additional ionizing radiation. MRI scans, by contrast, expose patients to no ionizing radiation, which makes MRI-guided radiation therapy (MRIgRT) safer for patients, and MRI is also more cost-effective as an imaging method. However, MRI cannot capture the patient's bone information, so synthesizing CT images from MRI images has become an important problem. In this study, CT images were synthesized from MRI images using a Multi-Cycle GAN architecture; the Z-Net generator and the Multi-Cycle GAN-based method performed better than CycleGAN.

Fig.7. “The details of the (a) maxillary sinus (b) skull” [Figure on “CT synthesis from MRI using multi-cycle GAN for head-and-neck radiation therapy.” [10]]
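A common way to sanity-check a synthesized CT against the real one is mean absolute error in Hounsfield units (HU) together with PSNR. This is generic evaluation code, not taken from [10]; the HU data range used for PSNR is an assumption.

```python
# Generic synthetic-CT evaluation: MAE in Hounsfield units and PSNR.
import numpy as np

def mae_hu(real_ct, synth_ct):
    return np.abs(real_ct - synth_ct).mean()

def psnr(real_ct, synth_ct, data_range=4000.0):  # e.g. an HU window of [-1000, 3000]
    mse = ((real_ct - synth_ct) ** 2).mean()
    return 10 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(1)
real = rng.uniform(-1000, 3000, size=(128, 128))   # stand-in for a real CT slice
synth = real + rng.normal(0, 50, size=real.shape)  # synthetic CT with ~50 HU error
print(f"MAE: {mae_hu(real, synth):.1f} HU, PSNR: {psnr(real, synth):.1f} dB")
```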

Study for Resolution Improvement

Boosting Magnetic Resonance Image Denoising With Generative Adversarial Networks (conditional GAN) [11]: This article proposes a method using conditional generative adversarial networks for MRI noise reduction. A generator based on a convolutional encoder-decoder network is used to suppress as much noise as possible. The proposed method was found to outperform existing methods in both noise reduction and the preservation of robust anatomical structures and defined contrast, and it is considered suitable for preserving sharp edges in MR images.

Fig.8. “Results of the proposed method on the real clinical images.” [Figure on “Boosting magnetic resonance image denoising with generative adversarial networks.” [11]]
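A hedged sketch of the conditional-GAN denoising setup: an encoder-decoder generator maps a noisy MR slice to a clean one, and the discriminator is conditioned on the noisy input. The tiny networks and the L1 weight below are placeholders, not the architecture of [11].

```python
# Conditional-GAN denoising sketch: generator maps noisy -> clean, D sees the pair.
import torch
import torch.nn as nn

G = nn.Sequential(  # minimal encoder-decoder: downsample once, then upsample back
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
)
D = nn.Sequential(  # sees (noisy, candidate-clean) stacked as two channels
    nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
)

noisy, clean = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
denoised = G(noisy)
# Generator objective: fool D while staying close to the clean target;
# the L1 term is what favors sharp edges over blurry averages.
adv = -torch.log(D(torch.cat([noisy, denoised], dim=1)) + 1e-8).mean()
loss_g = adv + 100.0 * nn.functional.l1_loss(denoised, clean)
```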

Study for Segmentation

Automatic Multi-Organ Segmentation in Thorax CT Images Using U-Net-GAN (U-Net GAN) [12]: Atlas-based methods are the most commonly used approach to automatic segmentation, but an atlas may not always produce accurate results because patient anatomy varies. The aim of this study is to develop a deep learning-based method that automatically segments multiple thoracic organs at risk (OARs) in chest computed tomography (CT) for radiotherapy treatment planning. Using a U-Net as the generator and an FCN as the discriminator, the method achieved superior segmentation accuracy compared to existing methods. According to the authors, it is the first automatic thoracic CT segmentation method to use a GAN.

Fig.9. “(a) manual shaping (b) 3D visualization obtained by the proposed method” [Figure on “Automatic multiorgan segmentation in thorax CT images using U‐net‐GAN.” [12]]
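A hedged sketch of adversarial segmentation training in the spirit of U-Net-GAN: the segmentation network is trained with a Dice loss plus an adversarial term from a discriminator that scores predicted masks. The Dice implementation is generic, and nothing here reproduces the exact networks of [12].

```python
# Adversarial segmentation sketch: Dice loss + generator-side adversarial term.
import torch
import torch.nn.functional as F

def dice_loss(pred, target, eps=1e-6):
    # pred, target: (batch, classes, H, W); pred holds softmax probabilities
    inter = (pred * target).sum(dim=(2, 3))
    union = pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    return 1 - ((2 * inter + eps) / (union + eps)).mean()

pred = torch.softmax(torch.randn(2, 5, 64, 64), dim=1)  # e.g. 5 thoracic OAR classes
target = F.one_hot(torch.randint(0, 5, (2, 64, 64)), 5).permute(0, 3, 1, 2).float()
d_score = torch.rand(2, 1)  # stand-in discriminator output for the predicted masks
seg_loss = dice_loss(pred, target) - 0.01 * torch.log(d_score + 1e-8).mean()
print(seg_loss.item())
```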

Study for Volume Rendering

A Generative Model for Volume Rendering [13]: Volume rendering is a visualization technique that creates images from three-dimensional scalar fields. In this study, a GAN-based deep learning system was used to synthesize volume-rendered images from specified rendering parameters. The GAN is trained on a large collection of rendered images of a single volumetric dataset and, given a transfer function (TF, specifying both color and opacity), synthesizes new renderings at a resolution of 256×256 pixels.

Fig.10. “We show qualitative results comparing synthesized images to ground truth volume renderings produced without illumination. The bottom row shows typical artifacts, such as incorrect color mapping and lack of detail preservation.” [Figure on “A generative model for volume rendering.” [13]]
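For intuition, here is a hedged sketch of the conditioning idea: a generator receives rendering parameters (viewpoint angles plus an encoded color/opacity transfer function) and directly synthesizes a 256×256 rendering. The layer sizes and the flat TF encoding are illustrative assumptions, not the architecture of [13].

```python
# Parameter-conditioned generator sketch: (viewpoint, TF encoding) -> 256x256 image.
import torch
import torch.nn as nn

view_dim, tf_dim = 3, 256  # e.g. azimuth/elevation/roll + a discretized RGBA TF

class RenderGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(view_dim + tf_dim, 128 * 16 * 16)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),   # 32 -> 64
            nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(),   # 64 -> 128
            nn.ConvTranspose2d(16, 8, 4, 2, 1), nn.ReLU(),    # 128 -> 256
            nn.Conv2d(8, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, view, tf):
        h = self.fc(torch.cat([view, tf], dim=1)).view(-1, 128, 16, 16)
        return self.up(h)

img = RenderGenerator()(torch.rand(1, view_dim), torch.rand(1, tf_dim))
print(img.shape)  # torch.Size([1, 3, 256, 256])
```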

References

[1] Cunniff C, Byrne JL, Hudgins LM, Moeschler JB, et al. “Informed consent for medical photographs.” Genetics in Medicine 2.6 (2000): 353–355.

[2] Islam J, Zhang Y. “GAN-based synthetic brain PET image generation.” Brain informatics 7 (2020): 1–12.

[3] Cao B, Zhang H, Wang N, Gao X, Shen D. “Auto-GAN: self-supervised collaborative learning for medical image synthesis.” Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34, no. 7. 2020.

[4] Xiao Y, Peters KR, Fox WC, Rees JH, Rajderkar DA, Arreola MM, et al. “Transfer-gan: multimodal CT image super-resolution via transfer generative adversarial networks.” 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI). IEEE, 2020.

[5] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. “Generative adversarial nets.” Advances in neural information processing systems 27 (2014).

[6] DevHunter. A Beginner’s Guide to Generative Adversarial Networks (GAN) [cited 2024 Jan 31]. Available from: https://devhunteryz.wordpress.com/2018/09/01/uretken-dusman-aglarigan-icin-yeni-baslayanlar-kilavuzu/

[7] Li F, Huang W, Luo M, Zhang P, Zha Y. “A new VAE-GAN model to synthesize arterial spin labeling images from structural MRI.” Displays 70 (2021): 102079.

[8] Pang T, Wong JHD, Ng WL, Chan CS. “Semi-supervised GAN-based radiomics model for data augmentation in breast ultrasound mass classification.” Computer Methods and Programs in Biomedicine 203 (2021): 106018.

[9] Wang C, Yang G, Papanastasiou G, Tsaftaris SA, Newby DE, Gray C, MacGillivray TJ, et al. “DiCyc: GAN-based deformation invariant cross-domain information fusion for medical image synthesis.” Information Fusion 67 (2021): 147–160.

[10] Liu Y, Chen A, Shi H, Huang S, Zheng W, Liu Z, Yang X, et al. “CT synthesis from MRI using multi-cycle GAN for head-and-neck radiation therapy.” Computerized medical imaging and graphics 91 (2021): 101953.

[11] Tian M, Song K. “Boosting magnetic resonance image denoising with generative adversarial networks.” IEEE Access 9 (2021): 62266–62275.

[12] Dong X, Lei Y, Wang T, Thomas M, Tang L, Curran WJ, Yang X. “Automatic multiorgan segmentation in thorax CT images using U‐net‐GAN.” Medical physics 46.5 (2019): 2157–2168.

[13] Berger M, Li J, Levine JA. “A generative model for volume rendering.” IEEE transactions on visualization and computer graphics 25.4 (2018): 1636–1650.
