Paper review - AsynDGAN: Train Deep Learning Without Sharing Medical Image Data

yw_nam

Published in Analytics Vidhya

5 min read · Jul 28, 2020

All figures and tables come from the paper, unless marked as coming from another paper or website.

Content

Abstract
Method
Result and Experiments
My Opinion

1. Abstract

Fig 1. Comparison of brain tumor images synthesized by AsynDGAN with real images

Fig 2. Comparison of nuclei images synthesized by AsynDGAN with real images

This paper was accepted at CVPR 2020. Because medical data is tied to patient privacy, it often cannot be shared with others. As a result, publicly available medical datasets are small, and the authors argue that data-hungry models such as deep learning are difficult to apply to medical data.

The authors argue that training a central generator G against distributed discriminators, and then using the images created by G, enables learning without sharing data between hospitals and so addresses the privacy problem.

They also list the following claims about the proposed model:

Our experiments show that our approach can learn the real images' distribution from multiple datasets without sharing the patients' raw data
Our experiments show that our approach is more efficient and requires lower bandwidth than other distributed deep learning methods
Our experiments show that our approach achieves higher performance than a model trained on one real dataset, and almost the same performance as a model trained on all real datasets
Our approach has provable guarantees that the generator can learn the distributed distribution

The code is available here

2. Method

As shown in Fig. 2, the central generator G receives a task-specific input x (a segmentation mask in this paper). G creates a synthetic image G(x) to fool the local discriminators D_1, D_2, …, D_n. Each D_j has to discriminate between the synthetic data G(x) and its own real data. Between G and D, only gradients and synthetic images are transferred. Therefore, the authors argue that data privacy is not violated, because each local medical entity accesses only its own real data.

Objective of AsynDGAN

Eq 1. The objective of a classical conditional GAN
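For reference, the standard conditional GAN objective can be written as below. The notation is an assumption on my part and may differ from the paper's: x is the auxiliary variable (for example a mask), y is a real image, ŷ = G(x) is a synthetic image, s(x) is the distribution of x, and p_data(y|x) and q(ŷ|x) are the real and generated conditional distributions.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim s(x)} \, \mathbb{E}_{y \sim p_{\text{data}}(y \mid x)}
    \left[ \log D(y, x) \right]
  + \mathbb{E}_{x \sim s(x)} \, \mathbb{E}_{\hat{y} \sim q(\hat{y} \mid x)}
    \left[ \log \left( 1 - D(\hat{y}, x) \right) \right]
```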

In AsynDGAN, G is supervised by N different discriminators. Each discriminator is associated with one subset of the dataset. Therefore, s(x), the distribution of the auxiliary variable x, can be expressed as follows.

Eq 2. mixture distribution on auxiliary variable x
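One way to write such a mixture is sketched below. The symbols are assumed notation: π_j is the weight of the j-th subset and s_j(x) is the distribution of x held by the j-th discriminator's node.

```latex
s(x) = \sum_{j=1}^{N} \pi_j \, s_j(x),
\qquad \pi_j \ge 0, \quad \sum_{j=1}^{N} \pi_j = 1
```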

Therefore, the loss function can be written as follows.
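A hedged reconstruction of this objective: take the conditional GAN objective, replace s(x) by a mixture over N subsets with weights π_j, and give each subset its own discriminator D_j. The symbols π_j, s_j(x), p_j(y|x) (real conditional at node j) and q(ŷ|x) (generated conditional) are assumed notation, so check the paper for the exact form.

```latex
\min_G \max_{D_1, \dots, D_N} V(D_{1:N}, G) =
  \sum_{j=1}^{N} \pi_j \Big\{
    \mathbb{E}_{x \sim s_j(x)} \, \mathbb{E}_{y \sim p_j(y \mid x)}
      \left[ \log D_j(y, x) \right]
    + \mathbb{E}_{x \sim s_j(x)} \, \mathbb{E}_{\hat{y} \sim q(\hat{y} \mid x)}
      \left[ \log \left( 1 - D_j(\hat{y}, x) \right) \right]
  \Big\}
```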

Optimization process

In Fig 3, the solid arrows show the forward pass, and the dotted arrows show the gradient flow during the backward pass of the iterative update procedure. A solid block is being updated, while dotted blocks are frozen during that update step. Red and blue rectangles are the source mask and the target real image, respectively.

The model is updated in two alternating steps.

D-update: Calculate the adversarial loss for the j-th discriminator D_j and update D_j, where j = 1, 2, …, N.
G-update: After all discriminators have been updated, update G using the adversarial loss as follows.

Eq 4. adversarial loss
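If the generator minimizes the "fake" term of a weighted sum of per-discriminator GAN objectives, its adversarial loss would take roughly the form below. This is my own reading, not a copy of the paper's equation: π_j (weight of subset j) and s_j(x) (distribution of the input x at node j) are assumed symbols.

```latex
\mathcal{L}_{\text{adv}}(G) =
  \sum_{j=1}^{N} \pi_j \,
  \mathbb{E}_{x \sim s_j(x)}
    \left[ \log \left( 1 - D_j(G(x), x) \right) \right]
```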

This can be described as follows.
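To make the alternating D-update and G-update concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the generator is a scalar linear map, each discriminator is a logistic model, and the names CentralGenerator, LocalNode and train are hypothetical. It also assumes that the conditioning input x may be sent to G, while the real y values stay inside each node.

```python
import math
import random

def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

class CentralGenerator:
    """Toy generator y_hat = w * x + b (stand-in for the image generator)."""
    def __init__(self, lr=0.05):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def generate(self, xs):
        return [self.w * x + self.b for x in xs]

    def step(self, grad_w, grad_b):
        # Gradient descent on the generator's adversarial loss.
        self.w -= self.lr * grad_w
        self.b -= self.lr * grad_b

class LocalNode:
    """Toy hospital: private (x, y) pairs plus a logistic D_j(y, x)."""
    def __init__(self, real_pairs, lr=0.05):
        self.real_pairs = real_pairs      # never leaves this object
        self.u, self.v, self.c, self.lr = 0.1, 0.0, 0.0, lr
        self.batch = []

    def draw_conditions(self, k, rng):
        # Assumption: only the conditions x are exposed to G, never y.
        self.batch = rng.sample(self.real_pairs, k)
        return [x for x, _ in self.batch]

    def prob_real(self, y, x):
        return sigmoid(self.u * y + self.v * x + self.c)

    def d_update(self, fakes):
        # Gradient ascent on log D(y, x) + log(1 - D(y_hat, x)).
        gu = gv = gc = 0.0
        for (x, y), y_hat in zip(self.batch, fakes):
            a = 1.0 - self.prob_real(y, x)   # d log(sigmoid(s)) / ds, real pair
            b = -self.prob_real(y_hat, x)    # d log(1 - sigmoid(s)) / ds, fake pair
            gu += a * y + b * y_hat
            gv += (a + b) * x
            gc += a + b
        n = len(self.batch)
        self.u += self.lr * gu / n
        self.v += self.lr * gv / n
        self.c += self.lr * gc / n

    def grads_wrt_fakes(self, fakes):
        # d log(1 - D(y_hat, x)) / d y_hat; only these values go back to G.
        return [-self.prob_real(y_hat, x) * self.u
                for (x, _), y_hat in zip(self.batch, fakes)]

def train(nodes, weights, steps=100, batch=8, seed=0):
    rng = random.Random(seed)
    gen = CentralGenerator()
    for _ in range(steps):
        for node in nodes:                       # D-update for j = 1..N
            xs = node.draw_conditions(batch, rng)
            node.d_update(gen.generate(xs))
        grad_w = grad_b = 0.0
        for node, pi in zip(nodes, weights):     # G-update after all D_j
            xs = node.draw_conditions(batch, rng)
            grads = node.grads_wrt_fakes(gen.generate(xs))
            # Chain rule through y_hat = w * x + b, weighted by pi.
            grad_w += pi * sum(g * x for g, x in zip(grads, xs)) / batch
            grad_b += pi * sum(grads) / batch
        gen.step(grad_w, grad_b)
    return gen
```

In this sketch the only values crossing the boundary between a node and G are the conditions x, the synthetic outputs, and the gradients with respect to those outputs. A real system would replace the scalar models with networks and the method calls with network transfers.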

3. Result and Experiments

Data.

The authors used three datasets: a synthetic dataset, BraTS2018, and Multi-Organ.
The synthetic dataset is created by mixing 3 one-dimensional Gaussians.
That is, y = ∑_j y_j · 1[x = j] for j ∈ {1, 2, 3}, where 1[x = j] indicates that the label x equals j, and y_1 ~ N(−3, 2), y_2 ~ N(1, 1), y_3 ~ N(3, 0.5).
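A small sketch of how such a dataset could be sampled and split, using only the Python standard library. Several things here are assumptions rather than facts from the paper: that x is drawn with equal probability, that the second parameter of each N(·, ·) is a standard deviation (it may be a variance), and that each local subset holds the samples of one component. The function names are hypothetical.

```python
import random

# (mean, spread) for y_1, y_2, y_3 as listed above. Treating the second
# number as a standard deviation is an assumption.
COMPONENTS = {1: (-3.0, 2.0), 2: (1.0, 1.0), 3: (3.0, 0.5)}

def sample_synthetic(n, seed=0):
    """Draw n (x, y) pairs: x picks a component, y comes from that Gaussian."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.choice([1, 2, 3])        # assumed: equal probability
        mean, std = COMPONENTS[x]
        pairs.append((x, rng.gauss(mean, std)))
    return pairs

def split_by_component(pairs):
    """Group y values by x, mimicking three local subsets."""
    subsets = {j: [] for j in COMPONENTS}
    for x, y in pairs:
        subsets[x].append(y)
    return subsets

if __name__ == "__main__":
    subsets = split_by_component(sample_synthetic(3000))
    for j, ys in subsets.items():
        print(j, len(ys), round(sum(ys) / len(ys), 2))
```

Under these assumptions, a GAN trained on a single subset sees only one of the three modes, while a GAN trained on the full list of pairs sees all of them.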

Experiment on synthetic dataset

Setting

Syn-All: Training a regular GAN using all samples in the dataset.
Syn-Subset-n: Training a regular GAN using only samples in local subset n, where n ∈ {1, 2, 3}.
AsynDGAN: Training AsynDGAN using samples in all subsets in a distributed fashion.

Result

Fig 5. Generated distributions of different methods

In Fig. 4, taking (a) as the baseline, the result in (c) looks better than the result in (b).

Experiment on Brain tumor segmentation, Nuclei segmentation

Setting

Real-All: Training using real images from the whole train set
Real-Subset-n: Training using real images from the n-th subset, where n = 1, 2, …, 10 for brain tumor segmentation and n ∈ {breast, liver, kidney, prostate} for nuclei segmentation
Syn-All: Training using synthetic images generated from a regular GAN. The GAN is trained directly using all real images
AsynDGAN: Training using synthetic images from proposed AsynDGAN

Result of Brain tumor segmentation

Fig 6. Typical brain tumor segmentation results.

Table 1. Brain tumor segmentation results.

Result of Nuclei segmentation

Result

As expected, Real-All shows the best performance. However, because of privacy issues, the full dataset usually cannot be pooled, so in reality a model can only reach the performance of Real-Subset-n.

AsynDGAN avoids the privacy issue and performs better than Real-Subset-n. It also shows results similar to Syn-All, which synthesizes images after training a regular GAN on all real data.

My Opinion

The idea of this paper gives us important insight into applying deep learning to medical data.

However, actually implementing the process of updating G from the distributed discriminators looks very challenging. In practice, the gradients and synthetic images have to be transferred between G and the discriminators hosted by several medical entities.
