Published in Vitalify Asia

GAN for unsupervised anomaly detection on X-ray images.

An attempt at using Generative Adversarial Networks to do more than just generate cool images.

Why anomaly detection on X-ray images

  • Abnormal medical cases are usually much rarer than normal cases, so medical datasets are usually heavily skewed toward normal cases (negative samples). Collecting a reasonable number of samples for every abnormal case is very time-consuming.
  • For supervised ML/DL approaches, the collected data also needs to be labeled by qualified physicians, so creating labels for medical datasets is very expensive.

About GANs

GAN-generated dog-ball. Source: BigGAN

Why GANs

Anomaly Detection strategy:

  1. Train the GAN to generate only normal X-ray images (negative samples).
  2. At prediction time, use the GAN to reconstruct the input images, both normal and abnormal (negative and positive samples).
  3. Compute the reconstruction, feature matching and discrimination losses.
  4. Discriminate between normal and abnormal cases using these statistics.
  • Reconstruction loss is the difference between the original and reconstructed images.
  • Feature matching loss is the difference between the encoded features of hidden layers in the Encoder and Discriminator.
  • Discrimination loss is simply the output of the Discriminator.
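The strategy above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the implementation used in this post: `encode`, `generate`, `disc_features` and `disc_score` are hypothetical callables standing in for the trained Encoder, Generator, a hidden layer of the Discriminator and the Discriminator output. The feature matching term is assumed here to compare hidden features of the original image with those of its reconstruction, which is one common reading of the definition above.

```python
import numpy as np

def anomaly_statistics(x, encode, generate, disc_features, disc_score):
    """Return the three per-image statistics for a batch x of shape (N, H, W)."""
    x = np.asarray(x, dtype=np.float64)
    n = x.shape[0]
    x_rec = generate(encode(x))  # x -> E(x) -> G(E(x))

    # Reconstruction loss: mean absolute pixel difference per image.
    rec = np.abs(x - x_rec).reshape(n, -1).mean(axis=1)

    # Feature matching loss (assumed form): mean absolute difference between
    # hidden features of the original and of the reconstruction.
    fm = np.abs(disc_features(x) - disc_features(x_rec)).reshape(n, -1).mean(axis=1)

    # Discrimination loss: the raw Discriminator output for the input image.
    disc = np.asarray(disc_score(x), dtype=np.float64).reshape(n)

    return {"reconstruction": rec, "feature_matching": fm, "discrimination": disc}

# Toy stand-ins (hypothetical): the "GAN" only knows constant-brightness images.
toy_encode = lambda x: x.reshape(len(x), -1).mean(axis=1)          # latent = mean brightness
toy_generate = lambda z: np.ones((len(z), 4, 4)) * z[:, None, None]  # constant image
toy_features = lambda x: x.reshape(len(x), -1)[:, :8]               # first 8 pixels as "features"
toy_score = lambda x: x.reshape(len(x), -1).mean(axis=1)            # placeholder output

normal = np.full((1, 4, 4), 0.5)
abnormal = normal.copy()
abnormal[0, 0, 0] = 1.0  # one bright pixel the toy model cannot reproduce

stats = anomaly_statistics(np.concatenate([normal, abnormal]),
                           toy_encode, toy_generate, toy_features, toy_score)
print(stats["reconstruction"])  # the abnormal image has the larger value
```

In this toy setup the normal image reconstructs exactly while the abnormal one does not, which is the behaviour the strategy relies on. With a real GAN the gap is far less clean.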


Vanilla GAN architecture. Source: Mihaela Rosca 2018
  • A Generator that outputs image samples from random latent vectors.
  • A Discriminator that classifies samples as real or fake.

Bi-directional GAN

Generator G, Encoder E and joint Discriminator D. Source: BiGAN


Encoder, Generator, Discriminator D and Code Discriminator C. Source: Mihaela Rosca 2018

Generative results:

  • The reconstructed samples are generated from the latent variables z of the input image: x -> E(x) -> G(E(x))
  • The generated samples are created from randomly sampled latent variables z: z -> G(z)
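To make the two paths concrete, here is a small sketch using toy stand-ins. The linear Generator, its pseudo-inverse Encoder and the standard normal prior over z are all assumptions made for illustration; none of them come from the BiGAN or Alpha-GAN models discussed here.

```python
import numpy as np

def reconstruct(x, encode, generate):
    """Reconstructed samples: x -> E(x) -> G(E(x))."""
    return generate(encode(x))

def sample(generate, latent_dim, n_samples, rng):
    """Generated samples: z -> G(z), with z drawn from a standard normal prior (assumed)."""
    z = rng.standard_normal((n_samples, latent_dim))
    return generate(z)

# Toy stand-ins: a linear Generator and its pseudo-inverse as the Encoder.
rng = np.random.default_rng(0)
W = rng.standard_normal((2, 16))               # latent_dim = 2 -> 16 "pixels"
toy_generate = lambda z: z @ W
toy_encode = lambda x: x @ np.linalg.pinv(W)

generated = sample(toy_generate, latent_dim=2, n_samples=5, rng=rng)
reconstructed = reconstruct(generated, toy_encode, toy_generate)
print(np.allclose(generated, reconstructed))   # samples the Generator can produce round-trip
```

Inputs that the toy Generator can produce are reconstructed exactly, while any other input is mapped to the nearest thing the Generator can express. That is the same intuition behind using reconstruction quality as an anomaly signal.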


BiGAN samples.
BiGAN generated samples from the same latent variables over time.
  • The reconstructed samples of BiGAN do not closely resemble the original input images on the left. This can be explained by the architecture of BiGAN, in which the Encoder and Generator do not interact with each other during training.
  • The generated samples also lack diversity in posture and brightness when compared with the original data distribution.


Alpha-GAN samples.
Alpha-GAN reconstructed samples over training epochs.

Discriminative results:


BiGAN statistics on validation set.


Alpha-GAN statistics on validation set.
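One way to quantify how well a single statistic separates normal from abnormal validation images is the area under the ROC curve. The sketch below assumes that a higher score means "more anomalous" and uses made-up scores purely for illustration. It is not necessarily the metric behind the figures above.

```python
import numpy as np

def roc_auc(scores_normal, scores_abnormal):
    """Probability that a random abnormal sample scores higher than a random
    normal one; ties count as 0.5. Equivalent to the area under the ROC curve."""
    neg = np.asarray(scores_normal, dtype=np.float64)[:, None]
    pos = np.asarray(scores_abnormal, dtype=np.float64)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return float(wins / (neg.size * pos.size))

# Made-up reconstruction losses, for illustration only.
print(roc_auc([0.1, 0.2], [0.3, 0.4]))  # perfectly separated scores
print(roc_auc([0.1, 0.4], [0.3, 0.2]))  # overlapping scores
```

A value near 1.0 means the statistic ranks abnormal images above normal ones almost always, while a value near 0.5 means it carries little discriminative information.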

Problems and challenges

  • Mode collapse: The GAN generates only a small subset of the possible data space. This happens when the Generator sacrifices diversity for quality.
Losses of the Encoder and Generator suddenly drop during training.
  • Discriminator assumption: In this recent work, the authors show that the same Discriminator implies different discriminative boundaries under different random initializations. While these random assumptions by the Discriminator are good enough for its original purpose of generating realistic images, they become a problem for other purposes such as detecting anomalies.
  • Sparse latent space: The input data can be reconstructed from the encoded latent vectors, but GANs still lack the incentive to learn a well-generalized latent space. This phenomenon can be observed in the randomly generated samples of Alpha-GAN.



