The #paperoftheweek is “Diversity-Sensitive Conditional Generative Adversarial Networks”
In Conditional GANs, the generator is fed not only a randomly sampled latent vector but also extra information. This extra information should drive the generator to produce samples that meet certain conditions. For example, a generated image should belong to a given class, or it should have the same edges as the edge map supplied as the conditioning input.
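As a minimal sketch of this conditioning, assuming the condition is a class label encoded as a one-hot vector and simply concatenated to the latent vector (one common choice, not necessarily the one used in the paper), the generator input could be built like this. The helper names `one_hot` and `generator_input` are hypothetical.

```python
import random


def one_hot(label, num_classes):
    """Encode a class label as a one-hot list of floats."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(label, num_classes, latent_dim, rng):
    """Concatenate a random latent vector with the one-hot condition.

    Assumption: conditioning by concatenation; real models may instead
    use embeddings or feed an image such as an edge map.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return z + one_hot(label, num_classes)


rng = random.Random(0)
g_in = generator_input(label=3, num_classes=10, latent_dim=8, rng=rng)
print(len(g_in))  # 8 latent entries followed by 10 condition entries
```

The first 8 entries change with every draw of the latent vector, while the last 10 stay fixed for a given class.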
Conditional GANs are good at generating images for different conditions. Within a single class, however, mode collapse remains a problem: the generator tends to produce nearly the same output regardless of the latent variable.
The authors of our weekly paper add a simple regularization term to the generator loss that promotes diversity and thereby prevents mode collapse. The regularization term is based on the idea that the distance between generated samples within the same class should correlate with the distance between their respective latent variables.
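The idea can be sketched in a few lines of plain Python. This is an illustrative sketch under our own assumptions, not the authors' implementation: it uses the L2 distance, rewards the ratio of output distance to latent distance, and clips that ratio at an upper bound. The names `tau` (the bound), `lam` (the regularization weight), and both toy generators are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def diversity_term(generator, condition, z1, z2, tau=10.0, eps=1e-8):
    """Ratio of output distance to latent distance, clipped at tau."""
    out1 = generator(condition, z1)
    out2 = generator(condition, z2)
    ratio = l2_distance(out1, out2) / (l2_distance(z1, z2) + eps)
    return min(ratio, tau)


def regularized_generator_loss(adv_loss, diversity, lam=1.0):
    """Subtract the diversity term so that more diversity lowers the loss."""
    return adv_loss - lam * diversity


def collapsed_generator(condition, z):
    """Toy generator that ignores the latent code (mode collapse)."""
    return list(condition)


def diverse_generator(condition, z):
    """Toy generator whose output moves with the latent code."""
    return [c + 2.0 * zi for c, zi in zip(condition, z)]


condition = [0.5, -0.5]
z1, z2 = [0.0, 0.0], [3.0, 4.0]
collapsed = diversity_term(collapsed_generator, condition, z1, z2)
diverse = diversity_term(diverse_generator, condition, z1, z2)
print(collapsed, diverse)
```

The collapsed generator gets no reward from the term, while the diverse one lowers its loss. Raising `lam` shifts the balance toward diversity, and the bound `tau` keeps the term from growing without limit.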
The authors successfully apply their new regularization term to three distinct conditional image generation problems: image-to-image translation, image inpainting, and future video prediction. This indicates that the approach generalizes across several applications. In each case, they demonstrate an improvement in the diversity of the multi-modal output compared to the baseline.
“We propose a simple yet highly effective method that addresses the mode-collapse problem in the Conditional Generative Adversarial Network (cGAN). Although conditional distributions are multi-modal (i.e., having many modes) in practice, most cGAN approaches tend to learn an overly simplified distribution where an input is always mapped to a single output regardless of variations in latent code. To address such issue, we propose to explicitly regularize the generator to produce diverse outputs depending on latent codes. The proposed regularization is simple, general, and can be easily integrated into most conditional GAN objectives. Additionally, explicit regularization on generator allows our method to control a balance between visual quality and diversity. We demonstrate the effectiveness of our method on three conditional generation tasks: image-to-image translation, image inpainting, and future video prediction. We show that simple addition of our regularization to existing models leads to surprisingly diverse generations, substantially outperforming the previous approaches for multi-modal conditional generation specifically designed in each individual task.”
You can read the full article here.
About the author:
Elias Vansteenkiste, Lead Research Scientist at Brighter AI.
About Brighter AI:
Brighter AI has developed an innovative privacy solution for visual data: Deep Natural Anonymization. The solution replaces personally identifiable information such as faces and license plates with artificial objects, thereby enabling all AI and analytics use cases, e.g. self-driving cars and smart retail. In 2018, NVIDIA named the German company “Europe’s Hottest AI Startup”.