WHAT IS MODE COLLAPSE IN GANS?

Miray TOPAL
Jul 23, 2023


In my first article about GANs, we explored how Generative Adversarial Networks (GANs) work and their potential for generating diverse data samples. We will now take a look at a critical issue in GANs known as “mode collapse” and examine its consequences and possible solutions.

Understanding GANs: A Brief Overview

Before we embark on our journey to explore mode collapse, let’s revisit the essence of GANs. A GAN consists of two main components: the generator and the discriminator. The generator takes random noise as input and attempts to produce data samples that resemble real data. Meanwhile, the discriminator’s objective is to distinguish between real samples drawn from the training dataset and fake samples produced by the generator.

In Generative Adversarial Networks (GANs), the loss function drives training: it pushes the generator to produce data realistic enough to deceive the discriminator. The loss consists of two components, the discriminator loss and the generator loss, and training is set up as a minimax game: the discriminator tries to maximize the objective (correctly separating real from fake), while the generator tries to minimize it (fooling the discriminator). The overall goal is a generator whose samples the discriminator can no longer tell apart from real data.
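
Written out, this is the standard minimax objective from the original GAN formulation (shown here in LaTeX notation):

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

Here D(x) is the discriminator’s estimate that x is real and G(z) is a sample generated from noise z; the discriminator tries to push this value up, while the generator tries to pull it down.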

Training Procedure:

Figure 1: GAN architecture (source)

During training, the discriminator and the generator are updated alternately over a series of mini-batch iterations. In each iteration, a batch of real data samples and a batch of fake samples (generated from random noise) are fed to the discriminator. The discriminator is first updated to minimize its loss, and then the generator is updated to minimize its own loss. Gradient descent is used to update the model parameters during training. (If you want to learn more, you can check out my first article.) Local minima, points where the loss is low only within a specific region of parameter space, can be challenging in GANs and may lead to issues like mode collapse.
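
To make the alternating updates concrete, here is a minimal sketch of one training iteration in PyTorch. It is only an illustration: `generator`, `discriminator`, and `real_loader` are placeholder names assumed to be defined elsewhere, and the discriminator is assumed to end in a sigmoid so that `BCELoss` applies.

```python
import torch
import torch.nn as nn

# Assumed setup: `generator`, `discriminator` (sigmoid output), and `real_loader`.
criterion = nn.BCELoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
noise_dim = 100

for real_batch in real_loader:
    b = real_batch.size(0)
    real_labels = torch.ones(b, 1)
    fake_labels = torch.zeros(b, 1)

    # Discriminator step: push real samples toward 1 and generated samples toward 0.
    noise = torch.randn(b, noise_dim)
    fake_batch = generator(noise)
    d_loss = criterion(discriminator(real_batch), real_labels) + \
             criterion(discriminator(fake_batch.detach()), fake_labels)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: make the (now fixed) discriminator label the fakes as real.
    g_loss = criterion(discriminator(fake_batch), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```

Note that the generator step uses “real” labels for its own fakes, the commonly used non-saturating variant of the generator loss.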

Mode Collapse: A GAN’s Achilles’ Heel

Mode collapse in Generative Adversarial Networks (GANs) can be likened to a talented artist who creates a popular artwork. After the work gains popularity, the artist may be afraid to take risks and starts producing similar works over and over. This leads to a lack of artistic exploration and eventually bores the audience.

Similarly, in GANs, mode collapse happens when the generator focuses on producing a limited set of data patterns that deceive the discriminator. It becomes fixated on a few dominant modes in the training data and fails to capture the full diversity of the data distribution.

Just like the artist’s fear of trying new styles, the generator avoids exploring the entire data distribution in GANs. It produces repetitive data samples, missing out on the rich variations present in the real data.

Illustrating Mode Collapse:

To better grasp the concept, let’s consider a dataset with eight distinct Gaussian distributions. During GAN training, the generator might initially produce data in various modes but eventually converge to generating data from only a single mode to deceive the discriminator. This could mean the discriminator is stuck at a local minimum of its cost function. When a GAN finds such a local minimum for a mode, the generator gets stuck producing data that corresponds to only one mode of the target distribution: the optimization algorithm converges to a suboptimal solution rather than reaching the global optimum. Consequently, the generator never covers the expected distribution and stays fixated on generating data from a single mode.
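
To make the toy setup concrete, here is a small sketch of how such an eight-Gaussian “ring” dataset is typically constructed. The radius and noise scale below are arbitrary choices of mine, not values taken from the cited figure.

```python
import numpy as np

def sample_eight_gaussians(n_samples, radius=2.0, std=0.05):
    """Sample 2D points from 8 Gaussian modes arranged on a circle."""
    angles = 2 * np.pi * np.arange(8) / 8                    # mode centers every 45 degrees
    centers = radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    modes = np.random.randint(0, 8, size=n_samples)          # pick a random mode per sample
    return centers[modes] + std * np.random.randn(n_samples, 2)

data = sample_eight_gaussians(10_000)
# A collapsed generator would cover only one (or a few) of these 8 clusters.
```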

Figure 2: An illustration of the mode collapse problem on a two-dimensional toy dataset (source)

It’s a bit complicated, isn’t it? But don’t worry, as I usually say, if you don’t understand a subject, you haven’t come across enough examples. So let’s continue with a different example.

In this example using the MNIST dataset, we work with grayscale images of handwritten digits (0 to 9) and their corresponding labels. The dataset has a probability density distribution with 10 modes, each representing one of the digits.

Picture a skilled discriminator, trained to discern real handwritten digits from the generated ones. Initially, the discriminator performs well in classifying most digits correctly. However, it struggles with images resembling the digits one and nine. This difficulty might indicate that the discriminator is stuck in a local minimum of its cost function.

Figure 3: Generated images by GAN and WGAN models trained on MNIST after 1, 100k, 500k, 1000k iterations (source)

The discriminator’s misclassification of certain digits is then relayed to the generator as feedback. The generator, seeking to improve its performance, notices the discriminator’s weakness with images resembling the digit one. Consequently, the generator produces only those specific images, collapsing to a single mode that represents the handwritten number one or even the whole distribution of ones.

While the discriminator might eventually adapt and overcome this deception, the generator could encounter other issues. It might shift to a different mode in the distribution and collapse to that new representation, or it might struggle to find ways to diversify its generated samples.

In summary, modes in the context of a probability distribution are its peaks, here corresponding to the possible classes or categories. Mode collapse in GANs occurs when the generator learns to fool the discriminator by generating examples solely from one specific class, disregarding the full diversity of the training dataset. This is a problem because, instead of generating a wide variety of realistic samples, the generator becomes fixated on replicating a single class, limiting the GAN’s overall performance and potential.
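
One simple way to spot this on MNIST is to push generated samples through a pretrained digit classifier and count how many of the 10 modes actually appear. This is only a hypothetical sketch: `generator` and `digit_classifier` are placeholder names, and the generator’s output shape is assumed to match the classifier’s input.

```python
import torch
from collections import Counter

# Assumed: a trained `generator` and a pretrained MNIST `digit_classifier`.
noise = torch.randn(1000, 100)
with torch.no_grad():
    samples = generator(noise)
    predicted_digits = digit_classifier(samples).argmax(dim=1)

counts = Counter(predicted_digits.tolist())
print(counts)  # a healthy GAN covers all 10 digits; a collapsed one covers only a few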

Addressing Mode Collapse: Mitigation Strategies

As researchers continue to unravel the intricacies of GANs, several techniques have been proposed to mitigate mode collapse.

Wasserstein GANs (WGANs) are one way to prevent mode collapse. WGANs are an extension of traditional GANs that use the Wasserstein distance as the loss function instead of the usual cross-entropy. By using the Wasserstein distance, WGANs provide a more stable and informative training signal, allowing for smoother learning and reduced mode collapse. The gradient of the Wasserstein distance enables better convergence, making WGANs effective at handling mode collapse and at generating more diverse and realistic samples.
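
As a rough sketch of how the WGAN losses differ from the standard ones, here is one iteration with the weight clipping used in the original WGAN paper. Again, `critic` (a discriminator without a sigmoid output), `generator`, and `real_batch` are placeholders, and in practice the critic is usually updated several times per generator update.

```python
import torch

# Assumed: `critic` (no sigmoid output), `generator`, and a `real_batch` tensor.
opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)
opt_g = torch.optim.RMSprop(generator.parameters(), lr=5e-5)
clip_value = 0.01

# Critic step: maximize E[critic(real)] - E[critic(fake)].
noise = torch.randn(real_batch.size(0), 100)
fake_batch = generator(noise).detach()
c_loss = -(critic(real_batch).mean() - critic(fake_batch).mean())
opt_c.zero_grad()
c_loss.backward()
opt_c.step()
for p in critic.parameters():            # enforce the Lipschitz constraint by clipping weights
    p.data.clamp_(-clip_value, clip_value)

# Generator step: minimize -E[critic(fake)].
g_loss = -critic(generator(torch.randn(real_batch.size(0), 100))).mean()
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```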

Going back to our example, if you look at the image after applying a WGAN (Wasserstein GAN) to the MNIST dataset, you can see that it is a good solution for the mode collapse issue. The generator network produces outputs that cover the richness and diversity of the training set (Figure 3).

Unrolled GANs address the mode collapse problem by using a generator loss function that takes into account not only the current discriminator’s classifications but also the outputs of future discriminator versions. This prevents the generator from over-optimizing for a single discriminator and encourages it to produce more diverse and realistic samples, reducing the likelihood of mode collapse.
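
The sketch below illustrates the idea in simplified form: a throwaway copy of the discriminator is updated for a few “future” steps, and the generator loss is then computed against that copy. The full Unrolled GAN method also backpropagates through those unrolled updates, which is omitted here for simplicity; `generator`, `discriminator`, and `real_batch` are placeholders, and the discriminator is assumed to have a sigmoid output.

```python
import copy
import torch
import torch.nn as nn

criterion = nn.BCELoss()
k_unroll = 5  # how many future discriminator steps to "look ahead"

def unrolled_generator_loss(generator, discriminator, real_batch, noise):
    # Work on a throwaway copy so the real discriminator stays untouched.
    d_copy = copy.deepcopy(discriminator)
    opt_d_copy = torch.optim.SGD(d_copy.parameters(), lr=1e-3)
    ones = torch.ones(real_batch.size(0), 1)
    zeros = torch.zeros(real_batch.size(0), 1)

    fake = generator(noise).detach()
    for _ in range(k_unroll):             # simulate k future discriminator updates
        d_loss = criterion(d_copy(real_batch), ones) + criterion(d_copy(fake), zeros)
        opt_d_copy.zero_grad()
        d_loss.backward()
        opt_d_copy.step()

    # Score fresh generator samples against the "future" discriminator.
    return criterion(d_copy(generator(noise)), ones)
```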

Overcoming mode collapse is an active area of research in the GAN community. Apart from these, many other techniques have been proposed to encourage the GAN to explore multiple modes of the data distribution during training, such as adjusting learning rates, modifying network architectures, and employing regularization methods.

I hope you found this article on mode collapse in GANs insightful and informative. Thank you for reading! :)

You can follow me on Github: https://github.com/miraytopal

References:

[1] https://arxiv.org/pdf/1807.04015.pdf

[2] https://arxiv.org/pdf/2001.08873.pdf

[3] https://www.coursera.org/learn/build-basic-generative-adversarial-networks-gans

[4] https://web.cs.ucla.edu/~srinath/static/projects/CS260_Course_Project_Report_.pdf

[5] https://developers.google.com/machine-learning/gan/problems
