CERN | GSoC’19: Generative Adversarial Networks for Particle Physics Applications

Ashish Kshirsagar
Published in Analytics Vidhya · Aug 25, 2019

Three months ago, I was selected as a Google Summer of Code student for CERN-HSF to work on the project ‘Generative Adversarial Networks (GANs) for Particle Physics Applications’ in the TMVA (Toolkit for Multi-Variate Analysis) library. The journey was amazing; I enjoyed it and learned a lot working with the CERN organization.

ROOT is a modular scientific toolkit developed by CERN that provides all the functionalities needed to deal with big data processing, statistical analysis, visualization, and storage. TMVA is a ROOT integrated project which provides a machine learning environment for training and evaluation of multivariate classification and regression targeting applications in high energy physics.

The main goal of the project was to build a foundation for Generative Adversarial Networks in TMVA and to develop a GAN model.

Introduction to Generative Adversarial Networks

Generative Adversarial Networks are among the most interesting deep learning models. Two models are trained jointly through an adversarial process:

  1. A generator learns to create images that look real.
  2. A discriminator learns to tell whether the image is real (from the dataset) or fake (from the generator).

The Discriminator model

The discriminator functions like a Convolutional Neural Network (CNN): it consists of many hidden layers and one output layer. By training on real data, the discriminator learns what actual images look like and what features real data should contain. The discriminator’s output is either 0 or 1, where 0 means it classifies the input as fake (sampled from the generator) and 1 means it classifies the input as real (sampled from the real distribution).
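As a minimal illustration of this role (not TMVA's actual implementation, which is a full CNN), the discriminator can be sketched as a single logistic unit over the flattened pixels; the weights and bias here are hypothetical placeholders:

```python
import numpy as np

def discriminator(image, weights, bias):
    """Score an image: close to 1 for real, close to 0 for fake.

    A toy stand-in for the CNN discriminator: one logistic unit
    over the flattened pixels (weights/bias are hypothetical).
    """
    z = np.dot(weights, image.ravel()) + bias
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid -> probability "real"
```

In practice the 0/1 decision comes from thresholding this probability at 0.5.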

The Generator model

The generator is responsible for generating images, effectively inverting the functionality of a CNN. Where a CNN takes an image as input and produces a label as output, the generator instead takes randomly generated noise as input.

The generator then transforms this noise into a meaningful output. The noise is usually sampled from a distribution of smaller dimension than the output space.

Training GANs

During training, the generator progressively becomes better at creating images that look real, while the discriminator becomes better at differentiating between generated and actual samples. The equilibrium is reached when the discriminator can no longer distinguish real images from fakes.
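The alternating training described above can be sketched on a toy 1-D problem (purely illustrative, not TMVA code): real data is drawn from a Gaussian around 4, the "generator" is a single scalar shift applied to noise, and the "discriminator" is a logistic unit. All parameter names and learning rates here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

g = 0.0              # generator parameter: shifts noise toward the data
d_w, d_b = 0.0, 0.0  # discriminator logistic parameters
lr = 0.05

def d_prob(x):
    """Discriminator: probability that sample x is real."""
    return 1.0 / (1.0 + np.exp(-(d_w * x + d_b)))

for step in range(2000):
    real = rng.normal(4.0, 1.0)        # sample from the real distribution
    fake = rng.normal(0.0, 1.0) + g    # generator output from noise

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0
    for x, label in ((real, 1.0), (fake, 0.0)):
        p = d_prob(x)
        d_w += lr * (label - p) * x
        d_b += lr * (label - p)

    # Generator step: push D(fake) toward 1 (fool the discriminator)
    p = d_prob(fake)
    g += lr * (1.0 - p) * d_w  # gradient of log D(fake) w.r.t. g
```

As training proceeds, g drifts toward the real data mean and the discriminator's outputs move toward 0.5 on generated samples, which is the equilibrium described above.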

Layers for GANs

As we know, the generator must increase the dimensionality of its output over the course of generating an image; we therefore employ layers in the generator model that do exactly that.

TMVA already had layers such as the Convolution and Max-pooling layers, which downsample the input tensor into a lower-dimensional tensor. We therefore introduced the Upsample and Transpose Convolution layers, which instead upsample a given tensor, yielding tensors of higher dimension.

Structure of a Layer in TMVA

TMVA does not support a 3D tensor data structure (although RTensor is currently being introduced). Thus, any 3D input must first be converted to a 2D matrix, on which the corresponding operations can then be performed. Each layer implements a forward pass and a backward pass.
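One common way to store a 3D tensor in a 2D matrix (assumed here for illustration; TMVA's exact memory layout may differ) is one row per channel, with each row holding the flattened height × width plane:

```python
import numpy as np

# A 3D input of shape (channels, height, width)
x = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Stored as a 2D matrix: one row per channel, each row the
# flattened height x width plane.
m = x.reshape(2, 3 * 4)
```

With this layout, convolution-style operations become matrix operations on `m`, and the 3D shape can be recovered by the inverse reshape.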

Upsample Layer

The upsample layer is a simple layer that usually requires no filters or weights. It increases the dimensions of its input and can therefore be used in generative models, typically followed by a convolutional layer. It can also be viewed as the inverse of a pooling layer.

Working

A simple variant of upsampling is used here: nearest-neighbor interpolation. In nearest-neighbor interpolation, the value of each pixel in the output image is taken from its nearest neighbor in the input image, as shown in the figure below.

Consider a simple example of nearest-neighbor interpolation, where we upsample a matrix A (3x3) to a higher-dimensional matrix B (6x6).

Nearest Neighbor Interpolation
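The 3x3 → 6x6 example above can be reproduced in a few lines of numpy (an illustrative sketch, not TMVA's C++ implementation): every pixel of A is copied into a 2x2 block of B.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Nearest-neighbour upsampling by a factor of 2: each pixel of A
# becomes a 2x2 block of identical values in B.
B = np.repeat(np.repeat(A, 2, axis=0), 2, axis=1)
# B has shape (6, 6); e.g. its top-left 2x2 block is all 1s.
```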

Transpose Convolution Layer

The transpose convolution operation is equivalent to the gradient computation of a regular convolution (the backward pass of a normal convolution). We can use it to increase the dimensionality of a tensor without relying on a pre-defined interpolation technique.

Working

The transpose convolution technique used here is best described with an example: consider an input matrix A (2x2) and a kernel matrix B (4x4).

Generating a Transpose Convolution Matrix

We express the transpose convolution operation as a matrix: the kernel is simply rearranged so that transpose convolution can be performed with a matrix multiplication. In the above example, to obtain an output vector of size 7x1, we generate a transpose convolution matrix of size 7x4.

Here, we rearrange the 4x4 kernel into the 7x4 transpose convolution matrix.

Converting the input into a single columnar vector.

Here, we flatten the input matrix (2x2) into a column vector (4x1).

Generating the corresponding output vector

Thus, we generate the output vector by multiplying the transpose convolution matrix with the input vector.
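Since the original figures are not reproduced here, a 1-D analogue (my own illustrative sketch, with a length-4 kernel and a length-4 flattened input) shows the same construction: each column of the 7x4 transpose convolution matrix holds the kernel shifted down by one more row, and multiplying it by the input vector reproduces a full 1-D convolution.

```python
import numpy as np

k = np.array([1.0, 2.0, 3.0, 4.0])  # flattened kernel (length 4)
x = np.array([1.0, 0.0, 2.0, 1.0])  # flattened 2x2 input (length 4)

# Transpose-convolution matrix: column j holds the kernel shifted
# down by j rows, giving a 7x4 matrix (7 = 4 + 4 - 1).
T = np.zeros((7, 4))
for j in range(4):
    T[j:j + 4, j] = k

y = T @ x  # the 7x1 output vector

# Same result as a full 1-D convolution of x with k:
assert np.allclose(y, np.convolve(x, k, mode="full"))
```

This equivalence is why the operation can be implemented entirely with the matrix machinery TMVA already provides.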

GANs in TMVA DNN modules

The class MethodGAN (the GAN module) contains the actual implementation of GANs. MethodGAN gives users the flexibility to define their own GAN architecture: the architecture is passed to the model as an input string, and MethodGAN contains parsing functions that build the model from that string.

In particle physics, Generative Adversarial Networks can be immensely useful for generating images as an alternative in settings where we would otherwise need to perform a large number of experiments and produce results in the form of images.

The architecture of GANs in TMVA

Future Work

  • Implementing separate loss functions for the generator and the discriminator.
  • Adding support for other variations of GANs for high-energy physics applications.
  • Benchmarking the results against other standard implementations.

Acknowledgments

I would like to acknowledge and extend my heartfelt gratitude to all my mentors — Lorenzo Moneta, Manos Stergiadis, Sergei Gleyzer, Omar Andres Zapata Mesa, Sitong An, Stefan Wunsch, Kim Albertsson and Gerardo Gutiérrez — for guiding me throughout the development process, helping me with implementation decisions, dealing with technical issues, and providing valuable input during weekly meetings.

I would also like to thank Anushree Rankawat for helping me understand the existing codebase and helping with other implementation issues.

Google Summer of Code made me a part of a great community and gave me the chance to work on such an exciting project.

It was also a great experience working with Surya Dwivedi.

Important Links:

Link to implementation of Upsample layer and Transpose layer: PR-4146

Link to GANs code: GANs implementation

Other links: PR-4275
