Image Super Resolution Using GANs
Introduction
What is Super Resolution?
Image super resolution is the task of reconstructing a high-resolution (HR) image from an observed low-resolution (LR) image. Most approaches to image super resolution so far have used the mean squared error (MSE) as the loss function. The problem with MSE as a loss function is that the high-frequency texture details of the image are averaged out, producing an overly smooth reconstruction.
GANs address this problem with a perceptual loss, which drives the reconstruction towards the natural image manifold and produces perceptually more realistic and convincing solutions.
What are GANs?
Generative adversarial networks (GANs) are algorithmic architectures that use two neural networks, pitting one against the other (thus the “adversarial”) in order to generate new, synthetic instances of data that can pass for real data.
The two neural networks are the generator and the discriminator. The generator tries to produce new data instances, while the discriminator tries to distinguish whether a given sample comes from the training data set or from the generator.
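The adversarial game described above can be sketched numerically. The following is a minimal NumPy illustration of the two competing objectives, assuming the discriminator outputs a probability that its input is real (the score values here are made up for illustration):

```python
import numpy as np

def bce(pred, target, eps=1e-12):
    """Binary cross-entropy between predicted probabilities and 0/1 targets."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

# Hypothetical discriminator scores: probability that the input is real.
d_on_real = np.array([0.9, 0.8, 0.95])   # scores on real training images
d_on_fake = np.array([0.1, 0.3, 0.2])    # scores on generated images

# The discriminator is trained to output 1 on real data and 0 on fakes ...
d_loss = bce(d_on_real, np.ones(3)) + bce(d_on_fake, np.zeros(3))

# ... while the generator is trained to make the discriminator output 1 on fakes.
g_loss = bce(d_on_fake, np.ones(3))
```

In training, minimizing `d_loss` and minimizing `g_loss` pull the discriminator scores in opposite directions, which is exactly the adversarial part.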
Links to learn more about GANs:
Super Resolution GANs
Super-resolution GANs apply a deep network in combination with an adversarial network to produce higher-resolution images. As mentioned above, SRGANs tend to produce images that are more appealing to humans, with more detail, compared to architectures built without GANs.
Method
1) Introduction
SRGAN consists of two networks: a generator and a discriminator.
The discriminator (critic) is implemented with CNNs and is trained to differentiate between real HR images and generated images.
The generator, implemented with ResNet blocks, takes LR images as input and generates HR output images, which are sent to the critic for evaluation.
The loss function is a multi-component loss consisting of a content loss and an adversarial loss.
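Since the generator is built from ResNet blocks, a quick sketch of the residual idea may help. The real SRGAN block stacks conv, batch-norm, and PReLU layers; the NumPy toy below abstracts that whole stack into a single learned transform, keeping only the defining skip connection:

```python
import numpy as np

def residual_block(x, weights):
    """Simplified ResNet-style block: transform the input, then add it back.
    The SRGAN block uses conv -> BN -> PReLU -> conv -> BN in place of the
    single linear map used here."""
    transformed = np.tanh(x @ weights)   # stand-in for conv + activation
    return x + transformed               # the identity skip connection

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))          # a batch of 4 feature vectors
w = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w)
```

The skip connection means the block only has to learn a residual correction to its input, which makes very deep generators easier to train.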
2) SRGAN Architecture
3) Loss Function
Perceptual loss function
Previous approaches to image super resolution were based on MSE; we improve upon them by using a perceptual loss function, which optimizes the generated image with respect to perceptually relevant characteristics.
The perceptual loss is the weighted sum of a content loss and an adversarial loss.
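The two components and their weighted sum can be sketched in NumPy. This is a hedged illustration: the feature extractor is abstracted away (the SRGAN paper uses VGG feature maps for the content loss), and the 1e-3 adversarial weight follows that paper:

```python
import numpy as np

def content_loss(sr_features, hr_features):
    """MSE in a feature space; pixel-space MSE is the special case where
    the 'features' are just the raw pixels."""
    return np.mean((sr_features - hr_features) ** 2)

def adversarial_loss(d_on_sr, eps=1e-12):
    """-log D(G(LR)): small when the critic believes the SR image is real."""
    return -np.mean(np.log(np.clip(d_on_sr, eps, 1.0)))

def perceptual_loss(sr_feat, hr_feat, d_on_sr, adv_weight=1e-3):
    # Weighted sum of content loss and adversarial loss.
    return content_loss(sr_feat, hr_feat) + adv_weight * adversarial_loss(d_on_sr)
```

A perfect reconstruction that also fools the critic (identical features, critic score 1) drives both terms, and hence the total, to zero.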
Experiment
Dataset used
The SRGAN was trained on the DIV2K data set, which consists of 800 HR images covering a large variety of scenes. The images are downsampled and then used as the LR training inputs.
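The LR/HR training pairs come from this downsampling step. As a sketch, the shape arithmetic of a 4x reduction looks like the following; note this uses simple average pooling as a stand-in (the standard DIV2K LR images are produced with bicubic downsampling):

```python
import numpy as np

def downsample(hr, factor=4):
    """Average-pool an HR image by `factor` to create an LR training input.
    Bicubic downsampling (as used for DIV2K) would be the faithful choice;
    average pooling has the same shape behaviour and keeps this self-contained."""
    h, w = hr.shape[:2]
    h, w = h - h % factor, w - w % factor        # crop so dimensions divide evenly
    hr = hr[:h, :w]
    return hr.reshape(h // factor, factor, w // factor, factor, -1).mean(axis=(1, 3))

hr = np.random.rand(256, 256, 3)   # a synthetic stand-in for a DIV2K HR crop
lr = downsample(hr)                # shape (64, 64, 3): the generator learns the inverse
```

The generator is then trained to map `lr` back to something indistinguishable from `hr`.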
Training
The network was trained on Google Colab on the DIV2K data set for 350 epochs.
Results
Link to the code
References and further reading