Pixel Perfect: ESRGAN-powered High-Resolution Image Upscaling Platform

Published in AI Skunks · 3 min read · Mar 15, 2023

How Does the Application Work?

The MLOps-based application upscales images by a factor of 4 using ESRGAN, a deep learning-based technique for image super-resolution. It is implemented as a Flask-based API and hosted on Render’s web services. Users upload low-resolution images to the API, and the application uses ESRGAN to upscale them to four times their original width and height. The API is designed to accept multiple image requests at once, although throughput is limited by the compute available to the model (see Future Scope and Work). With this application, users can enhance the quality of their images and produce high-resolution versions for their own use cases.
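
To make the factor of 4 concrete: the scale applies to each side of the image, so the number of pixels the model has to produce grows by a factor of 16. The small sketch below only illustrates that arithmetic; the function name `upscaled_size` is hypothetical and not part of the application.

```python
from typing import Tuple


def upscaled_size(width: int, height: int, scale: int = 4) -> Tuple[int, int]:
    """Return the (width, height) of an image after scaling each side by `scale`."""
    if width <= 0 or height <= 0 or scale <= 0:
        raise ValueError("width, height and scale must be positive")
    return width * scale, height * scale


# A 480x270 upload becomes 1920x1080: 4x per side, 16x the pixel count.
low_res = (480, 270)
high_res = upscaled_size(*low_res)
pixel_ratio = (high_res[0] * high_res[1]) // (low_res[0] * low_res[1])
print(high_res, pixel_ratio)  # (1920, 1080) 16
```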

What is ESRGAN?

ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) is a deep learning model for image super-resolution, that is, reconstructing a higher-resolution image from a low-resolution input. It improves on traditional super-resolution techniques such as simple interpolation, which often produce blurry and unrealistic images.

ESRGAN uses a Generative Adversarial Network (GAN) to generate high-resolution images from low-resolution inputs. GANs consist of two neural networks, a generator and a discriminator, that work together to create high-quality images. The generator network learns to create realistic images, while the discriminator network learns to distinguish between real and generated images. Through a process of competition and feedback, the generator network becomes better at creating high-quality images.
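
The competition between the two networks can be sketched as one alternating training step in PyTorch. This is a generic GAN update written for illustration, not ESRGAN's actual training code: the tiny fully connected networks, the random tensors standing in for image patches, and the name `train_step` are all assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins: real super-resolution networks are convolutional and far larger.
generator = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 64))
discriminator = nn.Sequential(nn.Linear(64, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1))

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)


def train_step(low_res: torch.Tensor, high_res: torch.Tensor):
    """Run one discriminator update followed by one generator update."""
    real_label = torch.ones(high_res.size(0), 1)
    fake_label = torch.zeros(high_res.size(0), 1)

    # Discriminator: score real images as 1 and generated images as 0.
    fake = generator(low_res).detach()  # detach: the generator is not updated here
    d_loss = bce(discriminator(high_res), real_label) + bce(discriminator(fake), fake_label)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: try to make the discriminator score generated images as real.
    g_loss = bce(discriminator(generator(low_res)), real_label)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()


# Random tensors stand in for flattened low- and high-resolution patches.
low_res_batch = torch.randn(8, 16)
high_res_batch = torch.randn(8, 64)
d_loss, g_loss = train_step(low_res_batch, high_res_batch)
print(f"discriminator loss {d_loss:.3f}, generator loss {g_loss:.3f}")
```

Repeating this step over many batches is the "competition and feedback" loop: each discriminator update makes generated images easier to spot, and each generator update makes them harder to spot.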

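ESRGAN builds on SRGAN, an earlier GAN-based super-resolution model that already combined a perceptual loss, which measures the similarity between high-resolution and generated images in terms of visual features, with an adversarial loss, which measures the ability of the generator to fool the discriminator. ESRGAN refines this recipe in three ways: a generator built from Residual-in-Residual Dense Blocks (RRDB) without batch normalization, a relativistic discriminator that judges whether a real image looks more realistic than a generated one, and a perceptual loss computed on features before activation. This approach results in images that are not only high-resolution but also visually pleasing and realistic. ESRGAN has been used in various applications, such as enhancing low-quality images from security cameras or improving the resolution of medical images.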
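
One way to write the combination of losses described above as a single generator objective is sketched below. The symbols are assumptions chosen for illustration: G is the generator, x a low-resolution input, y the matching high-resolution image, \phi a fixed pre-trained feature extractor (a VGG network in the ESRGAN paper), and \lambda and \eta are weighting hyperparameters. The pixel-wise L_1 term is not discussed above; it is included because the ESRGAN paper adds it to the total. The norm used for the feature distance is an implementation detail and is left unspecified here.

```latex
\begin{aligned}
L_G &= L_{\text{percep}} + \lambda \, L_{\text{adv}} + \eta \, L_1 \\
L_{\text{percep}} &= \mathbb{E}_{x}\big[\, \lVert \phi(G(x)) - \phi(y) \rVert \,\big] \\
L_1 &= \mathbb{E}_{x}\big[\, \lVert G(x) - y \rVert_1 \,\big]
\end{aligned}
```

Here L_{\text{adv}} is the adversarial term, which is small when the discriminator rates G(x) as realistic. The perceptual term compares images in feature space rather than pixel space, which is what pushes the output toward realistic texture instead of a smooth average.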

Application Build

This is a Flask application that implements image upscaling using ESRGAN. It consists of a single webpage where users upload images; each image is processed by the ESRGAN model and the result is saved to a specified folder. The application then displays the original and upscaled images on the same webpage.

The application has the following functionality:

Users can upload one or more images to be upscaled.
The application processes the images using an ESRGAN model and saves the upscaled images to a specified folder.
Users can view the original and upscaled images on the same webpage.
Users can delete all uploaded and upscaled images at once.
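
The upload and delete behaviour listed above could be wired up in Flask roughly as follows. This is a minimal sketch under stated assumptions, not the repository's actual code: the route names `/upload` and `/delete`, the form field `images`, the temporary folders, and the `upscale_bytes` placeholder (which returns its input unchanged instead of calling ESRGAN) are all hypothetical, and rendering the side-by-side webpage is omitted.

```python
import io
import tempfile
from pathlib import Path

from flask import Flask, jsonify, request
from werkzeug.utils import secure_filename

app = Flask(__name__)

# Hypothetical folders; the real application may use different locations.
UPLOAD_DIR = Path(tempfile.mkdtemp(prefix="uploads_"))
RESULT_DIR = Path(tempfile.mkdtemp(prefix="results_"))


def upscale_bytes(data: bytes) -> bytes:
    """Placeholder for the ESRGAN call; here it returns the input unchanged."""
    return data


@app.route("/upload", methods=["POST"])
def upload():
    """Save each uploaded image, upscale it, and save the result."""
    saved = []
    for file in request.files.getlist("images"):
        name = secure_filename(file.filename or "")
        if not name:
            continue  # skip empty form fields
        data = file.read()
        (UPLOAD_DIR / name).write_bytes(data)
        (RESULT_DIR / name).write_bytes(upscale_bytes(data))
        saved.append(name)
    return jsonify(upscaled=saved)


@app.route("/delete", methods=["POST"])
def delete_all():
    """Remove every uploaded and upscaled image."""
    removed = 0
    for folder in (UPLOAD_DIR, RESULT_DIR):
        for path in folder.iterdir():
            path.unlink()
            removed += 1
    return jsonify(removed=removed)


# Exercise the routes with Flask's built-in test client (no server needed).
client = app.test_client()
response = client.post(
    "/upload",
    data={"images": [(io.BytesIO(b"fake-a"), "a.png"), (io.BytesIO(b"fake-b"), "b.png")]},
    content_type="multipart/form-data",
)
print(response.get_json())  # {'upscaled': ['a.png', 'b.png']}
```

Passing filenames through `secure_filename` before writing to disk guards against uploads whose names contain path components.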

The application uses the following technologies:

Flask: A micro web framework for Python.
OpenCV: An open-source computer vision and machine learning software library.
PyTorch: An open-source machine learning library based on the Torch library.
HTML, CSS, and JavaScript: Front-end technologies for building the webpage.

We use a pre-trained ESRGAN model for the upscaling step.
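
Running a pre-trained super-resolution model on an image mostly comes down to converting between OpenCV's array layout and PyTorch's tensor layout. The sketch below shows that round trip under stated assumptions: the helper names are hypothetical, and a plain 4x nearest-neighbour `nn.Upsample` stands in for the ESRGAN generator so the example runs without any weights file.

```python
import numpy as np
import torch
from torch import nn


def to_tensor(bgr_uint8: np.ndarray) -> torch.Tensor:
    """Convert an OpenCV-style HxWx3 BGR uint8 image to a 1x3xHxW RGB float tensor in [0, 1]."""
    rgb = bgr_uint8[:, :, ::-1].astype(np.float32) / 255.0
    chw = np.ascontiguousarray(rgb.transpose(2, 0, 1))
    return torch.from_numpy(chw).unsqueeze(0)


def to_image(tensor: torch.Tensor) -> np.ndarray:
    """Convert a 1x3xHxW RGB float tensor back to an HxWx3 BGR uint8 image."""
    chw = tensor.squeeze(0).clamp(0, 1).cpu().numpy()
    rgb = chw.transpose(1, 2, 0)
    return np.ascontiguousarray((rgb[:, :, ::-1] * 255.0).round().astype(np.uint8))


def upscale(model: nn.Module, bgr_uint8: np.ndarray) -> np.ndarray:
    """Run a super-resolution model on one image without tracking gradients."""
    model.eval()
    with torch.no_grad():
        return to_image(model(to_tensor(bgr_uint8)))


# Stand-in for the pre-trained ESRGAN generator: plain 4x interpolation.
# In the real application this would be the network with its loaded weights.
placeholder_model = nn.Upsample(scale_factor=4, mode="nearest")

low_res = np.random.randint(0, 256, size=(32, 48, 3), dtype=np.uint8)
high_res = upscale(placeholder_model, low_res)
print(low_res.shape, "->", high_res.shape)  # (32, 48, 3) -> (128, 192, 3)
```

With OpenCV, `cv2.imread(path)` returns an array in the BGR uint8 layout that `upscale` expects, and `cv2.imwrite(path, high_res)` saves the result. Swapping in the real generator would mean building the network and loading its weights with `load_state_dict(torch.load(...))` in place of `placeholder_model`.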

Future Scope and Work

I aspire to create a cloud-based web service and mobile application capable of meeting the high processing demands of our use case.

Currently, the application runs as a locally hosted solution and has also been deployed on Render’s cloud platform as a web service. Unfortunately, the Render deployment experiences timeouts because of the intensive CPU/GPU requirements of running ESRGAN.

GitHub repository: https://github.com/Pratik-Prakash-Sannakki/WhatsApp-Insta_Resolution_Pumper

References

https://github.com/xinntao/ESRGAN