

EDSR : A Machine Learning Model for Super-resolution Image Processing

This is an introduction to「EDSR」, a machine learning model that can be used with ailia SDK. You can easily use this model to create AI applications using ailia SDK as well as many other ready-to-use ailia MODELS.


EDSR (Enhanced Deep Residual Networks for Single Image Super-Resolution) is a machine learning model released in July 2017 which can be used to increase the resolution of an image.



EDSR is a super-resolution model proposed after SRResNet. SRResNet successfully addressed the problems of processing time and memory consumption, but the ResNet architecture it builds on was designed for image classification, so it is not optimal for super-resolution.

Therefore, EDSR builds a model better suited to super-resolution by removing unnecessary modules from ResNet. For example, batch normalization is removed because normalizing the features takes away the network's range flexibility, following the research in Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring.
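To illustrate what losing range flexibility means, here is a minimal NumPy sketch. It normalizes a batch of feature maps the way a batch normalization layer does in training mode, leaving out the learned scale and shift for brevity. The function name `batch_norm` and the tensor shapes are illustrative assumptions, not taken from the EDSR code.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each channel over the batch and spatial axes.

    Uses training-mode batch statistics and omits the learned scale/shift
    (a simplification assumed for this sketch).
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8, 16, 16))  # (batch, channels, height, width)
brighter = 3.0 * features + 5.0             # same content, different value range

# After normalization, the two batches are numerically indistinguishable:
# the information about their original range has been discarded.
same_after_bn = np.allclose(batch_norm(features), batch_norm(brighter), atol=1e-4)
print(same_after_bn)
```

Two inputs that differ only in their value range become the same after normalization, which is the kind of information a super-resolution network may want to keep.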


ResNet also has the problem that training becomes unstable when the number of feature maps is increased. To address this problem, the residuals are scaled down by a factor of 0.1, as proposed in Inception-v4.

3.3. Scaling of the Residuals

Also we found that if the number of filters exceeded 1000, the residual variants started to exhibit instabilities and the network has just “died” early in the training, meaning that the last layer before the average pooling started to produce only zeros after a few tens of thousands of iterations. This could not be prevented, neither by lowering the learning rate, nor by adding an extra batch-normalization to this layer.

We found that scaling down the residuals before adding them to the previous layer activation seemed to stabilize the training. In general we picked some scaling factors between 0.1 and 0.3 to scale the residuals before their being added to the accumulated layer activations (cf. Figure 20).

EDSR introduces a constant scaling layer with a factor of 0.1 after the last convolution layer of each residual block to make the training more stable.
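As a rough illustration, a residual block in this style (two convolutions with a ReLU in between, no batch normalization, and the residual multiplied by 0.1 before the skip addition) could be sketched in NumPy as follows. The helper names `conv3x3` and `residual_block`, the channel count, and the random weights are assumptions made for this example. A real implementation would use a deep learning framework rather than a hand-written convolution.

```python
import numpy as np

def conv3x3(x, weight, bias):
    """Naive 3x3 convolution with zero padding.

    x: (C_in, H, W), weight: (C_out, C_in, 3, 3), bias: (C_out,)
    """
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + w]  # shifted view of the input
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], patch)
    return out + bias[:, None, None]

def residual_block(x, w1, b1, w2, b2, res_scale=0.1):
    """conv -> ReLU -> conv, scaled by res_scale, then added to the input."""
    y = conv3x3(x, w1, b1)
    y = np.maximum(y, 0.0)       # ReLU
    y = conv3x3(y, w2, b2)
    return x + res_scale * y     # constant scaling of the residual

rng = np.random.default_rng(0)
channels = 4                     # illustrative; much smaller than a real model
x = rng.normal(size=(channels, 8, 8))
w1 = rng.normal(scale=0.1, size=(channels, channels, 3, 3))
w2 = rng.normal(scale=0.1, size=(channels, channels, 3, 3))
b1 = np.zeros(channels)
b2 = np.zeros(channels)

out = residual_block(x, w1, b1, w2, b2)
print(out.shape)
```

Because the residual is multiplied by a small constant, each block changes its input only slightly, which is what keeps the activations from growing out of control when many wide blocks are stacked.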


In addition, conventional super-resolution models are sensitive to training conditions, and a small change in architecture can have a large impact on image quality. Even with the same model, image quality can differ greatly depending on the initial weights and the techniques used during training. EDSR stabilizes training by first training the x2 model and then training the x3 and x4 models starting from the x2 weights.
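The idea of starting the x3 and x4 models from the x2 weights can be sketched as below. The weight names (`body.conv.weight`, `upsampler.conv.weight`) and the two-layer layout are hypothetical and chosen only for illustration. The assumption is that the body of the network does not depend on the scale factor, while the upsampling part does.

```python
import numpy as np

def init_weights(scale, rng, channels=4):
    """Hypothetical layout: a scale-independent body and a scale-specific upsampler."""
    return {
        "body.conv.weight": rng.normal(scale=0.1, size=(channels, channels, 3, 3)),
        "upsampler.conv.weight": rng.normal(
            scale=0.1, size=(channels * scale * scale, channels, 3, 3)
        ),
    }

def warm_start(target, pretrained):
    """Copy every pretrained tensor whose name and shape match the target.

    Tensors that do not match (here, the upsampler) keep their fresh
    initialization. Returns the names that were copied.
    """
    copied = []
    for name, value in pretrained.items():
        if name in target and target[name].shape == value.shape:
            target[name] = value.copy()
            copied.append(name)
    return copied

rng = np.random.default_rng(0)
x2_weights = init_weights(scale=2, rng=rng)  # stands in for a trained x2 model
x3_weights = init_weights(scale=3, rng=rng)  # freshly initialized x3 model

copied = warm_start(x3_weights, x2_weights)
print(copied)
```

Training of the x3 model would then continue from these partially copied weights instead of from a fully random initialization.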

The DIV2K data set was used for training.


You can use the following command to apply super-resolution processing to a video, specifying a scale factor of 2 to 4.

$ python3 -v input.mp4 -s output.mp4 --scale 3

Here is an example of the result.

EDSR tends to amplify any noise present in the input image. Therefore, when downscaling an image to produce a low-resolution input for evaluation, make sure to use a high-quality downscaling method.
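As a small illustration of why the downscaling method matters, the following NumPy sketch compares plain pixel skipping with area averaging on a noisy, flat gray image. Treating area averaging as the "high-quality" method is an assumption made for this example, and the function names are illustrative. In practice you would typically rely on an image library's area or antialiased resampling.

```python
import numpy as np

def downscale_nearest(img, factor):
    """Keep every `factor`-th pixel (fast, but passes noise through unchanged)."""
    return img[::factor, ::factor]

def downscale_area(img, factor):
    """Average each `factor` x `factor` block of pixels."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor  # crop to a multiple of the factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

rng = np.random.default_rng(0)
clean = np.full((96, 96), 0.5)                          # flat gray image
noisy = clean + rng.normal(scale=0.05, size=clean.shape)

noise_nearest = downscale_nearest(noisy, 3).std()
noise_area = downscale_area(noisy, 3).std()
print(noise_nearest, noise_area)
```

The averaged image carries noticeably less noise into the super-resolution model, so the upscaled result is less likely to show amplified artifacts.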

ax Inc. has developed ailia SDK, which enables cross-platform, GPU-based rapid inference.

ax Inc. provides a wide range of services from consulting and model creation, to the development of AI-based applications and SDKs. Feel free to contact us for any inquiry.


