A Quick Overview of Methods to Measure the Similarity Between Images

When you work on computer vision problems, you need a way to measure the similarity between two images so that you can compare the results of different experiments. Let’s look at several common methods in this note.

The most traditional estimator is the mean squared error (MSE). MSE measures the average squared difference between the estimated (predicted) values and the actual values (ground truth), so for images we simply average the squared differences pixel by pixel. This works well only if the goal is to generate a picture whose pixel colors match the ground truth picture as closely as possible. Sometimes, however, we want to concentrate on the structure or relief of the picture.
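The pixel-by-pixel calculation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `mse` and the tiny 2x2 arrays are made up for the example, and the images are assumed to have the same shape.

```python
import numpy as np

def mse(reference, estimate):
    """Mean squared error between two images of the same shape."""
    # Convert to float first so that uint8 subtraction cannot wrap around.
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    if ref.shape != est.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((ref - est) ** 2))

# Hypothetical 2x2 grayscale images, used only for illustration.
ground_truth = np.array([[0, 50], [100, 150]], dtype=np.uint8)
prediction = np.array([[0, 52], [98, 150]], dtype=np.uint8)

print(mse(ground_truth, prediction))  # (0 + 4 + 4 + 0) / 4 = 2.0
```

A value of 0 means the two images are identical; larger values mean larger average pixel differences.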

PSNR (peak signal-to-noise ratio) is the second traditional estimator. To use this estimator, we must express all pixel values in bit form, that is, as integers of a fixed bit depth. If we have 8-bit pixels, the values of the pixel channels range from 0 to 255. By the way, the RGB (red, green, blue) color model fits PSNR best. PSNR is the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects the fidelity of its representation, and it is usually expressed in decibels. PSNR is often used to control the quality of digital signal transmission.
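As a rough sketch, PSNR can be computed from the MSE with the usual formula 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value. The function name `psnr`, the `max_value` parameter, and the sample arrays below are assumptions made for this example; 255 corresponds to the 8-bit case mentioned above.

```python
import math
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in decibels (max_value=255 for 8-bit images)."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    err = np.mean((ref - est) ** 2)  # this is the MSE
    if err == 0:
        return math.inf  # identical images: no noise at all
    return float(10.0 * np.log10(max_value ** 2 / err))

# Hypothetical 4x4 image and a copy with every pixel off by 5 (MSE = 25).
clean = np.full((4, 4), 100, dtype=np.uint8)
noisy = np.full((4, 4), 105, dtype=np.uint8)

print(round(psnr(clean, noisy), 2))  # 10 * log10(65025 / 25), about 34.15 dB
```

Higher PSNR means the estimate is closer to the reference. Because the value depends only on the MSE and the chosen peak, two images with the same MSE always get the same PSNR.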

However, PSNR is a variation of MSE and still concentrates on pixel-by-pixel comparison. Let’s consider the structural similarity index (SSIM). SSIM is considered to correlate with quality as perceived by the human visual system (HVS). Instead of using traditional error summation methods, SSIM models image distortion as a combination of three factors: loss of correlation, luminance distortion, and contrast distortion.
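The three factors can be made concrete with a simplified sketch. The version below computes SSIM once over the whole image, whereas standard SSIM implementations compute it over small local windows and average the results, so treat this as an illustration of the formula only. The name `global_ssim` is made up for the example; the constants (0.01 and 0.03 times the dynamic range, squared) are the commonly used defaults.

```python
import numpy as np

def global_ssim(x, y, max_value=255.0):
    """Single-window SSIM over the whole image (simplified; no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Small constants that stabilize the divisions.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    c3 = c2 / 2.0

    mu_x, mu_y = x.mean(), y.mean()
    sigma_x, sigma_y = x.std(), y.std()
    sigma_xy = np.mean((x - mu_x) * (y - mu_y))  # covariance

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sigma_x * sigma_y + c2) / (sigma_x ** 2 + sigma_y ** 2 + c2)
    structure = (sigma_xy + c3) / (sigma_x * sigma_y + c3)  # correlation term
    return float(luminance * contrast * structure)

# Hypothetical 16x16 horizontal gradient and its negative.
image = np.tile(np.arange(50, 210, 10, dtype=np.float64), (16, 1))
inverted = 255.0 - image

print(global_ssim(image, image))     # identical images give a value of about 1
print(global_ssim(image, inverted))  # reversed structure gives a negative value
```

The inverted image keeps the same contrast as the original, so the negative score comes from the correlation term, which is the factor that MSE and PSNR do not capture.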

Some studies have revealed that, as opposed to SSIM, MSE and PSNR perform badly at discriminating structural content in images, since various types of degradation applied to the same image can yield the same MSE value. Other studies have shown that MSE, and consequently PSNR, perform best at assessing the quality of noisy images.
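The point about different degradations giving the same MSE can be illustrated with a small synthetic experiment. Everything here is an assumption made for the example: the gradient image, the two degradations (a uniform brightness shift and a checkerboard pattern of the same amplitude), and the simplified single-window SSIM, which is not the windowed SSIM used in practice.

```python
import numpy as np

def mse(x, y):
    return float(np.mean((x - y) ** 2))

def global_ssim(x, y, max_value=255.0):
    """Simplified SSIM computed once over the whole image."""
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = np.mean((x - mu_x) * (y - mu_y))
    return float(((2 * mu_x * mu_y + c1) * (2 * cov + c2))
                 / ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)))

# Hypothetical 16x16 horizontal gradient as the reference image.
image = np.tile(np.arange(50, 210, 10, dtype=np.float64), (16, 1))

# Degradation 1: every pixel becomes 10 levels brighter.
shifted = image + 10.0

# Degradation 2: pixels alternate between +10 and -10 in a checkerboard.
rows, cols = np.indices(image.shape)
checker = image + 10.0 * np.where((rows + cols) % 2 == 0, 1.0, -1.0)

mse_shift, mse_checker = mse(image, shifted), mse(image, checker)
ssim_shift, ssim_checker = global_ssim(image, shifted), global_ssim(image, checker)

print(mse_shift, mse_checker)    # both are 100.0
print(ssim_shift, ssim_checker)  # the two SSIM values differ
```

Both degraded images are exactly 10 levels away from the reference at every pixel, so MSE (and therefore PSNR) cannot tell them apart. In this sketch the brightness shift gets a higher SSIM score than the checkerboard pattern, because the shift changes only the luminance term while the checkerboard disturbs the structure.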
___________
Resources:
Lu, Yingjing. “The Level Weighted Structural Similarity Loss: A Step Away from MSE.” Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019.
Hore, Alain, and Djemel Ziou. “Image quality metrics: PSNR vs. SSIM.” 2010 20th International Conference on Pattern Recognition. IEEE, 2010.

At Data Monsters, we ran into a computer vision challenge involving picture generation, and we are currently trying out all of these metrics. It looks something like this comic.