

Combining the Structural Similarity Index with Histogram Matching for Image-Based Fraud Detection

Photo by Ali Bakhtiari on Unsplash

Fraud detection is a set of strategies to uncover and prevent illicit activities in an organization, such as fraudulent credit card transactions, identity theft, insurance scams, and so on. One segment of fraud detection is identifying altered or processed images using computer vision.

In this post, I will show you a basic approach to image-based fraud detection that combines two tools:

  • Structural Similarity Index (SSIM);
  • Histogram Matching.


SSIM is a metric that compares the luminance, contrast, and structure of two images. By comparing a reference image with a processed image, it is therefore possible to measure the difference between them. It assumes that the two images come from the same image capture. The result is usually a value between 0 and 1, where 0 indicates no structural similarity and 1 indicates perfect structural similarity. For more details about SSIM I suggest you read this post.
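To make the luminance, contrast, and structure terms concrete, here is a minimal sketch of the SSIM formula in NumPy. It is a simplification and not the code used for the figures in this post: it computes the statistics once over the whole image, whereas standard implementations (for example `structural_similarity` in scikit-image) average the index over local windows. The name `global_ssim`, the `data_range` parameter, and the gradient test image are assumptions made for illustration.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0):
    """Simplified SSIM computed once over the whole image (no sliding window)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")

    # Stabilizing constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()            # luminance statistics
    var_x, var_y = x.var(), y.var()            # contrast statistics
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()  # structure statistic

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


# Hypothetical 8-bit grayscale test image: a horizontal gradient.
reference = np.tile(np.arange(256, dtype=np.uint8), (64, 1))
inverted = 255 - reference  # same gray levels, opposite structure

score_same = global_ssim(reference, reference)
score_inverted = global_ssim(reference, inverted)
```

Comparing an image with itself gives a score of 1. The inverted copy gives a negative score, which is why the range is only "usually" 0 to 1: anti-correlated images can push the index below zero.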

Consider two images: a reference and an altered (processed) one. Both show the same content, but I applied a simple blur filter to the processed image to show how SSIM responds in this case. I will apply SSIM in two situations: 1) comparing the reference image with itself and 2) comparing the reference image with the processed image.

Fig 1: a) Comparing reference vs. reference images and b) comparing reference vs. processed images. Source: Photo by Nhia Moua on Unsplash.

The SSIM result for Fig. 1a is 1.0, while for Fig. 1b it is 0.79. That's nice! It is exactly what we want to see. The first result makes sense because we are comparing the same image, and the second shows that the blur filter was enough to decrease the structural similarity. When an automatic fraud detection system processes the second case, it can be a problem if the approval "threshold" is very high (say 0.9): the image is rejected even though it shows the same content as the reference, which could generate false negatives. To work around this, we move on to the next step of this post: histogram matching.
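The threshold problem can be sketched in a few lines. This is only an illustration of the decision rule, using the two SSIM values reported above. The function name `is_same_capture`, the default threshold of 0.9, and the relaxed value of 0.75 are assumptions, not part of any real fraud detection system.

```python
def is_same_capture(ssim_score, threshold=0.9):
    """Return True when the SSIM score meets the approval threshold."""
    return ssim_score >= threshold


# SSIM values reported for Fig. 1a and Fig. 1b.
scores = {"reference vs. reference": 1.0, "reference vs. processed": 0.79}

# Decision for each comparison with the strict default threshold.
decisions = {name: is_same_capture(score) for name, score in scores.items()}

# A lower, arbitrarily chosen threshold would accept the blurred copy,
# at the cost of also accepting more genuinely different images.
relaxed = is_same_capture(0.79, threshold=0.75)
```

With the strict threshold the blurred copy is rejected, and simply lowering the threshold trades one kind of error for another. That is the motivation for preprocessing the image before comparing it.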

Histogram matching

An image histogram is the intensity distribution of an image; it can be computed for grayscale or color images. Histogram matching is a "transformation" of one image using the histogram of another. In other words, it is a process that transfers the distribution of pixel intensities from image A to image B. More details are shown in this post.

Applying histogram matching to our problem gives interesting results, as you can see in Fig. 2.
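The transfer of intensity distributions can be sketched with cumulative distribution functions (CDFs): each gray level of the source is replaced by the reference gray level that sits at the same cumulative fraction. The code below is a minimal grayscale sketch in NumPy, not the implementation behind the figures in this post (scikit-image offers a ready-made `match_histograms`). The name `match_histogram` and the contrast-compressed test image are assumptions made for illustration.

```python
import numpy as np


def match_histogram(source, reference):
    """Map the gray levels of `source` so its histogram follows `reference`."""
    source = np.asarray(source)
    reference = np.asarray(reference)

    # Unique gray levels, where each source pixel falls among them, and counts.
    src_values, src_index, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Empirical CDFs of both images.
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For each source level, pick the reference level at the same CDF value.
    mapped = np.interp(src_cdf, ref_cdf, ref_values)
    return mapped[src_index].reshape(source.shape)


# Hypothetical example: a gradient image and a contrast-compressed copy.
reference = np.tile(np.arange(256, dtype=np.uint8), (64, 1))
processed = reference // 2 + 64  # gray levels squeezed into 64..191

matched = match_histogram(processed, reference)

# Largest per-pixel difference from the reference, before and after matching.
error_before = np.abs(processed.astype(float) - reference).max()
error_after = np.abs(matched - reference).max()
```

In this toy case the matched image is much closer to the reference than the processed one, because the alteration was a pure change in the intensity distribution. Alterations that change the spatial structure of the image are not guaranteed to be recovered this way.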

Fig 2: a) Reference image, b) processed image, and c) matched image (the processed image after histogram matching). Source: Photo by Nhia Moua on Unsplash.

Notably, applying histogram matching to the processed image using the reference image's histogram produces an output that is very similar to the reference image.

Finally, comparing Fig. 2a, b, and c with SSIM gives the following result:

Table: SSIM results for the Fig. 2 images: a) reference, b) processed, and c) matched.

As we can see, after applying histogram matching to the processed image using the reference image's histogram, the SSIM improved so much that the output is almost perfectly structurally similar to the reference image. That's nice! This means histogram matching was enough to compensate for the blur filter applied to the processed image, and we reached our goal!

The SSIM + histogram matching combination might be used for different types of image contrast. In this article, I showed only one situation, where the processed image has a blur filter. Of course, in real life you will need more tools than the ones presented here, but it is a good start.


In this article, we covered a useful combination of SSIM and histogram matching for image-based fraud detection, along with a basic application to an image processed with a blur filter.





Wagner Massayuki Nakasuga


Data Scientist at Semantix
