Understanding: “Using Bayesian belief networks for credit card fraud detection”

Published in It’s all about feature engineering

Mar 20, 2018

Original Paper : https://www.researchgate.net/publication/262175453_Using_Bayesian_Belief_Networks_for_credit_card_fraud_detection

This paper is a decade old. I liked the idea of using “Minimum Description Length” to learn the Bayesian network structure. The paper is a little sparse on implementation details. The author compares the Bayesian network approach, which models conditional dependencies between variables, with Naive Bayes, which assumes the features are conditionally independent given the class.

Let us dig into the paper.

Bayesian belief networks are good at identifying anomalous events, and their results are highly explainable. Unlike in the frequentist approach, the notion of probability here is intuitive. Pg[2]

Minimum Description Length (MDL) principle:

MDL is the key technique used in this paper. The concept draws on information theory and Occam’s razor: the most efficient network is the one that requires the shortest description. The notion of length in this paper is the “amount of bits needed to keep the network in memory”. Details about how the candidate networks are built and compared are sparse. Pg[3]
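Since the paper does not spell out its scoring formula, here is a minimal sketch of the commonly used two-part MDL score for a discrete Bayesian network, not the author's implementation. It assumes a penalty of (log2 N)/2 bits per free parameter plus the negative log-likelihood of the data in bits. The function name `mdl_score`, the toy rows and the two candidate structures are all hypothetical.

```python
import math
from collections import Counter

def mdl_score(rows, parents, arity):
    """Two-part MDL score in bits (lower is better). Assumed form, not the paper's.

    rows:    list of dicts mapping variable name -> discrete value
    parents: dict mapping variable name -> tuple of parent variable names
    arity:   dict mapping variable name -> number of distinct values
    """
    n = len(rows)
    model_bits = 0.0
    data_bits = 0.0
    for var, pars in parents.items():
        # Free parameters in this node's conditional probability table.
        n_params = (arity[var] - 1) * math.prod(arity[p] for p in pars)
        model_bits += 0.5 * math.log2(n) * n_params

        # Negative log-likelihood under maximum-likelihood CPT estimates.
        joint = Counter((tuple(r[p] for p in pars), r[var]) for r in rows)
        parent_counts = Counter(tuple(r[p] for p in pars) for r in rows)
        for (pa, _), count in joint.items():
            data_bits -= count * math.log2(count / parent_counts[pa])
    return model_bits + data_bits

# Toy data (hypothetical): b always equals a.
rows = [{"a": 0, "b": 0}] * 4 + [{"a": 1, "b": 1}] * 4
arity = {"a": 2, "b": 2}

independent = {"a": (), "b": ()}   # no edges
dependent = {"a": (), "b": ("a",)}  # edge a -> b

score_independent = mdl_score(rows, independent, arity)
score_dependent = mdl_score(rows, dependent, arity)
```

On this toy data the structure with the edge a -> b scores 12.5 bits against 19.0 bits for the edgeless one: the extra table entries cost less than the bits saved in describing b. A structure search would repeat this comparison over many candidate graphs.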

Input data Discretization Strategy :

How the input data is represented and discretized is key to calculating the probabilities. The author follows a simple binning strategy. The details are in the paper. Pg[3,4]
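To make the binning idea concrete, here is a small sketch of mapping a continuous field to a bin index. The cut points in `AMOUNT_EDGES` and the sample amounts are made up for illustration and are not the bins used in the paper.

```python
from bisect import bisect_right

def discretize(value, edges):
    """Return the index of the bin that `value` falls into.

    edges: sorted cut points. Values below edges[0] map to bin 0,
    values at or above edges[-1] map to bin len(edges).
    """
    return bisect_right(edges, value)

# Hypothetical cut points for a transaction amount.
AMOUNT_EDGES = [10.0, 50.0, 200.0, 1000.0]

amounts = [4.99, 10.0, 75.5, 1500.0]
bins = [discretize(a, AMOUNT_EDGES) for a in amounts]
```

Each bin index then becomes a discrete state of the corresponding network node, so the conditional probability tables can be estimated by simple counting.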

Details about the dataset :

This is one more area of concern for me: the sample was too small. Pg[5]

Training and Inference :

Details about the training are not provided. Two threshold values, “minimum legal probability” and “maximum fraud probability”, are used during inference. It looks like they were arrived at purely heuristically. It would have been better if precision-recall curves had been used to choose these values. But we have to keep in mind that this paper is a decade old. Pg[4]
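As a rough illustration of the precision-recall approach suggested above, the sketch below sweeps a single cut-off over predicted fraud probabilities. It is my own simplification: it does not reproduce how the paper combines its two thresholds, and the scores and labels are toy values, not results from the paper.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when every score >= threshold is flagged as fraud."""
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    # Convention: precision is 1.0 when nothing is flagged.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def sweep(scores, labels):
    """Evaluate every distinct score as a candidate threshold."""
    return [(t, *precision_recall(scores, labels, t)) for t in sorted(set(scores))]

# Toy fraud probabilities and true labels (1 = fraud), purely illustrative.
scores = [0.05, 0.20, 0.40, 0.70, 0.90]
labels = [0, 0, 1, 0, 1]

curve = sweep(scores, labels)
```

Each entry of `curve` is a `(threshold, precision, recall)` triple. Picking a threshold then becomes an explicit trade-off between missed fraud and false alarms instead of a hand-tuned number.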

Even though the implementation details are sparse, the overall idea and approach made this paper worth grokking and sharing.