
The main issue with identifying Financial Fraud using Machine Learning (and how to address it)

Strategies for dealing with imbalanced data

gustavo
TDS Archive
6 min read · Mar 6, 2019


The sheer number of financial transactions that payment processors handle every day is staggering and still growing: on the order of 70 million credit card transactions per day in 2012, with losses in the billions of dollars in 2017. At that volume, the first pass at deciding whether a transaction is legitimate or fraudulent has to be made by a computer system. The traditional machine learning approach is to build a classifier that supports the human in the loop by reducing the number of transactions they have to look at.

The goal of the machine learning classifier is to reduce the number of transactions that a human has to investigate.

The challenge for machine learning classifiers is that fraudulent transactions make up only on the order of 1–2% of the total, which means that classifiers have to be trained on severely imbalanced data.
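To make the imbalance concrete, here is a minimal sketch in plain Python. It is not taken from any real dataset: the labels are simulated, and the 1.5% fraud rate, the sample size, and the helper names (`simulate_labels`, `accuracy`, `recall`) are all assumptions chosen for illustration. It shows why plain accuracy is a misleading metric at this class ratio: a "classifier" that flags nothing looks excellent while catching no fraud at all.

```python
import random


def simulate_labels(n, fraud_rate, seed=0):
    """Return n synthetic labels: 1 = fraud, 0 = legitimate (assumed rate)."""
    rng = random.Random(seed)
    return [1 if rng.random() < fraud_rate else 0 for _ in range(n)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual fraud cases that were flagged as fraud."""
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_positives = sum(y_true)
    return true_positives / actual_positives if actual_positives else 0.0


# Hypothetical data: 100,000 transactions, roughly 1.5% of them fraudulent.
labels = simulate_labels(n=100_000, fraud_rate=0.015)

# A trivial baseline that predicts "legitimate" for every transaction.
always_legit = [0] * len(labels)

print(f"fraud share: {sum(labels) / len(labels):.3%}")
print(f"accuracy:    {accuracy(labels, always_legit):.3%}")
print(f"recall:      {recall(labels, always_legit):.3%}")
```

The baseline's accuracy simply mirrors the share of legitimate transactions, while its recall on the fraud class is zero. This is why work on imbalanced data tends to look at per-class metrics such as recall and precision rather than overall accuracy.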

This is an awesome video that shows the challenges that machine learning engineers have to go through while systematically detecting fraudulent transactions:
