Detecting Credit Card Fraud Using Machine Learning
Credit card transactions have revolutionized the way we handle money. With just a swipe, tap, or click, we can make purchases and payments effortlessly, both online and offline.
Credit cards offer numerous advantages such as convenience, rewards, and the ability to build a credit history. They eliminate the need to carry large amounts of cash and provide an easy way to track spending through monthly statements.
However, this convenience comes with its own set of disadvantages. High-interest rates, potential for debt accumulation, and fees for late payments or exceeding credit limits can pose financial risks. Moreover, credit card transactions are prone to fraud, which is a growing concern in today’s digital world.
The Looming Threat of Credit Card Fraud
Credit card fraud occurs when unauthorized individuals gain access to card details and make fraudulent transactions. This can happen through various means such as phishing, skimming, or data breaches. Fraudsters continually devise new methods to exploit vulnerabilities, making it challenging for both consumers and financial institutions to stay ahead.
Impact on Individuals
The consequences of credit card fraud can be devastating for individuals. Victims may face financial losses, damage to their credit scores, and the stress of resolving fraudulent charges. Recovering from credit card fraud often involves time-consuming processes such as disputing transactions and possibly even legal action. The emotional toll can be significant, leading to feelings of violation and distrust.
Leveraging Machine Learning to Combat Fraud
Machine learning (ML) offers a powerful solution to predict and prevent fraudulent transactions. By analyzing patterns and anomalies in transaction data, ML models can identify potential fraud in real-time, reducing the risk of unauthorized transactions. Let’s explore how to build a credit card fraud detection model using Python in Google Colab.
Building a Fraud Detection Model in Python
Here’s a step-by-step guide to creating a credit card fraud detection model, complete with the necessary code.
Download the Dataset from Kaggle
First, we’ll download a publicly available dataset from Kaggle.
# Download the Dataset from Kaggle
!pip install kaggle
from google.colab import files
# Upload your kaggle.json file
uploaded = files.upload()
# Move kaggle.json to the correct directory
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
# Download the dataset (example: 'creditcardfraud' dataset)
!kaggle datasets download -d mlg-ulb/creditcardfraud
# Unzip the dataset
!unzip creditcardfraud.zip
Clean the Data
Next, we’ll read the dataset and check for null values.
# Clean the Data
import pandas as pd
df = pd.read_csv('creditcard.csv')
# Checking for null values
print(df.isnull().sum())
# No null values in this dataset, so no cleaning needed
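Although there are no missing values, it is worth checking how skewed the labels are before modeling: in this dataset only a tiny fraction of transactions (well under 1%) carry Class = 1, which is what motivates the balancing discussion later. A quick check:
# Check the class distribution (Class = 1 marks fraud)
class_distribution = df['Class'].value_counts()
print(class_distribution)
print(f"Fraud share: {class_distribution[1] / len(df):.4%}")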
Install Necessary Libraries
We need to install the required libraries.
# Install Necessary Libraries
!pip install scikit-learn numpy matplotlib seaborn
Split the Dataset into Training and Testing
We’ll split the data into training and testing sets.
# Split the Dataset into Training and Testing
from sklearn.model_selection import train_test_split
X = df.drop('Class', axis=1)
y = df['Class']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
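Because fraudulent transactions are so rare, passing stratify=y matters here: it keeps the fraud rate roughly equal in both splits. An optional sanity check:
# Optional: confirm the stratified split preserved the fraud rate
print(f"Train fraud rate: {y_train.mean():.4%}")
print(f"Test fraud rate: {y_test.mean():.4%}")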
Apply the Model on the Dataset
Now, let’s train a RandomForestClassifier model.
# Apply the Model on the Dataset
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
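As an optional aside, we can peek at which features the trained forest relies on most. This is not required for the rest of the tutorial, just a quick sanity check:
# Optional: inspect the most influential features of the trained forest
importances = pd.Series(model.feature_importances_, index=X_train.columns)
print(importances.sort_values(ascending=False).head(10))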
Generate Confusion Matrix and Performance Metrics
We can evaluate the model’s performance using various metrics.
# Generate Confusion Matrix and Performance Metrics
from sklearn.metrics import confusion_matrix, accuracy_score, f1_score, roc_auc_score, classification_report
import seaborn as sns
import matplotlib.pyplot as plt
y_pred = model.predict(X_test)
conf_matrix = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
# Compute ROC AUC from predicted probabilities rather than hard labels
y_proba = model.predict_proba(X_test)[:, 1]
roc_auc = roc_auc_score(y_test, y_proba)
# Generate and format the classification report
class_report_dict = classification_report(y_test, y_pred, output_dict=True)
class_report_df = pd.DataFrame(class_report_dict).transpose()
# Visualize confusion matrix using seaborn
plt.figure(figsize=(10,7))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=['Non-Fraud', 'Fraud'], yticklabels=['Non-Fraud', 'Fraud'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()
# Visualize the classification report
plt.figure(figsize=(10, 7))
sns.heatmap(class_report_df.iloc[:-1, :].T, annot=True, cmap='Blues', cbar=False, fmt='.2f')
plt.title('Classification Report')
plt.show()
# Other KPIs
print(f'Accuracy: {accuracy}')
print(f'F1 Score: {f1}')
print(f'ROC AUC Score: {roc_auc}')
Generate a Balanced Dataset Reflecting Type 1 and Type 2 Errors
Next, we build a balanced dataset from the original data and deliberately introduce Type 1 and Type 2 errors at rates similar to those observed when evaluating the original model. Before doing so, it is worth understanding why balancing the dataset matters.
The Importance of a Balanced Dataset
1. Addressing Imbalance
- Fraud detection datasets are often highly imbalanced, with a very small percentage of transactions being fraudulent. This imbalance can lead to a model that performs well on the majority class (non-fraudulent transactions) but poorly on the minority class (fraudulent transactions).
- A balanced dataset helps ensure that the model pays equal attention to both classes, improving its ability to detect fraud.
2. Improving Model Performance
- Training on a balanced dataset helps the model learn to distinguish between fraudulent and non-fraudulent transactions more effectively.
- It prevents the model from becoming biased towards the majority class and helps achieve better performance metrics such as precision, recall, and F1 score for the minority class (an alternative that avoids resampling altogether is sketched right after this list).
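As an aside, undersampling is not the only way to handle imbalance: scikit-learn's RandomForestClassifier accepts a class_weight parameter that re-weights the rare class during training. A minimal sketch of that alternative, reusing the earlier train/test split (the weighted_model name is just for illustration and is not used elsewhere in this tutorial):
# Alternative to resampling: re-weight the rare class during training
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
weighted_model = RandomForestClassifier(n_estimators=100, class_weight='balanced', random_state=42)
weighted_model.fit(X_train, y_train)
print(f1_score(y_test, weighted_model.predict(X_test)))  # F1 on the original, imbalanced test set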
As a quick refresher, let us revisit Type 1 and Type 2 errors.
Reflecting on Type 1 and Type 2 Errors
1. Understanding Errors:
- Type 1 Error (False Positive): The model incorrectly predicts a non-fraudulent transaction as fraudulent.
- Type 2 Error (False Negative): The model fails to identify a fraudulent transaction, predicting it as non-fraudulent.
2. Significance of Type 1 and Type 2 Errors:
Impact of each error type:
- Type 1 Error: Causes inconvenience to the user, as legitimate transactions are flagged as fraudulent, potentially leading to declined transactions and the hassle of verifying identity.
- Type 2 Error: More severe, as it allows fraudulent transactions to go undetected, leading to financial losses for individuals and financial institutions.
- Balancing the dataset to reflect these errors helps in understanding how the model performs in real-world scenarios and ensures that both types of errors are adequately addressed.
3. Simulating Real-World Conditions
- Introducing Type 1 and Type 2 errors in a controlled manner within the balanced dataset helps simulate real-world conditions where the model will not always be perfect.
- This approach allows us to evaluate the model’s robustness and ability to handle errors, ensuring it performs well not only on the training data but also on unseen, real-world data.
4. Fine-Tuning the Model:
- By reflecting the errors in the dataset, we can fine-tune the model to minimize them. For instance, adjusting the decision threshold can help find an optimal balance between precision and recall (a minimal sketch follows this list).
- This leads to a more reliable and accurate model, reducing the risk of fraud and improving the user experience by minimizing false positives.
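Here is a minimal sketch of the threshold adjustment mentioned above, reusing the model and test split from earlier; the thresholds are illustrative values, not tuned recommendations:
# Sketch: trade precision for recall by lowering the decision threshold
from sklearn.metrics import precision_score, recall_score
probabilities = model.predict_proba(X_test)[:, 1]
for threshold in (0.5, 0.3, 0.1):  # illustrative thresholds, not tuned values
    preds = (probabilities >= threshold).astype(int)
    print(f"threshold={threshold}: precision={precision_score(y_test, preds):.3f}, recall={recall_score(y_test, preds):.3f}")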
In short, generating a balanced dataset that reflects Type 1 and Type 2 errors keeps the model from being biased towards the majority class, simulates real-world conditions, and gives a more honest picture of how well the system protects individuals and financial institutions from credit card fraud.
# Generate a Balanced Dataset Reflecting Type 1 and Type 2 Errors
import numpy as np
# Get counts of each class in the original dataset
class_counts = df['Class'].value_counts()
fraud_count = class_counts[1]
non_fraud_count = class_counts[0]
# Generate balanced data
fraud_data = df[df['Class'] == 1]
non_fraud_data = df[df['Class'] == 0].sample(fraud_count, random_state=42)
balanced_data = pd.concat([fraud_data, non_fraud_data])
# Introducing Type 1 and Type 2 errors
# Assume similar error rates as found in the original model's performance
# Calculate error rates from the original model's confusion matrix
type1_error_rate = conf_matrix[0, 1] / conf_matrix[0].sum()
type2_error_rate = conf_matrix[1, 0] / conf_matrix[1].sum()
# Convert the rates into label counts; each class in balanced_data has fraud_count rows
num_type1_errors = max(1, int(round(type1_error_rate * fraud_count)))
num_type2_errors = max(1, int(round(type2_error_rate * fraud_count)))
# Introduce errors in the balanced dataset by flipping a sample of labels
balanced_data_errors = balanced_data.copy()
flip_indices_type1 = balanced_data_errors[balanced_data_errors['Class'] == 0].sample(num_type1_errors, random_state=42).index
flip_indices_type2 = balanced_data_errors[balanced_data_errors['Class'] == 1].sample(num_type2_errors, random_state=42).index
balanced_data_errors.loc[flip_indices_type1, 'Class'] = 1 # Introduce Type 1 errors (false positives)
balanced_data_errors.loc[flip_indices_type2, 'Class'] = 0 # Introduce Type 2 errors (false negatives)
# Split the new dataset
X_balanced = balanced_data_errors.drop('Class', axis=1)
y_balanced = balanced_data_errors['Class']
X_balanced_train, X_balanced_test, y_balanced_train, y_balanced_test = train_test_split(X_balanced, y_balanced, test_size=0.2, random_state=42, stratify=y_balanced)
Apply the Model on the Balanced Dataset
Finally, we test the model on the new balanced dataset. Keep in mind that this balanced set is drawn from the same records used earlier, so it overlaps with the original training data and the resulting scores will be optimistic.
# Apply the Model on the Balanced Dataset
y_balanced_pred = model.predict(X_balanced_test)
# Test the model on the balanced dataset
balanced_conf_matrix = confusion_matrix(y_balanced_test, y_balanced_pred)
balanced_accuracy = accuracy_score(y_balanced_test, y_balanced_pred)
balanced_f1 = f1_score(y_balanced_test, y_balanced_pred)
# ROC AUC from predicted probabilities, as before
y_balanced_proba = model.predict_proba(X_balanced_test)[:, 1]
balanced_roc_auc = roc_auc_score(y_balanced_test, y_balanced_proba)
# Generate and format the classification report
balanced_class_report_dict = classification_report(y_balanced_test, y_balanced_pred, output_dict=True)
balanced_class_report_df = pd.DataFrame(balanced_class_report_dict).transpose()
# Visualize confusion matrix using seaborn
plt.figure(figsize=(10,7))
sns.heatmap(balanced_conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=['Non-Fraud', 'Fraud'], yticklabels=['Non-Fraud', 'Fraud'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()
# Visualize the classification report
plt.figure(figsize=(10, 7))
sns.heatmap(balanced_class_report_df.iloc[:-1, :].T, annot=True, cmap='Blues', cbar=False, fmt='.2f')
plt.title('Classification Report')
plt.show()
# Other KPIs
print(f'Balanced Dataset Confusion Matrix:\n{balanced_conf_matrix}')
print(f'Balanced Dataset Accuracy: {balanced_accuracy}')
print(f'Balanced Dataset F1 Score: {balanced_f1}')
print(f'Balanced Dataset ROC AUC Score: {balanced_roc_auc}')
Conclusion
Credit card fraud poses significant challenges, but with the power of machine learning, we can develop models to detect and prevent fraudulent transactions. By following the steps outlined in this blog, you can create a robust fraud detection model and evaluate its performance. While no model is perfect, continuously improving and adapting these models can help mitigate the risks and protect both consumers and financial institutions. Machine learning provides a promising avenue to enhance the security and reliability of credit card transactions, ensuring a safer financial future for everyone.