Credit Card Fraud Detection With Machine Learning

Fabio Rodrigues
Published in Analytics Vidhya · 7 min read · May 21, 2021

Despite new payment formats, credit cards still account for the major share of money transactions. With the growth of online shopping, credit card transactions have become more frequent. While this brings practicality and speed to payments, it also attracts criminal attention. Credit card fraud can be summarized as the improper use of credit or debit card numbers to make fraudulent transactions and obtain money and goods, with the numbers being captured through unprotected websites, tampered card machines, and numbers shared on social media apps.

To avoid losses and protect their clients, credit card companies have invested in technologies to prevent and reduce this kind of occurrence. Machine learning models have proven to be a powerful tool for fraud detection, since they use large quantities of data collected in the past to verify and classify new transactions in real time.

In this article, we will implement a machine learning model using Python, pandas, and scikit-learn to take a transaction database and train the model to classify new transactions as fraudulent or not.

About the Dataset

For the analysis, we will use a database of two days of bank transactions made in September 2013 by European cardholders. The dataset, available for download on Kaggle, contains about 285 thousand transactions, of which 492 have been classified as fraud. Due to confidentiality issues, the customers' personal data have been anonymized: the variables were renamed V1 through V28 and transformed using Principal Component Analysis (PCA), which in this case was used to reduce the dimensionality of the numerical values. The only variables left unchanged are described below:

  • Time - Seconds elapsed between each transaction and the first transaction in the dataset;
  • Amount - Total transaction amount;
  • Class - Label given to each transaction, where 0 represents a normal transaction and 1 refers to a fraudulent transaction.
Libraries used
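The imports appear as an image in the original post; below is a minimal sketch of the libraries used throughout this article, assuming the conventional aliases:

# data handling, numerical, and visualization libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns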

Exploratory Analysis

In this section, we'll perform a preliminary analysis of the data to check the variables in the dataset, null values, outliers, and the histograms of legal and illegal transactions.
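The dataset can be loaded roughly as follows (a minimal sketch; the file name creditcard.csv is an assumption based on the standard Kaggle download):

# loading the transactions into a pandas DataFrame
df = pd.read_csv('creditcard.csv')
# checking the first rows of the dataset
df.head()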

Data Frame Header

Checking for null values:

print('Total of Null Values:', df.isnull().sum().max())
Total of Null Values: 0

Balance between legal and illegal transactions:

fig, ax = plt.subplots(figsize=(5,5))
sns.countplot(x='Class', data=df)
print(pd.Series(df.Class).value_counts())
plt.show()
0 284315
1 492
Legal and Illegal Transactions, Image by Author

There’s a considerable discrepancy between the legal [0] transactions and illegal [1] occurrences. Due to this, the dataset will be balanced before the setup of the machine learning model.

Checking Outliers in ‘Amount’ values:

fig, ax = plt.subplots(figsize=(15, 3))
sns.boxplot(x='Amount', data=df)
plt.show()
Boxplot for ‘Amount’ values, Image by Author

As we can see in the previous image, there are many outliers in the Amount column. We will clean these values and redo the plot to check the result.

# checking outlier values 
q1_amount = df.Amount.quantile(.25)
q3_amount = df.Amount.quantile(.75)
IQR_amount = q3_amount - q1_amount
print('IQR: ', IQR_amount)
# defining limits
sup_amount = q3_amount + 1.5 * IQR_amount
inf_amount = q1_amount - 1.5 * IQR_amount
print('Upper limit: ', sup_amount)
print('Lower limit: ', inf_amount)
IQR: 71.565
Upper limit: 184.5125
Lower limit: -101.7475

After defining the limits for the ‘Amount’ column, we will clean the outliers and redo the boxplot.

# cleaning the outliers in `Amount` values
df_clean = df.copy()
df_clean.drop(df_clean[df_clean.Amount>184.49].index, axis=0, inplace=True)
# new boxplot for `Amount` values
fig, ax = plt.subplots(figsize=(15, 3))
sns.boxplot(x='Amount', data=df_clean)
plt.show()
New boxplot for ‘Amount’ values, Image by Author

After cleaning the outliers from the dataset, the values in the Amount column are less discrepant and the boxplot is easier to read.

Transactions Histograms

# legal transactions by time
fig, ax = plt.subplots(figsize=(10, 5))
sns.histplot(x=(df_clean.Time[df_clean.Class==0]), bins=50);
ax.set_title('Legal Transactions by Time')
plt.show()
Legal Transactions by Time, Image by Author
# legal transactions by amount
fig, ax = plt.subplots(figsize=(10, 5))
sns.histplot(x=(df_clean.Amount[df_clean.Class==0]), bins=50)
ax.set_title('Legal Transactions by Amount')
plt.show()
Legal Transactions by Amount, Image by Author
# illegal transactions by time
fig, ax = plt.subplots(figsize=(10, 5))
sns.histplot(x=(df_clean.Time[df_clean.Class==1]), bins=50)
ax.set_title('Illegal Transactions by Time')
plt.show()
Illegal Transactions by Time, Image by Author
# illegal transactions by amount
fig, ax = plt.subplots(figsize=(10, 5))
sns.histplot(x=(df_clean.Amount[df_clean.Class==1]), bins=50)
ax.set_title('Illegal Transactions by Amount')
plt.show()
Illegal Transactions by Amount, Image by Author

Pre-Processing Data

Before we create the machine learning model, we will need to make some adjustments to the dataset values. The steps are listed below:

  • Standardize the Time and Amount values;
  • Divide the dataset into train and test;
  • Balance the values.
# importing modules for model selection, preprocessing, metrics, and resampling
import scikitplot as skplt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score
from imblearn.under_sampling import RandomUnderSampler

Standardization Process

The standardization process consists in transforming the data by removing the mean and scaling to unit variance, putting the Time and Amount columns on a scale comparable to the other features.

# creating a copy of the original dataset
df_new = df_clean.copy()
# standardize data
scaler = StandardScaler()
df_new['Amount'] = scaler.fit_transform(df_new.Amount.values.reshape(-1, 1))
df_new['Time'] = scaler.fit_transform(df_new.Time.values.reshape(-1, 1))
# check the standardized data
df_new.head()
Dataset after the standardization process, Image by Author

Dividing the dataset into train and test

The train/test split is an important step in the machine learning process, where the dataset is divided into two sets: a training set and a testing set. Usually, the major part of the dataset is reserved for training and a smaller part for testing. The training data will be used to create the machine learning model, and the test data will be used to check the accuracy of the model.

# train and test data
X = df_new.drop('Class', axis=1)
y = df_new['Class']
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=True, stratify=y)

Balancing the dataset

As we saw in the exploratory analysis, there is a large discrepancy between the number of legal transactions and frauds. To feed the machine learning model without bias, we will balance the dataset so that the two classes have the same number of samples.

# instantiating the random undersampler
rus = RandomUnderSampler()
# resampling X, y
X_rus, y_rus = rus.fit_resample(X_train, y_train)
# new class distribution
print(pd.Series(y_rus).value_counts())
sns.countplot(x=y_rus)
plt.show()
1 301
0 301
‘Class’ column values after balancing, Image by Author

Now, with the balanced dataset, we can proceed with the setup of the machine learning model.

Machine Learning Model

After pre-processing the data, we can create the machine learning model. Since we are dealing with a binary classification problem, we will use logistic regression to classify the transactions as legal or fraud.

# setup the machine learning model
np.random.seed(2)
model = LogisticRegression(C=0.01)
model.fit(X_rus, y_rus)

With the machine learning model ready, we can check the model predictions using test data.

# model predictions:
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)

Model Performance

In the report below, we have some metrics to check the model performance. Let's briefly explain how to understand those values and evaluate the machine learning model. Before presenting the formulas, we will define some terms and what they represent.

  • TN — True Negative: when a case was negative and predicted negative;
  • TP — True Positive: when a case was positive and predicted positive;
  • FN — False Negative: when a case was positive but predicted negative;
  • FP — False Positive: when a case was negative but predicted positive.
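
For reference, these four counts can be read directly from scikit-learn's confusion matrix (a small sketch, not part of the original code):

from sklearn.metrics import confusion_matrix
# for binary labels, ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print('TN:', tn, 'FP:', fp, 'FN:', fn, 'TP:', tp)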

Accuracy — What percentage of all predictions did the model get right?

Precision — Of the transactions predicted as fraud, what percentage were actually fraud?

Recall — Of the actual fraud cases, what percentage did the model catch?

F1-Score — The harmonic mean of precision and recall, summarizing both in a single number.
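
In terms of the counts defined above, the standard formulas behind these metrics are:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)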

# classification report
skplt.metrics.plot_confusion_matrix(y_test, y_pred, normalize=True, text_fontsize='small', title_fontsize='medium', cmap='Blues');
print("Classification Report:\n", classification_report(y_test, y_pred, digits=3))
Classification Report:
               precision    recall  f1-score   support

           0       1.000     0.990     0.995     63123
           1       0.130     0.920     0.228       100

    accuracy                           0.990     63223
   macro avg       0.565     0.955     0.611     63223
weighted avg       0.998     0.990     0.994     63223
Normalized confusion matrix, Image by Author

Conclusion

The use of machine learning models in credit card fraud detection proves to be a powerful tool, since the models receive huge quantities of new data every day. Although we have reached good results with this model, it is important to test it on new databases in order to observe and improve its performance.

Thanks For Reading!

Thanks for reading. Send me your thoughts and ideas. You can write just to say hello. And if you really need to tell me how I got it wrong, I look forward to chatting soon. The complete article can be accessed by the following link

LinkedIn: Fábio Rodrigues |Github: fabiodotcom
