Credit Card Fraud Detection using Keras and R

Dimitris Sykas
Sep 13, 2019

Nowadays almost everyone uses credit and debit cards to make purchases in physical and online shops. Billions of euros, dollars, yen, etc. are transferred every day to buy goods and services.
So there is no need to explain why detecting fraudulent transactions is of paramount importance for the transaction providers, the customers and the sellers.
A fraudulent transaction is defined as follows:

A fraudulent transaction is one unauthorized by the credit card holder. Such transactions are categorized as lost, stolen, not received, issued on a fraudulent application, counterfeit, fraudulent processing of transactions, account takeover or other fraudulent conditions as defined by the card company or the member company.

In this article I’m going to walk you through building and validating a model for detecting fraudulent credit card transactions. The open dataset that I used is from the “Machine Learning Group - ULB” and is hosted on Kaggle.

The Dataset

The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
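To get a feel for how unbalanced this is, the two counts quoted above can be turned into shares. This is only a small arithmetic sketch; the variable names are mine, not part of the dataset.

```r
# Counts quoted in the dataset description above.
n_total <- 284807
n_fraud <- 492

fraud_share <- n_fraud / n_total   # proportion of the positive class
legit_share <- 1 - fraud_share     # proportion of the negative class

# A "model" that always predicts non-fraudulent would reach this accuracy,
# which is why accuracy alone says little on data this unbalanced.
baseline_accuracy <- legit_share

cat(sprintf("fraud share:       %.4f%%\n", 100 * fraud_share))
cat(sprintf("always-0 accuracy: %.4f%%\n", 100 * baseline_accuracy))
```

Keep the baseline figure in mind later on: any model we train has to be judged against it, not against 50%.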

It contains only numeric input variables which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features and more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are ‘Time’ and ‘Amount’. Feature ‘Time’ contains the seconds elapsed between each transaction and the first transaction in the dataset. The feature ‘Amount’ is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature ‘Class’ is the response variable and it takes value 1 in case of fraud and 0 otherwise.

Exploring the dataset

Now let’s load the CSV file using the read.csv function in R.
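A minimal sketch of the loading step follows. So that it runs without the real download, it first writes a tiny stand-in file with made-up values and only a few of the columns; with the real data you would simply point read.csv at your local copy (the file name and location are up to you).

```r
# Write a tiny stand-in CSV with made-up values. The real file has the
# columns Time, V1..V28, Amount and Class; only V1 and V2 are kept here.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Time,V1,V2,Amount,Class",
             "0,-1.1,0.3,120.50,0",
             "1,0.9,-0.2,15.00,0",
             "2,-2.4,1.8,0.99,1"), tmp)

# With the real data, replace `tmp` by the path to your downloaded file.
creditcard <- read.csv(tmp)

str(creditcard)           # column types: everything should be numeric
table(creditcard$Class)   # number of rows per class
```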

Exploring the dataset with ggplot_shiny(creditcard)

As you can see, it is quite hard to distinguish fraudulent from non-fraudulent transactions by eye. In general, it is always a very good idea to explore and understand your data before applying any fancy and complex classification or regression algorithm! A very useful tool for this is the ggplotgui package.

Getting Started with Keras in R

Most probably (since you are reading this article) you know that Keras is a high-level neural network API. A key feature of Keras is that it is quick to learn and also fast to run!

So, in RStudio just copy-paste the following code to install Keras and its interface with R
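The usual setup, assuming the CRAN `keras` package, looks roughly like this. It is a one-time step and needs an internet connection.

```r
# One-time setup, assuming the CRAN `keras` package.
install.packages("keras")

library(keras)

# Installs the TensorFlow backend that Keras runs on.
install_keras()
```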

From the term “fraudulent transaction detection” we understand that transactions can be fraudulent or not. Do you hear any bell ringing (metaphorically)? Hope so, because this is a typical binary classification scenario. You have a transaction and you want to classify it either as fraudulent (1) or non-fraudulent (0).

Initially, let’s split the dataset into two parts:
1. Training
2. Test

The first set of data will be used to train the prediction model, while the second set of data will be used to test the result of the prediction model.
These datasets do not have any intersection, i.e. the training data are only used in the training process, while the test data are only used in the testing process.
So let’s do it in R!
We are using the caret package and its “createDataPartition” function. Using the “Class” column from our data, we create the indexes to subset (or better, split) the initial dataset into the two parts.

We’ve created an index that assigns 70% of the data to the training set and the other 30% to the test set.
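The split can be sketched as follows. The data frame here is a synthetic stand-in (1000 rows, 2% positives) so the snippet runs on its own, and the seed is arbitrary; passing the class as a factor is my choice, made so that caret samples within each class.

```r
library(caret)

set.seed(42)  # arbitrary seed so the split is reproducible

# Synthetic stand-in for the real data: 1000 rows, 2% positives.
toy <- data.frame(
  V1     = rnorm(1000),
  Amount = runif(1000, 0, 500),
  Class  = c(rep(0, 980), rep(1, 20))
)

# With a factor outcome the sampling is done within each class, so both
# parts keep roughly the same share of frauds.
train_idx <- createDataPartition(factor(toy$Class), p = 0.7, list = FALSE)

train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

c(train = nrow(train), test = nrow(test))
c(train = mean(train$Class), test = mean(test$Class))  # fraud share per part
```

On data this unbalanced, a plain random split could leave very few frauds in the test set, which is the reason for splitting on the class column.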

The next step is to prepare these two parts for Keras. As you can see in the code below, we also scale the data. This is very important, because it helps the neural network numerically to calculate weights that will fit our data better.
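One way to do the scaling is with base R’s scale function. The sketch below uses synthetic matrices in place of the real predictors; re-using the training means and standard deviations for the test set is an assumption of this sketch, not necessarily what the original code did.

```r
set.seed(1)

# Synthetic predictors standing in for the training and test parts.
x_train <- matrix(rnorm(70 * 3, mean = 50, sd = 10), ncol = 3)
x_test  <- matrix(rnorm(30 * 3, mean = 50, sd = 10), ncol = 3)

# Standardise each training column to mean 0 and standard deviation 1.
x_train_scaled <- scale(x_train)

# Re-use the training means and sds for the test set (an assumption of
# this sketch), so nothing about the test data leaks into training.
x_test_scaled <- scale(
  x_test,
  center = attr(x_train_scaled, "scaled:center"),
  scale  = attr(x_train_scaled, "scaled:scale")
)

round(colMeans(x_train_scaled), 10)       # column means after scaling
round(apply(x_train_scaled, 2, sd), 10)   # column sds after scaling
```

scale returns a numeric matrix, which is also the input type Keras expects for the predictors.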

Define the Keras model

For this purpose we are using the “keras_model_sequential()” function to initialise the model. Then we are defining dense layers using the relu activation function. In between the dense layers we also add drop-out layers.
This is done in order to avoid overfitting.
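A model of this shape might look like the sketch below. The layer sizes and drop-out rates are illustrative assumptions, not the values from the original code, and the `input_shape` argument assumes the `keras` R package.

```r
library(keras)

# Time, V1..V28 and Amount; adjust if you drop any columns.
n_features <- 30

model <- keras_model_sequential() %>%
  layer_dense(units = 32, activation = "relu", input_shape = c(n_features)) %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dropout(rate = 0.2) %>%
  # Single sigmoid unit: the output is the predicted probability of fraud.
  layer_dense(units = 1, activation = "sigmoid")

summary(model)  # prints the layers, output shapes and parameter counts
```

The drop-out layers randomly switch off a share of the units during training only; at prediction time all units are used.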

Compile the model

Now we compile the model using the “binary_crossentropy” loss function, since we have a binary classification problem. We use the adam optimiser for gradient descent and accuracy as the metric. Finally, we fit our model to the training data. The model will run for 100 epochs with a batch size of 5 and a 30% validation split.

Now let’s see all these in R code:
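Here is a minimal end-to-end sketch of the compile and fit steps. To keep it quick and self-contained it uses synthetic data, a deliberately tiny network and only 10 epochs; the loss, optimiser, metric, batch size and validation split follow the description above, and everything else is an assumption.

```r
library(keras)

set.seed(7)

# Synthetic stand-in data: 200 rows, 4 already-scaled predictors.
x <- matrix(rnorm(200 * 4), ncol = 4)
y <- as.numeric(x[, 1] + rnorm(200, sd = 0.5) > 1)

# Deliberately small network so the sketch trains in seconds.
model <- keras_model_sequential() %>%
  layer_dense(units = 8, activation = "relu", input_shape = c(4)) %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 1, activation = "sigmoid")

model %>% compile(
  loss      = "binary_crossentropy",
  optimizer = "adam",
  metrics   = c("accuracy")
)

history <- model %>% fit(
  x, y,
  epochs           = 10,    # the article trains for 100
  batch_size       = 5,
  validation_split = 0.3,   # last 30% of the rows are held out
  verbose          = 0
)

plot(history)  # training-history graph: loss and accuracy per epoch
```

Note that compile modifies the model in place, so there is no need to assign its result.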

And now let’s run the training!

You will notice that during the training you get a history graph that shows the progress of the training.

loss and acc are the value of the loss function and the accuracy on your training data, while val_loss and val_acc are the same two quantities computed on the validation data, i.e. the 30% of the training set held out by the validation split.

Now let’s check out the summary of our model!

Cool? Now it’s time for the truth, let’s evaluate our model with the test dataset
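The evaluation boils down to turning predicted probabilities into class labels and cross-tabulating them against the true labels. The sketch below uses six made-up probabilities; with a fitted Keras model they would come from calling predict on the scaled test predictors. The 0.5 cut-off is an assumption.

```r
# Hypothetical predicted probabilities and true labels for six test rows.
prob   <- c(0.02, 0.10, 0.91, 0.40, 0.75, 0.01)
actual <- c(0,    0,    1,    1,    0,    0)

# Turn probabilities into class labels with a 0.5 cut-off (an assumption;
# a lower cut-off catches more fraud at the cost of more false alarms).
predicted <- as.integer(prob > 0.5)

# Rows are the predicted class, columns the actual class.
confusion <- table(Predicted = predicted, Actual = actual)
print(confusion)
```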

From the resulting table we see that only 9 of the 85,297 non-fraudulent test transactions were wrongly classified as fraudulent (0.01%), while 36 of the 145 fraudulent transactions were classified as non-fraudulent (25%)! The remaining 85,288 non-fraudulent and 109 fraudulent transactions were classified correctly.
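These rates can be recomputed from the four counts 85,288, 9, 36 and 109, reading them as the cells of a 2x2 confusion matrix. That layout is my assumption, since the table itself is not reproduced here; it is supported by the fact that the cells add up to the class totals of 85,297 and 145 quoted in the next paragraph. The rates below divide by the actual class totals.

```r
# Counts from the evaluation, read as the cells of a 2x2 confusion matrix
# (this layout is an assumption of the sketch).
tn <- 85288  # non-fraudulent, predicted non-fraudulent
fp <- 9      # non-fraudulent, predicted fraudulent
fn <- 36     # fraudulent, predicted non-fraudulent
tp <- 109    # fraudulent, predicted fraudulent

false_positive_rate <- fp / (fp + tn)  # legitimate transactions flagged
false_negative_rate <- fn / (fn + tp)  # frauds that were missed
recall              <- tp / (tp + fn)  # frauds that were caught
precision           <- tp / (tp + fp)  # flagged transactions that are fraud
accuracy            <- (tp + tn) / (tp + tn + fp + fn)

round(100 * c(fpr = false_positive_rate, fnr = false_negative_rate,
              recall = recall, precision = precision,
              accuracy = accuracy), 2)
```

Overall accuracy stays above 99.9% even though roughly one fraud in four is missed, which is why recall and precision on the fraud class are more informative here than accuracy.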

This kind of imbalance in the errors makes sense, taking into account how imbalanced the dataset is (85,297 non-fraudulent vs 145 fraudulent transactions in the test set) and the way we split the dataset. In order to improve our accuracy on fraudulent transactions, we need to sample more data classified as fraudulent and use less data classified as non-fraudulent. Also, fine-tuning the parameters and structure of the neural network should increase the accuracy.
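One simple way to use less non-fraudulent data is to undersample the majority class in the training set. The sketch below does this in base R on synthetic data; the 5:1 ratio and the seed are arbitrary choices, and this is one possible resampling scheme rather than the one used for the results above.

```r
set.seed(123)

# Synthetic training set: 990 legitimate rows and 10 frauds.
train <- data.frame(
  Amount = runif(1000, 0, 500),
  Class  = c(rep(0, 990), rep(1, 10))
)

fraud_rows <- which(train$Class == 1)
legit_rows <- which(train$Class == 0)

# Keep every fraud and a random sample of the legitimate rows.
# The 5:1 ratio is an arbitrary choice for this sketch.
ratio      <- 5
keep_legit <- sample(legit_rows, size = ratio * length(fraud_rows))

balanced <- train[c(fraud_rows, keep_legit), ]
balanced <- balanced[sample(nrow(balanced)), ]  # shuffle the row order

table(balanced$Class)
```

Only the training set should be resampled; the test set keeps its original class balance so the evaluation still reflects real traffic. The shuffle matters because the Keras validation split takes the last rows of the data as they are passed in.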

And here you can download all the code that I used for this project

And here you can find and fork the Notebook in kaggle



