Protect Your App from Fraud Using Amazon Fraud Detector

Asanka Nissanka
AVM Consulting Blog
6 min read · Aug 10, 2020
Photo by Jefferson Santos on Unsplash

Most internet applications available today support self-signup, where users can sign themselves up and start using the application without any other human involvement. Everything is automated and handled quietly and nicely without much effort.

However, once an application is open to the general public, it is exposed to both legitimate and fraudulent users. The most basic way to distinguish between the two is to add verification steps during signup that verify the authentication parameters provided (such as email). Verification steps may be enough in the early stages of an application, but they soon fall short, since faking a verifiable email address is easy nowadays. There are several disposable email providers, so creating an email address with an inbox takes just a single click (sometimes no clicks at all). And the list of these providers grows every day, making it hard to keep our blacklists up to date to filter email addresses and reject signup requests.
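
To make the limitation concrete, here is a minimal sketch of a static blacklist check. The function name and the two listed domains are only illustrative assumptions, not part of any real filter list.

```javascript
// Hypothetical static blacklist of disposable email domains (illustrative entries only).
const DISPOSABLE_DOMAINS = new Set(["mailinator.com", "guerrillamail.com"]);

// Returns true when the domain of the address appears in the blacklist.
function isBlacklistedEmail(email, blacklist = DISPOSABLE_DOMAINS) {
  const at = email.lastIndexOf("@");
  if (at === -1) return false; // no domain part to check
  const domain = email.slice(at + 1).trim().toLowerCase();
  return blacklist.has(domain);
}

// A known provider is caught, but a provider that appeared yesterday is not.
console.log(isBlacklistedEmail("bot@mailinator.com"));
console.log(isBlacklistedEmail("bot@brand-new-disposable.example"));
```

The check is only as good as the list behind it, which is exactly the maintenance problem described above.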

So we need a way to proactively detect fraudulent users and keep up to date with new threats. This is where machine learning comes in handy. By combining machine learning with our past experience of the users who signed up, we can build a model that detects fraudulent users. This sounds simple, but the implementation gets complex if we build it on our own.

Amazon Fraud Detector takes the hassle out of implementing fraud detection for applications. We can do the model building from the console itself, and integrating it into our application is then just a matter of calling an API endpoint.

How it is used

Let’s see a sample implementation of a fraudulent email ID detector that uses a machine learning model, and how we can use it in our application. Note, however, that using a machine learning model is optional in Amazon Fraud Detector; it also lets us create detectors whose rules evaluate against static data.

So let’s build our fraud detector…

Create S3 bucket

First we have to create a bucket to store our training dataset.

Bucket Creation

Upload the historical email ID list

We can use our list of historical email IDs as the training data. However, note that this dataset should:

  • Have more than 10,000 data rows
  • Include enough fraudulent and legitimate events that each exceeds the minimum count of 500
  • Have header names with no spaces that contain only lowercase a-z characters (underscores are allowed), in addition to the mandatory headers EVENT_TIMESTAMP (the time the event occurred) and EVENT_LABEL (the event classification)
Training Data
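
Before uploading, it can help to check the file against these requirements locally. The following is a rough sketch, assuming the CSV has already been parsed into a header array and an array of row objects; the function name, the option names, and the fraud/legit label values are assumptions for illustration, and the thresholds simply mirror the list above.

```javascript
const MANDATORY_HEADERS = ["EVENT_TIMESTAMP", "EVENT_LABEL"];
const VARIABLE_HEADER_PATTERN = /^[a-z_]+$/; // lowercase a-z and underscores only

// Returns a list of human-readable problems; an empty list means the checks passed.
function validateTrainingData(headers, rows, options = {}) {
  const { minRows = 10000, minPerLabel = 500, labels = ["fraud", "legit"] } = options;
  const problems = [];

  for (const name of MANDATORY_HEADERS) {
    if (!headers.includes(name)) problems.push(`missing mandatory header: ${name}`);
  }
  for (const name of headers) {
    if (!MANDATORY_HEADERS.includes(name) && !VARIABLE_HEADER_PATTERN.test(name)) {
      problems.push(`invalid header name: ${name}`);
    }
  }
  if (rows.length <= minRows) {
    problems.push(`need more than ${minRows} rows, got ${rows.length}`);
  }
  for (const label of labels) {
    const count = rows.filter((row) => row.EVENT_LABEL === label).length;
    if (count <= minPerLabel) {
      problems.push(`need more than ${minPerLabel} "${label}" events, got ${count}`);
    }
  }
  return problems;
}

// Tiny sample that fails on purpose: bad header name and far too few rows.
const sampleProblems = validateTrainingData(
  ["Email Address", "EVENT_TIMESTAMP", "EVENT_LABEL"],
  [{ EVENT_LABEL: "fraud" }, { EVENT_LABEL: "legit" }]
);
console.log(sampleProblems);
```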

Create Event Type

  1. Give your event a name. In our case, we can name it user_registration. (An event is a business activity that is evaluated for fraud risk.)
  2. Create an entity. (An entity represents who is performing the event; in our case, it’s the user.)
Entity Creation

3. Choose how you are defining the Event Variables.

Since we are using a training dataset, we can choose the "Select variables from a training dataset" option.

4. Create an IAM role and upload the training dataset

The IAM role is used to access the training data in the S3 bucket we created above.

We can create a new role right here, just by providing the name of the S3 bucket.

IAM Role Creation

Next, provide the S3 location of the training data file and click Upload to get the headers of the CSV file.

Then map each variable to a type from the list.

Variable Mapping

5. Create labels to categorize the event. (A label classifies the event as fraudulent or legitimate.)

In this case, we need two labels: legit and fraud.

Legit Label Creation

Repeat the above step to create the fraud label as well.
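
The same resources can also be created through the Fraud Detector API instead of the console. The sketch below only builds the request parameters for the PutEntityType, PutLabel, and PutEventType operations. The entity name user and the variable name email_address are assumptions (the article only shows them in screenshots), and the variables are assumed to exist already from the mapping step.

```javascript
// Builds request parameters for the event type and the resources it depends on.
// Entity types, labels, and variables must exist before the event type is created.
function buildEventTypeRequests(eventTypeName, entityTypeName, variableNames, labelNames) {
  return {
    putEntityType: { name: entityTypeName },
    putLabel: labelNames.map((name) => ({ name })),
    putEventType: {
      name: eventTypeName,
      eventVariables: variableNames,
      labels: labelNames,
      entityTypes: [entityTypeName],
    },
  };
}

const eventTypeRequests = buildEventTypeRequests(
  "user_registration",  // event name used in this article
  "user",               // assumed entity type name
  ["email_address"],    // assumed variable name
  ["legit", "fraud"]    // labels created above
);
console.log(JSON.stringify(eventTypeRequests, null, 2));

// With the AWS SDK for JavaScript v2 installed, each request could be sent like this:
// const AWS = require("aws-sdk");
// const client = new AWS.FraudDetector();
// await client.putEntityType(eventTypeRequests.putEntityType).promise();
// for (const label of eventTypeRequests.putLabel) await client.putLabel(label).promise();
// await client.putEventType(eventTypeRequests.putEventType).promise();
```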

Create the Model

A model can be thought of as a formula that takes some input parameters and returns a computed result, which can be used in an expression for decision making. Here we also have to select a model type, which defines the algorithms, enrichments, and feature transformations used during model training.

Model Details

Next we have to define the labels that are used for classification.

Model Inputs
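
For reference, the model inputs and label mapping chosen in the console correspond roughly to the parameters of the CreateModelVersion API operation. This is a sketch under assumptions: the model ID is inferred from the score variable name used in the rules later on, ONLINE_FRAUD_INSIGHTS is assumed as the model type, and the variable name, S3 path, and role ARN are placeholders.

```javascript
// Builds CreateModelVersion parameters for training on a CSV file stored in S3.
function buildModelVersionRequest({ modelId, variables, fraudLabels, legitLabels, dataLocation, roleArn }) {
  return {
    modelId,
    modelType: "ONLINE_FRAUD_INSIGHTS",     // assumed model type
    trainingDataSource: "EXTERNAL_EVENTS",  // training data lives outside Fraud Detector
    trainingDataSchema: {
      modelVariables: variables,
      // Maps our own label names onto the two classes the model understands.
      labelSchema: { labelMapper: { FRAUD: fraudLabels, LEGIT: legitLabels } },
    },
    externalEventsDetail: { dataLocation, dataAccessRoleArn: roleArn },
  };
}

const modelVersionRequest = buildModelVersionRequest({
  modelId: "user_registration_fraud_model",
  variables: ["email_address"],                             // assumed variable name
  fraudLabels: ["fraud"],
  legitLabels: ["legit"],
  dataLocation: "s3://example-bucket/training-data.csv",    // placeholder path
  roleArn: "arn:aws:iam::123456789012:role/example-role",   // placeholder ARN
});
console.log(JSON.stringify(modelVersionRequest, null, 2));
```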

Once we finish the above, model training begins…

Model Training

Once the training is completed, click on the version to view the performance of the trained model.

Model Performance

Then click Actions -> Deploy model version to deploy the trained model. The deployment takes a few minutes to complete.

Model Deployment
Model Deployment In Progress

Create the Detector

A detector contains the detection logic, defined using models and rules, that evaluates whether users are fraudulent or legitimate.

Detector Details

Then add the model version we deployed.

Detector Model

Next, we have to create a rule with one or more outcomes. A rule is a condition that tells the detector how to interpret variable values during a prediction, and an outcome is the result returned when the rule matches. We can use the variables we created as well as the variables returned from the model.

In our case, we use the variable returned from the model to define the condition and create a new outcome named high_risk_outcome.

Detector Rule for High Risk

Follow the same steps to create another rule named low_fraud_risk, with an outcome named low_risk_outcome and the following expression.

$user_registration_fraud_model_insightscore <= 500
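
If we wanted to create this rule through the API rather than the console, the parameters would look roughly like the sketch below, which targets the PutOutcome and CreateRule operations. The detector ID is a hypothetical name, since the article only shows it in a screenshot.

```javascript
// Builds the requests for one rule and its outcome.
// The outcome has to exist (PutOutcome) before a rule can reference it (CreateRule).
function buildRuleRequests(detectorId, ruleId, expression, outcomeName) {
  return {
    putOutcome: { name: outcomeName },
    createRule: {
      detectorId,
      ruleId,
      expression,
      language: "DETECTORPL", // the rule language used by Fraud Detector expressions
      outcomes: [outcomeName],
    },
  };
}

const lowRiskRule = buildRuleRequests(
  "user_registration_detector", // hypothetical detector ID
  "low_fraud_risk",
  "$user_registration_fraud_model_insightscore <= 500",
  "low_risk_outcome"
);
console.log(JSON.stringify(lowRiskRule, null, 2));
```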

Next select the rule execution mode as below.

Detector Rule Execution Mode
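
Fraud Detector offers two rule execution modes: FIRST_MATCHED returns the outcomes of the first rule that matches, in rule order, while ALL_MATCHED returns the outcomes of every matching rule. The following is a local emulation of that behavior, not the service itself. The high-risk rule ID and its threshold are assumptions (taken as the complement of the low-risk rule), because the original expression is only shown in a screenshot.

```javascript
// Local stand-ins for the detector rules, evaluated against the model's insight score.
const detectorRules = [
  { ruleId: "high_fraud_risk", outcomes: ["high_risk_outcome"], matches: (score) => score > 500 }, // assumed
  { ruleId: "low_fraud_risk", outcomes: ["low_risk_outcome"], matches: (score) => score <= 500 },
];

// Emulates the two execution modes over an ordered list of rules.
function evaluateRules(rules, score, mode = "FIRST_MATCHED") {
  const matched = rules.filter((rule) => rule.matches(score));
  const selected = mode === "FIRST_MATCHED" ? matched.slice(0, 1) : matched;
  return selected.flatMap((rule) => rule.outcomes);
}

console.log(evaluateRules(detectorRules, 950)); // only the high-risk rule matches
console.log(evaluateRules(detectorRules, 120)); // only the low-risk rule matches
```

Because the two rules here do not overlap, both modes give the same answer; the mode only matters once several rules can match the same score.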

Integrate the Detector

Finally, let’s use our fraud detector in our application. Below is how we can use it within a Node.js application, using the getEventPrediction function of the FraudDetector service in the SDK.

Use Fraud Detector in NodeJS Application
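
A minimal sketch of such an integration could look like the following. It assumes the AWS SDK for JavaScript v2 client (AWS.FraudDetector); the detector ID, the entity type, and the email_address variable name are hypothetical placeholders, so substitute the names used when building the detector. The client is passed in as a parameter so the request-building logic can be tried without AWS credentials.

```javascript
// Builds the GetEventPrediction request for a single signup attempt.
function buildPredictionRequest(signup, config) {
  return {
    detectorId: config.detectorId,
    eventId: signup.eventId,                         // unique ID for this signup event
    eventTypeName: "user_registration",
    eventTimestamp: signup.timestamp.toISOString(),  // ISO 8601 timestamp in UTC
    entities: [{ entityType: config.entityType, entityId: signup.userId }],
    eventVariables: { email_address: signup.email }, // assumed variable name
  };
}

// Flattens the outcomes of all rules that matched in the prediction response.
function collectOutcomes(prediction) {
  return (prediction.ruleResults || []).flatMap((result) => result.outcomes || []);
}

// Resolves to true when the detector returned the high-risk outcome for this signup.
async function isHighRiskSignup(client, signup, config) {
  const request = buildPredictionRequest(signup, config);
  const prediction = await client.getEventPrediction(request).promise();
  return collectOutcomes(prediction).includes("high_risk_outcome");
}

// Example wiring, left commented out because it needs the SDK and AWS credentials:
// const AWS = require("aws-sdk");
// const client = new AWS.FraudDetector({ region: "us-east-1" });
// const config = { detectorId: "user_registration_detector", entityType: "user" };
// const signup = { eventId: "signup-0001", userId: "user-0001",
//                  email: "someone@example.com", timestamp: new Date() };
// isHighRiskSignup(client, signup, config).then((risky) => console.log({ risky }));
```

In a signup handler, a true result could be used to reject the request or to route it to an extra verification step.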

We now have a machine learning-based fraud detector for our application. Super cool, huh! 😎 AWS has brought machine learning closer to developers, so that even without much machine learning knowledge we can use it in our applications. I must admit that this type of intelligence used to be a nightmare for regular web and mobile applications, since deploying a service like this would cost more than running the application itself.
