Protect Your App from Fraud Using Amazon Fraud Detector

Published in AVM Consulting Blog, Aug 10, 2020

Most internet applications available today support self signup, where users can sign themselves up and start using the application without any other human involvement. Everything is automated and handled quietly and nicely without much effort.

However, once an application is open to the general public, it is exposed to both legitimate and fraudulent users. The most basic way to distinguish between the two is to add verification steps during signup that check the parameters provided (such as the email id). Verification steps may be enough in the early stages of an application, but they soon stop being sufficient, since faking an email id that can be verified is easy nowadays. There are several disposable email id providers, so creating an email id with an inbox takes a single click (sometimes no clicks at all). And the list of these providers grows every day, making it hard to keep our blacklists up to date in order to filter email ids and reject signup requests.

So we need a way to proactively detect fraudulent users and keep up to date with new threats. This is where machine learning comes in handy. By combining machine learning with our historical data about the users who signed up, we can build a model that detects fraudulent users. This sounds simple, but the implementation is complex if we build it on our own.

Amazon Fraud Detector takes the hassle out of implementing fraud detection mechanisms for applications. We can do the model-building part from the console itself, and integrating it into our application is then just a matter of calling an API endpoint.

Let’s see a sample implementation of a fraudulent email id detector that uses a machine learning model, and how we can use it in our application. However, note that using a machine learning model is optional in Amazon Fraud Detector; it also allows us to create detectors with rules that evaluate against static data.

So let’s build our fraud detector…

Create S3 bucket

First we have to create a bucket to store our training dataset.

Upload historical email ids list

We can use our list of historical email ids as the training data. However, note that this dataset should:

Have more than 10,000 data rows
Have at least 500 fraudulent events and at least 500 legitimate events
Have header names that contain no spaces and only lower case a-z characters (underscores are allowed), except for the two mandatory headers EVENT_TIMESTAMP (the time the event occurred) and EVENT_LABEL (the event classification)
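Before uploading, it can help to check the file locally against the requirements listed above. The following is a minimal sketch in plain NodeJS, not part of Amazon Fraud Detector. The function name validateTrainingCsv is hypothetical, the parser assumes a simple CSV without quoted commas, and the label values fraud and legit are assumed to match the labels created later in this walkthrough.

```javascript
// Hypothetical local pre-check for the training CSV.
// Assumptions: no quoted commas in the file; labels are "fraud" and "legit".
const MANDATORY_HEADERS = ['EVENT_TIMESTAMP', 'EVENT_LABEL'];

function validateTrainingCsv(csvText, { minRows = 10000, minPerLabel = 500 } = {}) {
  const lines = csvText.split(/\r?\n/).filter((line) => line.trim() !== '');
  if (lines.length === 0) return ['file is empty'];

  const problems = [];
  const headers = lines[0].split(',').map((h) => h.trim());
  const rows = lines.slice(1);

  // Both mandatory columns must be present.
  for (const name of MANDATORY_HEADERS) {
    if (!headers.includes(name)) problems.push(`missing mandatory header ${name}`);
  }

  // Every other header: lower case a-z and underscores only, no spaces.
  for (const h of headers) {
    if (!MANDATORY_HEADERS.includes(h) && !/^[a-z_]+$/.test(h)) {
      problems.push(`invalid header name "${h}"`);
    }
  }

  // The dataset needs more than minRows data rows.
  if (rows.length <= minRows) {
    problems.push(`need more than ${minRows} rows, found ${rows.length}`);
  }

  // Count events per label and compare against the per-label minimum.
  const labelIndex = headers.indexOf('EVENT_LABEL');
  if (labelIndex !== -1) {
    const counts = {};
    for (const row of rows) {
      const label = (row.split(',')[labelIndex] || '').trim();
      counts[label] = (counts[label] || 0) + 1;
    }
    for (const label of ['fraud', 'legit']) {
      const found = counts[label] || 0;
      if (found < minPerLabel) {
        problems.push(`need at least ${minPerLabel} "${label}" events, found ${found}`);
      }
    }
  }
  return problems;
}
```

An empty array means the file passed these checks. Reading the file with fs.readFileSync(path, 'utf8') and passing the text to the function is enough for a quick check before the upload.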

Create Event Type

1. Give a name for your Event. In our case we can name it user_registration. (An event is a business activity that is evaluated for fraud risk.)
2. Create an Entity. (An entity represents who is performing the event, and in our case it’s the user.)

3. Choose how you are defining the Event Variables.

Since we are using a training dataset, we can select the "Select variables from a training dataset" option.

4. Create an IAM Role and upload the training dataset

The IAM role is used to access the training data in the S3 bucket we created above.

We can create a new role right here just by providing the name of the S3 bucket.

Next, provide the S3 location of the training data file and click Upload to get the headers of the CSV file.

Then map each variable to its type from the list.

5. Create labels to categorize the event. (A label classifies the event as fraudulent or legitimate.)

In this case we need two labels: legit and fraud.

Repeat the above step to create the fraud label as well.
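The steps above are done in the console, but the same resources can also be created through the API. Here is a rough sketch using the FraudDetector client of the AWS SDK for JavaScript (v2). The entity name user, the variable email_address and the default value are assumptions made for illustration, so replace them with the columns of your own dataset. The client is passed in as a parameter so the function can be tried against a stub first.

```javascript
// Sketch only: console steps 1-5 expressed as AWS SDK for JavaScript (v2) calls.
// `client` is expected to be an instance of AWS.FraudDetector.
// Assumed names: entity "user", variable "email_address".
async function createUserRegistrationEventType(
  client,
  variables = [{ name: 'email_address', variableType: 'EMAIL_ADDRESS' }]
) {
  // Entity: who is performing the event.
  await client.putEntityType({ name: 'user', description: 'User signing up' }).promise();

  // Labels that classify an event as legitimate or fraudulent.
  for (const name of ['legit', 'fraud']) {
    await client.putLabel({ name }).promise();
  }

  // Event variables sent with every event (assumed to be string columns).
  for (const v of variables) {
    await client.createVariable({
      name: v.name,
      variableType: v.variableType,
      dataType: 'STRING',
      dataSource: 'EVENT',
      defaultValue: 'unknown', // assumed placeholder for missing values
    }).promise();
  }

  // The event type ties the variables, labels and entity together.
  await client.putEventType({
    name: 'user_registration',
    eventVariables: variables.map((v) => v.name),
    labels: ['legit', 'fraud'],
    entityTypes: ['user'],
  }).promise();
}
```

With the aws-sdk package installed and credentials configured, the function would be called as createUserRegistrationEventType(new AWS.FraudDetector({ region: 'us-east-1' })), where the region is also just an example.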

Create the Model

A model can be thought of as a formula that takes some input parameters and returns a computed result, which can be used in an expression for decision making. Here we also have to select a model type, which defines the algorithms, enrichments and feature transformations used during model training.

Next we have to define the labels that are used for classification.

Once we finish the above, the model training begins…

Once the training is completed, click on the version to view the performance of the trained model.

Then click on Actions -> Deploy model version to deploy the trained model. It takes a few minutes to complete the deployment.

Create the Detector

A detector contains the detection logic that we define using models and rules to evaluate whether users are fraudulent or legitimate.

Then add the model version we deployed.

Next, we have to create a rule with one or more outcomes. A rule is a condition that is evaluated against the variables during a prediction, and an outcome is the result returned when the rule matches. In the condition we can use the variables we created as well as the variables returned from the model.

In our case, we use the variable returned from the model to define the condition, and create a new outcome named high_risk_outcome.

Follow the same steps to create another rule named low_fraud_risk, with an outcome named low_risk_outcome and the following expression.

$user_registration_fraud_model_insightscore <= 500

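Next, select the rule execution mode, which decides whether evaluation stops at the first matched rule or runs all the rules and returns every matched outcome.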
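To make the rules and the execution mode more concrete, here is a plain JavaScript imitation of how ordered rules could turn a model score into outcomes. It is not the Fraud Detector rule engine. The rule name high_fraud_risk and its threshold (score > 500) are assumptions, since only the low-risk expression is given above. The mode names FIRST_MATCHED and ALL_MATCHED follow the values the service uses for the rule execution mode.

```javascript
// Local illustration only; the real evaluation happens inside Amazon Fraud Detector.
const rules = [
  // Assumed rule name and threshold: the high-risk expression is not shown in the article.
  { ruleId: 'high_fraud_risk', matches: (score) => score > 500, outcomes: ['high_risk_outcome'] },
  // Mirrors: $user_registration_fraud_model_insightscore <= 500
  { ruleId: 'low_fraud_risk', matches: (score) => score <= 500, outcomes: ['low_risk_outcome'] },
];

function evaluateRules(score, mode = 'FIRST_MATCHED') {
  // Rules are checked in the order they are listed.
  const matched = rules.filter((rule) => rule.matches(score));
  // FIRST_MATCHED keeps only the first hit; ALL_MATCHED keeps every hit.
  const selected = mode === 'FIRST_MATCHED' ? matched.slice(0, 1) : matched;
  return selected.map(({ ruleId, outcomes }) => ({ ruleId, outcomes }));
}
```

Because the two conditions in this sketch never overlap, both modes return the same result here. The mode only starts to matter once several rules can match the same score.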

Integrate the Detector

Finally, let’s use our fraud detector in our application. From a NodeJS application, we can call the getEventPrediction function of the FraudDetector service in the AWS SDK.

Use Fraud Detector in NodeJS Application
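A minimal sketch of such a call with the AWS SDK for JavaScript (v2) could look like the following. The detector name user_registration_detector, the entity type user and the variable email_address are assumptions, so use the names you created in the console. The helper names buildPredictionRequest and isHighRisk are hypothetical as well, and the client is passed in so the logic can be tried without AWS access.

```javascript
// Sketch only. Assumed names: detector "user_registration_detector",
// entity type "user", event variable "email_address".
function buildPredictionRequest(userId, emailAddress, now = new Date()) {
  return {
    detectorId: 'user_registration_detector',
    eventId: `${userId}-${now.getTime()}`, // must be unique per event
    eventTypeName: 'user_registration',
    eventTimestamp: now.toISOString(), // ISO 8601 timestamp in UTC
    entities: [{ entityType: 'user', entityId: userId }],
    eventVariables: { email_address: emailAddress },
  };
}

// Returns true when any matched rule produced the high_risk_outcome outcome.
async function isHighRisk(client, userId, emailAddress) {
  const response = await client
    .getEventPrediction(buildPredictionRequest(userId, emailAddress))
    .promise();
  // ruleResults lists the matched rules together with their outcomes.
  const outcomes = (response.ruleResults || []).flatMap((r) => r.outcomes || []);
  return outcomes.includes('high_risk_outcome');
}

// Usage (needs `npm install aws-sdk` and configured AWS credentials):
// const AWS = require('aws-sdk');
// const client = new AWS.FraudDetector({ region: 'us-east-1' });
// isHighRisk(client, 'user-123', 'someone@example.com').then(console.log);
```

In a signup handler, a true result could be used to reject the request or to route it to a manual review, depending on how strict the application needs to be.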

We now have a machine learning-based fraud detector for our application. Super cool, huh! 😎 AWS has brought machine learning closer to developers, so that even without much knowledge of machine learning we can use it in our applications. I must admit that this type of intelligence used to be a nightmare for regular web and mobile applications, since deploying a service like this would cost more than running the application itself.