Leverage graph technology for real-time Fraud Detection and Prevention

Deepak Patankar
Booking.com Engineering
7 min read4 days ago

Written by Deepak Patankar and Mathijs de Jong

Introduction

At Booking.com we are dedicated to maintaining a secure and trustworthy platform for both our customers and partners. Our work involves addressing a multitude of threats, ranging from payment fraud to the proliferation of fake hotels and reviews, as well as the abuse of marketing campaigns. It’s only by effectively managing these challenges that we make it easier for everyone to experience the world.

Detecting and combating fraud presents a formidable challenge. Fraudsters continually evolve their tactics, demanding adaptable and agile fraud detection and prevention systems that immediately leverage newly available information. At Booking.com, we employ innovative and adaptable approaches such as machine learning models, manual investigations by fraud domain experts, and heuristic rules.

Problem statement

A key insight guiding our efforts is the interconnected nature of fraud attacks. Often, there are links between various actors, identifiers, and requests. For instance, knowing that an email address was previously associated with fraudulent activity provides valuable context for our systems. A natural way to represent such link data is through the mathematical concept of a graph. Recognizing the power of this interconnected data, we have therefore invested in a graph technology service. This service enables us to provide real-time link information crucial for effective fraud detection and prevention.

How do we represent requests in a graph?

We utilize historical data, such as reservation requests, to construct a graph representation. In this graph, nodes represent transaction identifiers like account number and credit card details, while edges connect identifiers that have previously been observed together. This data is stored within a graph database. When it’s time to assess fraud risk, we query the graph database to construct a local graph centered around the request identifier.

To illustrate, let’s explore how a graph evolves over time and how it can reveal suspicious patterns.

Time step 1

A reservation request comes in from ‘account 1’ with ‘credit card 1’. We create nodes for those entities, and an edge between them (Figure 1).

Figure 1: evolution of a graph over time; time step 1.

Time step 2

A reservation request comes in for ‘account 1’, but now with a different card: ‘credit card 2’. We add a new node for the new credit card, and create the relevant edge (Figure 2).

Figure 2: evolution of a graph over time; time step 2.

Time step 3

A reservation request comes in for ‘account 2’ with ‘credit card 2’. We add a new node for the new account, and create the relevant edge (Figure 3).

Figure 3: evolution of a graph over time; time step 3.

Time step 4

We received a notice from the issuing bank of ‘credit card 1’ that this card was used in a fraudulent way. We add a node that indicates the fraudulent status (Figure 4).

Figure 4: evolution of a graph over time; time step 4.

Time step 5

We receive a reservation request from ‘account 2’ with ‘credit card 3’, and are tasked with the fraud risk estimation (Figure 5). Though both these entities (in purple), ‘account 2’ and ‘credit card 3’, are not directly associated with previous fraudulent activity, only a few links away such activity has been observed (in red). It might indicate participation within a larger fraud scheme, or not. This information was surfaced by the graph service in real time, and can be used as one of the bits of information that goes into risk estimation. In absence of graph building, this type of information is difficult to surface.

Figure 5: evolution of a graph over time; time step 5.

In Figure 6 you see two different examples of real graphs generated by our graph service. Such a request is based on a given identifier (green star in the graph), which could for example be the account number used to make the reservation. Starting from this central identifier, the graph will include connected identifiers. For the request on the left, the central identifier was connected to only two other identifiers previously observed on our platform (blue), which were not associated with any other identifier. For the request on the right, the identifier was connected to many other identifiers, resulting in a large and deep graph.

The shape and size of these graphs can reveal information about the likelihood that the given request is fraudulent or not. In this particular instance, it turned out later that the request that generated the small graph on the left was a non-fraudulent request, whereas the request that generated the large graph on the right was a fraudulent request. This illustrates that the size and shape of the graph can contribute to the fraud risk estimation.

Figure 6: two examples of real graphs generated by the graph service.

The logical next question is: how do we represent (and quantify) the structures within graphs, such that this information can be used by our fraud detection systems?

How do we use these graphs ?

One straightforward way to represent a graph is to craft a set of descriptive features. A very simple feature would be the counts of the different types of nodes. For the fictitious graph in Figure 5, this would result in the features count_accounts=2, count_cards=3 and count_fraud=1. Another important feature is hops_to_fraud, which provides the shortest distance to a nearby fraud notice node in the graph (if any) from the central node of the request. For the graph in Figure 5, the value of this feature from the request nodes (in pink) would be hops_to_fraud=4. Beyond these basic features, more advanced features can uncover more nuanced properties of a graph, and can therefore uncover more advanced fraudster behaviors.

The values of these descriptive features are informative for estimating fraud risk. You can, for example, imagine that a low number of hops to a fraud notice could indicate a high risk of fraud. The exact relation between the feature values and fraud risk is outside of the scope of the graph service, and is typically learned by a machine learning model, or defined by a fraud domain expert.

How do we do real-time fraud detection?

One technical requirement for using graphs in fraud detection is to generate these graphs for a real-time system that screens all incoming requests for potential fraud. The fraud detection happens synchronously during request processing, hence a low latency is required. Additionally, building these graphs involves leveraging a very large database of historically observed identifiers, which introduces further technical complexities and makes it challenging to maintain low latency. With some optimizations, and database indexing, we were able to achieve p99 response time of 300 milliseconds.

The below diagram describes the system components that are part of serving these graphs in real time. The key components of our system are the Graph Service, the Fraud Detection Service, and JanusGraph. We use JanusGraph in combination with a Cassandra backend to store the graph data.

Figure 7: an overview of the Fraud Detection Service and Graph Service design.

At the time of request, for example a reservation request, our Fraud Detection Service receives a call to do a fraud prediction. The Fraud Detection Service calls the Graph Service with the relevant information for graph building.

The first step for the Graph Service is to insert the identifiers of the request as nodes into the graph database, if they didn’t exist yet, and to add an edge between these identifiers, if it didn’t exist yet. The second step is to perform a breadth-first search in JanusGraph to fetch the network of identifiers that are connected to the identifiers of the request. Lastly, we use this graph of connected identifiers to calculate the graph features and return it as the call response to the Fraud Detection Service. The Fraud Detection Service uses these graph feature values to predict whether a request is fraudulent or not.

Conclusion

In this blog we have presented the relevance of graph technology for fraud detection and prevention. We presented conceptually how graphs can surface information that is difficult to represent otherwise. In addition, we have shown how we have built a real-time graph service that has been providing valuable graph information to our machine learning models and fraud experts for years. This work is instrumental in providing a secure and trustworthy platform for our customers and partners.

--

--