How to detect fraud in your B2B payment platform
What having ‘startup data’ can teach you about Machine Learning for fraud detection
Building a fraud detection model from scratch is far from straightforward. Many challenges have to be tackled along the way, particularly in the first weeks of implementation.
That is why we want to share the main components of an alerting system for detecting potentially fraudulent transactions.
Build a more secure environment using data
Libeo is a leading Business-to-Business (B2B) payments platform in Europe. Naturally, one of our main concerns is to develop secure end-to-end payment operations, especially considering that Libeo deals with a large number of payments every day and processes sensitive data. Likewise, the minimum our customers expect when they join the platform is that their transactions are safe.
Hence, in addition to traditional security measures such as secure gateways and two-factor authentication, which constitute the basic technical requirements for any payment platform, we worked on a more advanced data-based solution.
As a startup, you have little historical data, and even less historical fraud data (unless you’re really unlucky!), so there is rarely much past fraud to explore. Long story short: you need to get creative!
Build your fraud team
Like any other business use case, this challenge cannot be tackled by data team members alone. It is important to build a fraud team from day one with a rather heterogeneous composition. Libeo’s fraud team is composed of two data analysts, a data scientist, a product manager, and two customer success managers.
First, as we had no prior fraud case to base our alert system on, we had to imagine different fraud scenarios and group them into different types of risks based on expertise and market practices. Later on, we verified which types of risks could be identified based on data collected on a daily basis. This exercise allowed us to anticipate what new data we needed to track and explore in order to cover more risks. The product manager’s essential role was to challenge whether the properties under consideration actually covered each corresponding risk. The product manager also worked out suitable preventive actions for every type of risk. Last but not least, the customer success managers, who are more customer-facing, played a central role in the testing phase and usage of the model. More on that below.
A significant advantage of having a heterogeneous fraud team is that you get both the data and the customer-facing teams involved. Each team’s representative is responsible for understanding how the system works and plays a part in preventing fraudulent transactions.
Let’s get a bit technical.
Build solid assumptions
At Libeo we started by building a score-based alerting system. Its job is to automatically assess the level of suspiciousness of every single transaction and flag the ones whose scores exceed a certain threshold.
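To make the idea concrete, here is a minimal sketch of threshold-based flagging. The `ScoredTransaction` class, the `flag_suspicious` function, the score values and the threshold of 50 are all hypothetical, chosen only for illustration; they are not Libeo's actual model or settings.

```python
from dataclasses import dataclass


@dataclass
class ScoredTransaction:
    transaction_id: str
    score: float  # higher means more suspicious (illustrative scale)


def flag_suspicious(transactions, threshold):
    """Return the transactions whose score strictly exceeds the threshold."""
    return [t for t in transactions if t.score > threshold]


# Hypothetical scores, for illustration only.
scored = [
    ScoredTransaction("tx-001", 12.0),
    ScoredTransaction("tx-002", 71.5),
    ScoredTransaction("tx-003", 50.0),
]
flagged = flag_suspicious(scored, threshold=50.0)
print([t.transaction_id for t in flagged])  # ['tx-002']
```

Whether a score exactly at the threshold should be flagged is a design choice; this sketch follows the wording above and flags only scores that exceed it.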
A transaction is never instantaneous. Even if the service says it is, it’s not! The transaction takes a certain amount of time to be processed, and that is our only window to react. The challenge here is to stay as proactive as possible in order to detect and block potential fraud without prolonging the payment request processing time.
We based the model on the following assumptions:
- If the model misses a fraudulent transaction for any reason, it must learn from it afterward and become more severe with similar activity
- Vigilance is higher for new users
- Users earn the trust of the model through a positive history of validated transactions
- Not every new or suspicious behavior is malicious, but almost every fraudulent transaction should be recognizable as suspicious in some way (more on this one later)
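One possible way to encode the first three assumptions is a multiplier applied to a transaction's base score. This is only a sketch: the function name, the linear trust curve and every default value (`trust_after`, `new_user_boost`, `missed_fraud_boost`) are assumptions made up for the example, not values from our model.

```python
def vigilance_multiplier(validated_payments, resembles_missed_fraud,
                         trust_after=10, new_user_boost=2.0,
                         missed_fraud_boost=1.5):
    """Scale a base score up for new users and for activity that
    resembles a fraud the system previously missed (hypothetical values)."""
    # Trust grows linearly with validated payments and is capped at 1.0.
    trust = min(validated_payments / trust_after, 1.0)
    # A brand-new user gets the full boost; a fully trusted user gets none.
    multiplier = new_user_boost - (new_user_boost - 1.0) * trust
    # Be more severe when the activity looks like a past missed fraud.
    if resembles_missed_fraud:
        multiplier *= missed_fraud_boost
    return multiplier


print(vigilance_multiplier(0, False))   # 2.0: new user, maximum vigilance
print(vigilance_multiplier(10, False))  # 1.0: trust earned, no boost
print(vigilance_multiplier(20, True))   # 1.5: trusted, but similar to a missed fraud
```

Note that even a fully trusted user is scored more severely when the activity resembles a known miss, which keeps the "learn from it afterward" assumption independent of the trust history.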
Let’s zoom in on the scoring system.
Scoring system
All of the previously fixed assumptions have one thing in common: they all require a reference on which to base the assessment and the scoring. To build that reference, we had to:
- Build a set of nominal behaviors enabling the recognition of a new user that is starting to pay, a positive history of transactions, a known fraud, and a suspicious activity
- Determine a set of interactive comparison rules in order to evaluate each transaction by comparing it to nominal behavior. Prescriptive rules (pure logic) add explicit reasoning to the model. This is an important component: it enables us to better deal with scenarios that have not occurred yet
- Discern a new or odd behavior from a malicious one, not only to reduce false positives in the alerting system but also to make the scoring refinement process easier. However, remember that it is far better to raise false alarms on legitimate transactions than to raise no alarm on fraudulent ones. Bottom line: it is ok to be strict on suspicious behavior in the beginning
- Establish the difference between detecting/alerting and taking preventive action. It is essential to understand the different workflows and consequences
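The steps above can be sketched as a small rule engine that compares each transaction to a nominal behavior and adds up points. Everything here is hypothetical: the attributes (amount, beneficiary IBAN, hour of day), the three rules and their point values are placeholders for illustration, not the properties we actually use.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float
    beneficiary_iban: str
    hour: int  # hour of day the payment was requested, 0-23


@dataclass
class NominalBehavior:
    typical_max_amount: float
    known_ibans: frozenset
    usual_hours: range


# Each rule compares a transaction to the nominal behavior and returns points.
def unusual_amount(tx, nominal):
    return 40 if tx.amount > nominal.typical_max_amount else 0


def unknown_beneficiary(tx, nominal):
    return 30 if tx.beneficiary_iban not in nominal.known_ibans else 0


def odd_hour(tx, nominal):
    return 10 if tx.hour not in nominal.usual_hours else 0


RULES = [unusual_amount, unknown_beneficiary, odd_hour]


def score_transaction(tx, nominal):
    """Sum the rule points and return (score, names of the rules that fired)."""
    points = {rule.__name__: rule(tx, nominal) for rule in RULES}
    reasons = [name for name, value in points.items() if value > 0]
    return sum(points.values()), reasons


nominal = NominalBehavior(5000.0, frozenset({"FR76-KNOWN"}), range(8, 20))
odd_tx = Transaction(amount=12000.0, beneficiary_iban="FR76-NEW", hour=3)
score, reasons = score_transaction(odd_tx, nominal)
print(score, reasons)
```

Returning the names of the rules that fired, alongside the score, is what lets a reviewer tell a merely new behavior (one mild rule) from a pattern that looks malicious (several rules at once).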
It is key for us at Libeo that the attributes (or properties) we use are actionable whenever we are building a data-driven tool. We always ask ourselves if we can transform this insight into action.
Automating any action is risky: we could block a lot of non-fraudulent transactions or spam our users with alarming emails. So, the decision to take any action or not is always one for the fraud team. However, it is important to help decision-making by designing a catalog that maps recommended preventive actions to each specific fraud scenario.
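Such a catalog can be as simple as a lookup table. In the sketch below, the scenario names and the recommended actions are invented for the example; the one point it tries to capture is that the function only returns a recommendation for a human to read and never triggers anything itself.

```python
# Hypothetical scenarios and actions, for illustration only.
ACTION_CATALOG = {
    "account_takeover": "Contact the account owner through a verified channel",
    "beneficiary_change": "Ask for proof of the new bank details before releasing the payment",
    "unusual_amount": "Request the invoice behind the payment",
}
DEFAULT_ACTION = "Escalate to the fraud team for manual review"


def recommend_action(risk_type):
    """Return a recommendation for a human reviewer; nothing is executed here."""
    return ACTION_CATALOG.get(risk_type, DEFAULT_ACTION)


print(recommend_action("unusual_amount"))
print(recommend_action("never_seen_before"))  # falls back to manual review
```

The fallback matters for a young platform: a scenario nobody imagined yet still ends up in front of the fraud team instead of being silently ignored.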
Micro iterations for continuous improvement
Learning by doing is a core value at Libeo and it has been a key enabler of our strong growth on all levels. This project was no exception. We put the model in production and:
- Iterated on a daily basis. It is not at all tedious if you react fast to make the adjustments! All we needed was a shared document and 15 minutes of our time every day to write our observations.
- Refined the score in line with everybody’s feedback, especially that of the customer success managers, since they know the customers in their role as moderators.
- Built on observations over time to validate or reject initial assumptions. We always kept in mind that everything could be challenged at any moment and anything could be changed if it was proven wrong with time.
The end result
If you are a data scientist like me, you may be amazed by what a scoring system can offer:
- For our users, a great B2B payment experience, secured and smooth!
- For the business, an alerting system that keeps us sharp
- For our Data team (especially its Machine Learning-savvy members), a record of historical transactions with their corresponding scoring, type of risk, and potential preventive action. This dataset is used to train an ML model that predicts the type of risk on a suspicious transaction and recommends an appropriate preventive action.
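As an illustration of what such a record could look like, here is a sketch with a hypothetical `ScoredRecord` structure and made-up rows. It does not train an ML model; it computes a plain frequency baseline (the most common action per risk type), which is a swapped-in, much simpler technique that a trained model would be expected to beat.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class ScoredRecord:
    transaction_id: str
    score: float
    risk_type: str
    preventive_action: str


def most_common_action_by_risk(records):
    """Frequency baseline: the action most often recorded for each risk type."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record.risk_type][record.preventive_action] += 1
    return {risk: c.most_common(1)[0][0] for risk, c in counts.items()}


# Made-up history, for illustration only.
history = [
    ScoredRecord("tx-1", 80.0, "beneficiary_change", "verify_bank_details"),
    ScoredRecord("tx-2", 65.0, "beneficiary_change", "verify_bank_details"),
    ScoredRecord("tx-3", 70.0, "beneficiary_change", "call_customer"),
    ScoredRecord("tx-4", 90.0, "account_takeover", "call_customer"),
]
baseline = most_common_action_by_risk(history)
print(baseline)
```

A baseline like this gives a reference point: if a trained model cannot recommend better than "the action we usually took for this risk type", the extra complexity is not yet paying off.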
At Libeo, it is not Data Science if you don’t have a production-first approach, if you don’t engineer everything from scratch, and if you don’t base your mindset on short iteration cycles!
If you’re interested in discussing data topics with us, drop us an email at data@libeo.io. Visit our website libeo.io and follow Tales of Libeo where we will be sharing more insights.