Riskified Identity Resolution Engine: From Need to Scalable Product

Published in Riskified Tech · 6 min read · Jan 24, 2024

Merchants were selling products long before the internet revolution. The internet let traditional merchants adopt eCommerce to reach more customers and grow their business, but every opportunity comes with a price.

The traditional risks (such as shoplifting) evolved too: fraudsters, and even legitimate customers, can abuse merchants’ platforms or policies to gain money or larger discounts, for example by overusing coupons or by reporting an item as not received although it was delivered.

The internet makes these people harder to detect because it’s easy to create a new online persona and carry on even after being caught. To overcome this, a need arose to detect the person behind several user accounts (aka the identity).

To cope with this problem, we built a system that enables real-time identity resolution. The system aims to cluster accounts across Riskified’s networks to create a single identity representation for various internal and merchant-facing interfaces.

It is a platform that needs to serve multiple use cases, which creates analytical and technical challenges: the use cases keep growing, and each has different technical needs.

In this blog post, I’ll describe the development process of the Riskified Identity Resolution Engine, from the idea phase, through the technical aspects, to a scalable product. I’ll also elaborate on the system architecture and the concepts we adopted that helped us develop the system faster and maintain it better.

You can expect to see how we leveraged the capabilities of a big company to develop a new product at a start-up pace, how prioritizing value as soon as possible helped us specify the product, and some technical details of the engine.

Step 1: From a problem to a product definition

A product will succeed only if it solves a real problem, but sometimes the solution is not straightforward, or the described problem hides the “real” one under the surface. In our case, the need was raised by our merchants.

Merchants started approaching Riskified with problems, such as Liar Buyer, that seemed tangential to the ones we already helped them with (e.g., preventing fraud, the Chargeback Guarantee solution).

We tried to understand why merchants couldn’t solve these problems by themselves. We found that they can’t get a complete view of a buyer’s purchase and claim history, because the buyer either obfuscates their identity by opening multiple accounts or uses different buying methods, such as phone or physical store orders. We realized we could help them.

The Chargeback Guarantee is based on the Linking capability developed by Riskified. This technology is designed to answer questions about orders, while the new problems require answering questions about the identity behind the order.

Riskified has huge advantages that enable us to detect the real identity of the buyer: domain experts, a lot of data from its merchants’ network, and the technology to leverage this. To help solve this problem, we decided to develop the Riskified Identity Resolution Engine.

An identity whose accounts are connected through Riskified’s network

Step 2: MVP

A Minimum Viable Product (MVP) is a crucial step toward validating the business hypothesis: it enables us to get feedback, accurately pinpoint the needs, and provide value as soon as possible.

Phase 1

To solve the identity problem, the Data Science team developed an algorithm to trace identities and a lean script to test it. The first algorithm was based on the existing Riskified linking technology. This way, we could rely on existing knowledge and save time in development and research.

The development team wrapped it and made it suitable for production use. In parallel, we developed a live system to compute the identities.

Phase 2

In the live system development, keeping in mind this was an MVP, we decided to use DynamoDB to store the identities. We chose it mainly because it was accessible and widely used at Riskified. That way, we could leverage the advantage of being part of a big company to develop quickly, which helped us detect the limitations of the solution faster.

We saw that identities could only grow: the system had no way to use new information that could split an identity into smaller, more accurate ones. In addition, production-wise, it was hard to manage those large entities.
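To illustrate why a merge-only representation can only grow, here is a minimal sketch. It is not the MVP's actual code: the union-find structure, the class name, and the account ids are all assumptions chosen for illustration.

```python
class MergeOnlyIdentities:
    """Hypothetical union-find over account ids: it can link, but never unlink."""

    def __init__(self):
        self.parent = {}

    def find(self, account):
        # Register unseen accounts as their own root.
        self.parent.setdefault(account, account)
        # Walk up to the root, shortening the path as we go (path halving).
        while self.parent[account] != account:
            self.parent[account] = self.parent[self.parent[account]]
            account = self.parent[account]
        return account

    def link(self, a, b):
        # Merge the two identities; the fact that a and b were linked
        # directly is not stored anywhere, so it cannot be undone later.
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def identity_of(self, account):
        root = self.find(account)
        return {acc for acc in list(self.parent) if self.find(acc) == root}


ids = MergeOnlyIdentities()
ids.link("acct_1", "acct_2")
ids.link("acct_2", "acct_3")  # suppose this link later turns out to be wrong
members = sorted(ids.identity_of("acct_1"))
```

Because only the merged result is kept, learning later that the second link was wrong gives the structure nothing to act on: all three accounts stay in one identity.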

Phase 3

To handle the production issue, we switched to a Spark job. That helped us keep the MVP usable, but the identities were not updated in real time. It was hard to research those identities, and they were less accurate.

Conclusions

Thanks to the MVP, we gained confidence in the necessity and capability of the solution. We made the algorithm more accurate and discovered potential pain points, such as the need for the system to be able to correct itself. The accounts in an identity can change as we gather more information; for instance, one identity can split into two.

Keeping that in mind, the time had come to develop a live, scalable system.

Step 3: Scale

At this point, clients were using our product and saw value. The data behaves as a graph, with accounts and data points as nodes and the relations between them as edges.
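The graph view can be sketched in a few lines. This is an illustration under stated assumptions, not the production algorithm: nodes are hypothetical `(kind, value)` tuples, an identity is taken to be the set of accounts in one connected component, and the sample edges are invented.

```python
from collections import defaultdict, deque


def identities(edges):
    """Group account nodes by connected component (assumed identity definition)."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        accounts = {value for kind, value in component if kind == "account"}
        if accounts:
            result.append(accounts)
    return result


def account_names(edges):
    """Stable, sorted view of the identities for easy comparison."""
    return sorted(sorted(identity) for identity in identities(edges))


edges = [
    (("account", "a1"), ("email", "x@example.com")),
    (("account", "a2"), ("email", "x@example.com")),
    (("account", "a2"), ("device", "d9")),
    (("account", "a3"), ("device", "d9")),
    (("account", "a4"), ("email", "y@example.com")),
]
before = account_names(edges)
# Drop the a2-d9 relation and recompute from the remaining edges.
after = account_names(edges[:2] + edges[3:])
```

Because identities are derived from the current set of edges rather than stored as merged blobs, removing one relation is enough for an identity to split, which is the self-correction the MVP lacked.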

Since it was a graph problem with a graph algorithm that solved it, it made sense to use GraphDB. Riskified didn’t have suitable technology at that time, so we were the company’s first adopters.

We designed the solution carefully, with a lot of iterations, so we could ensure we had created a system that could be maintained, changed, and extended with minimal work. We gave extra attention to the pain points the MVP revealed.

Phase 1

The computation was performed in the GraphDB, and we “cached” its results in a relational DB. This let us meet quick response times from the relational DB while keeping the GraphDB as a pure computation layer.
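The split between a computation layer and a read-optimized cache might look roughly like the sketch below. Everything here is an assumption for illustration: SQLite stands in for the relational DB, `compute_identity_in_graph` is a hypothetical stub for the graph computation, and the table and column names are invented.

```python
import sqlite3


def compute_identity_in_graph(account_id):
    """Hypothetical stand-in for the expensive graph-side computation."""
    fake_graph_result = {"a1": "identity-7", "a2": "identity-7", "a3": "identity-9"}
    return fake_graph_result[account_id]


class IdentityCache:
    """Relational table holding the latest computed identity per account."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE identity_by_account ("
            "account_id TEXT PRIMARY KEY, identity_id TEXT NOT NULL)"
        )

    def refresh(self, account_id):
        # Write path: run the graph computation, then store the result.
        identity_id = compute_identity_in_graph(account_id)
        self.db.execute(
            "INSERT OR REPLACE INTO identity_by_account (account_id, identity_id) "
            "VALUES (?, ?)",
            (account_id, identity_id),
        )
        self.db.commit()

    def lookup(self, account_id):
        # Read path: a single indexed lookup, no graph traversal.
        row = self.db.execute(
            "SELECT identity_id FROM identity_by_account WHERE account_id = ?",
            (account_id,),
        ).fetchone()
        return row[0] if row else None


cache = IdentityCache()
cache.refresh("a1")
```

In this sketch, readers never touch the graph: `lookup` answers only from the relational table, and `refresh` is the only place where the computation layer is called.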

We used Data Flow architecture, and the design was led by two main concepts:

Each processing unit (consumer) had one responsibility.
Data flowed in one direction, which means the system was organized as a directed acyclic graph (DAG).
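The two concepts above can be sketched with Python's standard `graphlib` module (Python 3.9+). The consumer names, the event fields, and the in-process execution are assumptions for illustration; a real data-flow system would typically connect consumers through queues or topics rather than function calls.

```python
from graphlib import TopologicalSorter


# Each hypothetical consumer has exactly one responsibility.
def normalize(event):
    return {**event, "email": event["email"].strip().lower()}


def email_link(event):
    return {**event, "email_link": (event["account_id"], "email", event["email"])}


def device_link(event):
    return {**event, "device_link": (event["account_id"], "device", event["device_id"])}


def collect_links(event):
    return {**event, "links": [event["email_link"], event["device_link"]]}


CONSUMERS = {
    "normalize": normalize,
    "email_link": email_link,
    "device_link": device_link,
    "collect_links": collect_links,
}

# Maps each consumer to the consumers whose output it needs.
DEPENDS_ON = {
    "normalize": set(),
    "email_link": {"normalize"},
    "device_link": {"normalize"},
    "collect_links": {"email_link", "device_link"},
}


def run_pipeline(event):
    # static_order() yields a valid execution order and raises
    # graphlib.CycleError if the dependencies are not a DAG.
    for name in TopologicalSorter(DEPENDS_ON).static_order():
        event = CONSUMERS[name](event)
    return event


result = run_pipeline(
    {"account_id": "a1", "email": " X@Example.com ", "device_id": "d9"}
)
```

Adding a new consumer here means adding one function and one dependency entry, without touching the others, which is the kind of low coupling the design aims for.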

The design aimed to support the software life cycle, including the stages of idea generation, design, development, maintenance, and extension. The concepts above had significant contributions to it.

Assigning each consumer a single responsibility had several advantages, including the ability to more easily locate bugs and to parallelize development with minimal conflicts.

However, it also had some drawbacks, such as the need for more deployments and the potential for synchronization issues, which we invested extra thought into avoiding.

The use of a DAG to organize the flow of events in the system also had several benefits. It made the system easier to extend and allowed the creation of different profiles using the same path, reducing the amount of code that needed to be maintained.

Additionally, it facilitated the process of locating bugs and made end-to-end testing easier. However, this approach did have the potential to cause events to take longer to reach their endpoints.

Overall, we ended up with a system that offers concurrent, high-throughput data processing, low coupling between components, simplified maintenance, and code reusability.

Phase 2

These days, we focus our efforts on incorporating immediate data received via API request into the identity resolution. This means creating identities using both historical data (with a refresh time of up to 1 minute) and the information received in the request, to make our identities more accurate.
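One way to picture combining the two sources is the sketch below. It is purely illustrative and not the engine's actual logic: the index shape (data point to set of identity ids), the function name, and the sample values are all assumptions.

```python
def resolve_request(historical_index, request_data_points):
    """Return the historical identity ids that share a data point with the request.

    historical_index: hypothetical snapshot mapping a data point to the set of
    identity ids that already contain it (refreshed periodically).
    request_data_points: data points carried by the incoming API request.
    """
    matched = set()
    for point in request_data_points:
        matched |= historical_index.get(point, set())
    return matched


historical_index = {
    ("email", "x@example.com"): {"identity-7"},
    ("device", "d9"): {"identity-9"},
}

known = resolve_request(historical_index, [("email", "x@example.com")])
bridging = resolve_request(
    historical_index, [("email", "x@example.com"), ("device", "d9")]
)
unseen = resolve_request(historical_index, [("email", "new@example.com")])
```

In this sketch, an empty result suggests a new identity, a single match suggests an existing one, and several matches suggest that the request itself connects identities that the historical snapshot still treats as separate.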

In addition, we are continually working on scale and analytical improvements.

What we accomplished and how to replicate the success

The Riskified Identity Resolution Engine enables us to uncover a customer’s true identity representation despite all attempts to disguise it. These identities are the building blocks for enforcing any policy protection, covering refund abuse, return abuse, reseller abuse, item limit abuse, and more.

Based on the Identity Resolution Engine, we also developed Identity Explore, an application that makes these identities accessible to merchants, giving them a holistic view of their customers.

Going forward, we are focusing on immediate response, scale improvement, cost reduction, and expansion of solutions and use cases.

Working lean, being committed to resilient design, and understanding the core of the problem you are aiming to solve can help you save development time and create an extendable product with product-market fit and growth potential.

Written by Ofir Shechter