How we secure your sensitive data at Feedzai

Nuno Diegues
Published in Feedzai Techblog
6 min read · Mar 20, 2019

Day in and day out, Feedzai's systems process payments, transfers, account openings, and a myriad of other financial operations, detecting risk and providing valuable feedback to our clients in real time. That means some of those activities (such as payment transactions) may be rejected immediately, whereas others may be sent for review by a financial analyst. Either way, that processing means our systems have to deal with Cardholder Data and other types of Personally Identifiable Information (PII).

Have you ever wondered what prevents just anyone from taking a peek at your card number and using it to get their next Christmas gift? Actually, there’s a lot in place to prevent that!

Why must we store sensitive data at all?

The life of a transaction across our systems is a quick, but nevertheless interesting one:

  • Real-time events are processed in a streaming fashion, with the objective of updating profiles that characterize the payment and the client, and of continuously learning their patterns of usage.
  • To power our machine learning predictive system, we use the history of past events (which have already passed through the system in real time) to build profiles of user behaviour. Each new event therefore enriches that history, and is in turn augmented with information derived from it, which serves as features for our machine learning models.
  • Feedback is served to the client in real time and forwarded to Feedzai's Case Manager for analysts to review and investigate further.

Those profiles are the crux of this story: more often than not, they depend on sensitive data (e.g., the amount of money spent by a card in the past hours is a useful profile). Therefore, to protect all those payments from fraud, we must actually store information that depends on sensitive data.

Hashing: a possible hero?

A simple way to solve this problem is to translate all sensitive data into something meaningless but still useful. This is possible through hashing, which deterministically maps each input to a value in a fixed output domain. For example, your card number could be hashed into the value ‘9idasdij2iu39u’, which is no longer useful to a fraudster.

That value would still be useful to Feedzai systems:

  • Every time your card entered the system in a new event, it would be deterministically hashed into that value.
  • Profiles built over your card would be indexed by the hash, and not the actual card number.
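The two properties above can be sketched in a few lines of Java. This is only an illustration of deterministic hashing (the `CardHashing` class and the sample card number are hypothetical, not Feedzai's implementation):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;

public class CardHashing {
    // Deterministically hash a card number so that profiles can be keyed by
    // the digest instead of the real card number. The same input always
    // yields the same output, so lookups keep working.
    static String hashCard(String cardNumber) throws Exception {
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        byte[] digest = sha256.digest(cardNumber.getBytes(StandardCharsets.UTF_8));
        return HexFormat.of().formatHex(digest);
    }

    public static void main(String[] args) throws Exception {
        String token = hashCard("4111111111111111");
        // Deterministic: hashing the same card again gives the same token.
        System.out.println(token.equals(hashCard("4111111111111111"))); // true
        System.out.println(token);
    }
}
```

Note that this token can index profiles, but nothing in it lets you walk back to the card number, which is exactly the limitation discussed next.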

However, the big disadvantage of this approach is that hashing is a one-way function that cannot be reversed. That would be fine, except for one very particular use case: analysts investigating cases based on Feedzai’s risk score need to see the sensitive data in order to check the legitimacy of the transaction (e.g., by talking to the cardholder).

Therefore we need a solution that stores the data securely but is also reversible. Were it not for this requirement, hashing would be the simplest and most efficient way to address this problem.

Encryption to the rescue

The field of encryption is vast, but here we focus mostly on symmetric encryption. Essentially, a key K is used to encrypt a value (such as your card number) into a meaningless value K(card) that we can store safely. Unlike hashing, we can then reverse K(card) back to its original form through decryption with the same key K.

As such, we can perform the same approach as explained in the previous section, but this time using encryption with some key K. The challenge then becomes how to securely store the key K.
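A minimal sketch of this round trip, using AES-GCM from the Java standard library (the `SymmetricEncryption` class is an assumption for illustration, not Feedzai's actual code):

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class SymmetricEncryption {
    static final int GCM_TAG_BITS = 128;
    static final int IV_BYTES = 12;

    // Ciphertext must be stored alongside the IV used to produce it.
    record Encrypted(byte[] iv, byte[] ciphertext) {}

    static Encrypted encrypt(SecretKey k, String plaintext) throws Exception {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv); // fresh IV for every encryption
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, k, new GCMParameterSpec(GCM_TAG_BITS, iv));
        return new Encrypted(iv, cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8)));
    }

    static String decrypt(SecretKey k, Encrypted e) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, k, new GCMParameterSpec(GCM_TAG_BITS, e.iv()));
        return new String(cipher.doFinal(e.ciphertext()), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        SecretKey k = KeyGenerator.getInstance("AES").generateKey();
        Encrypted kCard = encrypt(k, "4111111111111111"); // K(card): safe to store
        System.out.println(decrypt(k, kCard)); // recovers the original card number
    }
}
```

The value K(card) is safe to persist; anyone holding the key K, and only them, can reverse it.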

The answer is to rely on a Hardware Security Module (HSM), which physically guarantees that the key K is never leaked or exposed. By relying on an HSM to encrypt our sensitive data, and never storing that data in cleartext, we comply with the PCI DSS requirements that are mandatory for handling cardholder data.

However, there is a huge problem with this approach: not one of safety or correctness, but of performance. Recall that we ingest events and provide risk predictions to our clients in real time: we therefore have to turn all sensitive data into encrypted values within a couple of milliseconds, at a rate of thousands of events per second! Unfortunately, doing all of that in an HSM would be either too slow (we would blow our latency budget) or too costly (if we bought many HSM units).

Hierarchical encryption

Do you remember the Inception movie? Where you could dream that you were inside a dream? That is actually an analogy for how we solved the problem above.

In essence, we cannot call the HSM for every real-time event. Instead, we create an intermediary encryption key, the DEK (Data Encryption Key), which we cipher in the HSM (with its secret K), storing the resulting K(DEK) securely. Then, when our systems bootstrap, they call the HSM only once to decrypt the DEK, and keep it in memory for fast real-time encryption of incoming sensitive data.

This means the HSM is used only to protect our intermediary DEKs, while the DEKs themselves cipher the incoming sensitive data. We cannot cipher events with the key K directly, because K never leaves the physical HSM unit, and round-tripping every event through the HSM would be too slow for real-time use. That same fact guarantees the security of our DEKs: without the HSM, you will never be able to decrypt K(DEK). This hierarchical scheme between K and DEK satisfies both the safety and the performance requirements.
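The hierarchy can be sketched with standard AES key wrapping. Here the `FakeHsm` class is a stand-in I made up to play the HSM's role (the real device enforces in hardware what this class only enforces by encapsulation):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class EnvelopeEncryption {
    // Stand-in for the HSM: the master key K never leaves this class.
    static class FakeHsm {
        private final SecretKey k;
        FakeHsm() throws Exception { k = KeyGenerator.getInstance("AES").generateKey(); }

        byte[] wrap(SecretKey dek) throws Exception {
            Cipher c = Cipher.getInstance("AESWrap");
            c.init(Cipher.WRAP_MODE, k);
            return c.wrap(dek); // K(DEK): safe to persist outside the HSM
        }

        SecretKey unwrap(byte[] wrappedDek) throws Exception {
            Cipher c = Cipher.getInstance("AESWrap");
            c.init(Cipher.UNWRAP_MODE, k);
            return (SecretKey) c.unwrap(wrappedDek, "AES", Cipher.SECRET_KEY);
        }
    }

    public static void main(String[] args) throws Exception {
        FakeHsm hsm = new FakeHsm();

        // Setup: create a DEK and store only its wrapped form K(DEK).
        SecretKey dek = KeyGenerator.getInstance("AES").generateKey();
        byte[] storedKdek = hsm.wrap(dek);

        // Bootstrap: a single HSM call recovers the DEK into memory; from
        // then on, every incoming event is encrypted locally with the DEK,
        // never touching the HSM again.
        SecretKey inMemoryDek = hsm.unwrap(storedKdek);
        System.out.println(java.util.Arrays.equals(
                dek.getEncoded(), inMemoryDek.getEncoded())); // true
    }
}
```

The expensive device sits on the slow, rare path (wrapping and unwrapping keys), while the hot path (encrypting events) runs entirely in memory.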

Tech behind the theory

All of the above is actually implemented in a standalone Feedzai component called Tokenizer. I was lucky to be part of its development, as it pushed us to the limit: guaranteeing the hard performance requirements of our systems while ensuring the security of our users’ data.

This is a mixed Scala/Java project, with Dropwizard serving a REST API and Cassandra providing low-latency storage. We support PKCS#11-compliant HSM devices, such as the Thales nCipher. On our AWS deployments we store data in DynamoDB and rely on KMS as an HSM-as-a-service.

It has been over two years since we deployed the Tokenizer, and it has processed over a billion events in real time (just a small part of what we process, since many events do not need their sensitive data to be reversible). We do so with an average of 3 ms per event, 99.9% under 30 ms, handling 500 events/sec on a single Tokenizer instance (and scaling linearly to thousands per second as more instances are added).

All of this took us a total of 230 Git commits by 6 awesome developers (and several others who helped as well), across 21,210 lines of code (plus 16,312 lines of comments), of which roughly 65% are unit and integration tests.

In fact, those numbers were roughly half as large a year ago, before we implemented a cool feature: rotating from the current encryption keys to new ones without ever losing data, while continuing to operate with 100% uptime. Yes, you read that right: we implemented a lock-free scheme where normal operations progress regardless of a concurrent encryption key rotation. It doubled our code base, but it was worth it: our clients can now be confident that we can migrate their sensitive data to a new encryption key set (should there be any security issue with the previous one) while serving their users seamlessly.
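One common way to make rotation non-blocking, sketched here as an assumption rather than Feedzai's exact scheme, is to tag every stored ciphertext with the version of the DEK that produced it. Readers always know which key to use, so they make progress while a background job migrates rows to the newest version (the `KeyVersioning` class and its fields are hypothetical):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class KeyVersioning {
    // Each stored ciphertext remembers which DEK version produced it.
    record StoredValue(int keyVersion, byte[] ciphertext) {}

    // Old and new DEKs stay available during rotation, so readers can always
    // decrypt with the version recorded on the row while a background job
    // re-encrypts rows under the newest version.
    static final Map<Integer, SecretKey> deks = new ConcurrentHashMap<>();

    public static void main(String[] args) throws Exception {
        deks.put(1, KeyGenerator.getInstance("AES").generateKey());
        StoredValue v = new StoredValue(1, new byte[0] /* ciphertext under v1 */);

        // Rotation begins: version 2 becomes the write key, while version 1
        // is kept for reads until every row has been migrated.
        deks.put(2, KeyGenerator.getInstance("AES").generateKey());
        SecretKey keyForRead = deks.get(v.keyVersion()); // still decryptable
        System.out.println(keyForRead != null); // true
    }
}
```

Because no reader ever waits for the migration job, and no row is ever readable by zero keys, normal operation and rotation can safely overlap.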

Trust me, going into the details of that achievement is well worth another full post, so we had better leave it for another time. If you cannot wait to learn more, you can also check us out and help us with our next big achievement.
