Explainability for Fraud: One Size Does Not Fit Fraudsters

Vladimir Balayan · Feedzai Techblog · Jan 24, 2024 · 13 min read

In this blog post introducing Explainable AI, find out how we make complex AI understandable for Fraud Analysts and ML Engineers/Data Scientists. Discover our methods that bring transparency, helping decision-makers better understand and trust AI models.

By Vladimir Balayan and Sérgio Jesus

One of the most important components of Responsible AI is Explainability. It helps us comprehend the rationale behind predictions and ensures that the decisions made by our AI systems are grounded in solid reasoning rather than spurious correlations. In this blogpost, we break down the Explainable AI field and how we integrate it with Feedzai’s tools for more informed decisions. These tools target two different personas — Fraud Analysts and ML Engineers/Data Scientists.

Table of Contents

· Explainability landscape
Definitions on Explanations
Explanations Taxonomy
· Explainable AI Toolkit at Feedzai
ML Engineer Tools for Debugging and Understanding Models
Making Informed Decisions For Risk Analysts
· Conclusion

Explainability landscape

Before digging into details on how we use Explainability at Feedzai, let’s break down explanations by their type. In the Explainable AI (XAI) field, there are no one-size-fits-all solutions. Just as individuals have different expertise (and information levels), needs, and objectives, so do the explanation methods.

Definitions on Explanations

Although XAI has greatly evolved in recent years, a consensus has yet to be reached for some of its main concepts.

Interpretability vs. Explainability

There is still ongoing discussion in the XAI community regarding the differences between Interpretability, Explainability, and Transparency. Although it is common to use these terms interchangeably, some authors define each of them differently.

Interpretability and Transparency refer to understanding the decision-making process of an ML model or an AI decision-making system, respectively. Both imply that explanations are delivered in a way that lets a human clearly understand how the system works.

Normally, Interpretability is tied to simpler models such as Linear Regression, rule sets, and shallow tree-based models since their inner operations are clear, simple, and concise. For instance, a shallow tree-based model trained to predict if there will be rain given weather condition features is fully interpretable since we can track each step in the model’s decision process.
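
To make this concrete, here is a minimal sketch (ours, for illustration only; the weather features and data are invented) of a shallow decision tree whose learned splits can be printed and traced by hand:

```python
# A shallow decision tree is fully interpretable: every split it learned is visible,
# so each prediction can be traced step by step. Data and features are invented.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[30, 80, 1], [25, 40, 0], [18, 90, 1], [28, 35, 0]]   # [temp_c, humidity_pct, cloudy]
y = [1, 0, 1, 0]                                           # 1 = rain, 0 = no rain

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# Print the entire decision process of the model.
print(export_text(tree, feature_names=["temp_c", "humidity_pct", "cloudy"]))
```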

However, these simpler models usually underperform when compared to black box models such as Neural Networks. These more complex models are not interpretable. Thus, we resort to Explainability, with the goal of generating explanations for the final decisions of such models. Although the inner workings of those black box models are not interpretable due to the number of operations, there are several methods that can provide an explanation of how models process data, giving insights into the decision-making process.

Difference between Interpretable AI and Explainable AI

Personas in Explainability

To define the full scope of the XAI field, we start by dividing end-users of AI systems into three general categories (also known as personas):

  • AI Experts: These individuals have an extensive understanding and interest in the AI system. They are often ML Engineers who are involved in creating machine learning models, fine-tuning algorithms, and optimizing AI systems for a specific task.
  • Data Experts: Data Experts utilize AI systems to facilitate data-driven analysis or decision-making processes. They may include Fraud Analysts who rule on the authenticity of transactions based on data patterns and AI-generated insights.
  • AI Consumers: This group comprises individuals who are either consumers or subjects of the AI decision-making process. For instance, someone applying for a loan at a bank may fall into this category, as they rely on AI-driven assessments to determine their eligibility.

Requirements of Explainability

While all of these personas interact with AI systems at various points, their distinct objectives and expertise translate into unique requirements for explanations that focus on different parts of the AI system operations. We can categorize these desired properties (desiderata) into several groups:

Different explanation requirements
  • Fidelity: this pertains to how accurately the explanation approximates the actual behavior and decision boundary of the AI system. AI Experts often require highly faithful explanations to ensure that they actually reflect the system’s decision-making process.
  • Stability: this assesses the similarity of explanations for similar instances. Ensuring that explanations remain stable across similar interactions is important for Fraud Analysts since they rely on detecting repeated patterns to make decisions based on AI-generated insights.
  • Diversity: this focuses on whether the explanation’s components are distinct and non-redundant. For instance, if an explanation presents three correlated (or even redundant) features, it fails to provide diversity. Ensuring diverse explanations is essential across all user categories, as it also improves comprehensibility.
  • Comprehensibility or Human-Interpretability: this criterion evaluates how easily the explanation can be understood by humans. This is important for all the personas, especially AI Consumers, who benefit greatly from explanations that are clear and easy to grasp, even with limited technical expertise.
  • Usefulness: this measures the extent to which the explanation aids in optimizing the end user’s goals. For ML Engineers, for instance, a useful explanation is one that helps them debug models effectively and enhance their performance.

All the requirements matter for a good-quality explanation, but Usefulness may be considered the most relevant one. For example, an explanation can be perfectly Stable and Diverse but still not help an ML Engineer because it has low Fidelity. Usefulness can be considered a combination of several other requirements.

Given the multitude of explanation requirements, evaluating and assessing the quality of explanations is a complex task. Defining the target persona and understanding their specific requirements should be regarded as initial and essential steps in developing and deploying an explanation method.

At Feedzai, we’ve developed an internal A/B testing tool to evaluate different explanation methods and how they affect the end persona’s performance at a given task. We validated this methodology with a study in a real case scenario where Fraud Analysts reviewed transactions, together with an AI system’s outcomes and explanations. This resulted in a research paper published in FAccT.

Paper published in FAccT conference

Explanations Taxonomy

Explainable AI can be categorized based on two main criteria: where it focuses within the AI system and the type of output it produces. In this section, we will explore these categorizations.

Explanations by AI System Component

Depending on the approach of the method, several components of the AI system can be explained.

Taxonomy of Explainable AI

Even before building an ML model, we might want to examine some data patterns to get valuable insights. This is called Pre-model explainability and is normally generated in the exploratory data analysis step with domain knowledge from several experts.

While building ML models, we may focus on In-model explainability, developing models with inherent explainability traits. In this category, we can build Interpretable models, which we loosely defined earlier as “simpler” and, in some cases, “less performing.”

Alternatively, we can create Self-explainable models. These learn to create explanations for their decisions along with the main predictive task.

Finally, we can explain the ML model externally, after training, using Post-model explainability. This is the most common approach for several reasons; for instance, self-explainable models usually come with increased complexity, which introduces a trade-off between explainability and performance.

With post-hoc explanations, ML Engineers can focus exclusively on model performance during training, deferring the explainability objectives to a later step. However, for this exact reason, this type of explainability may suffer from Fidelity and Stability issues. Post-hoc methods are often model-agnostic, meaning they treat the model as a black box and rely only on its inputs and outputs.

Additionally, each of the above explainability types can be divided into Local explainability, where an individual prediction (or a small fraction) is explained, or Global explainability, where a bigger sample or the whole dataset is explained.

Explanations Output

Taxonomy of the explanation outputs

We can also categorize explanations by the output they produce. Different explanation methods generate explanations in different formats, which are more or less suitable for end users depending on their requirements and level of expertise.

Feature-based explanations present the features used by the ML model and their contribution to the final prediction. The most widely used explanation methods are of this type; they include LIME and SHAP.
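
As a hedged illustration of a feature-based explanation, the sketch below uses LIME on a toy model; the dataset, model, and feature names are placeholders rather than anything from Feedzai’s platform:

```python
# Local, feature-based explanation with LIME on invented "transaction" data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)            # toy "fraud" label
feature_names = ["amount", "velocity", "mcc_risk", "hour"]

model = RandomForestClassifier(n_estimators=50).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["legit", "fraud"], mode="classification")
# Explain a single prediction: each feature's contribution to this one score.
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=4)
print(exp.as_list())
```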

As Humans, when trying to explain something, we usually give either similar or counterfactual examples to make our explanation easier to understand. This is the main idea behind Example-based explanations, whose methods output data instances that can be real or generated. These instances are either the closest results to the data point (similar examples) or the minimal changes in the input data that will make the ML model produce different predictions (counterfactual examples).
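
A toy sketch of this idea, under the simplifying assumption that examples are retrieved from the training data rather than generated (real counterfactual methods typically search for minimal feature changes instead):

```python
# Example-based explanations, naively: for a query transaction, retrieve the nearest
# training instance with the same predicted label (a similar example) and the nearest
# one with the opposite predicted label (a counterfactual-style example).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X, y)

def nearest_examples(model, X_train, x_query):
    preds = model.predict(X_train)
    q_pred = model.predict(x_query.reshape(1, -1))[0]
    dists = np.linalg.norm(X_train - x_query, axis=1)
    dists = np.where(dists == 0, np.inf, dists)            # ignore the query itself
    same = np.flatnonzero(preds == q_pred)
    diff = np.flatnonzero(preds != q_pred)
    return X_train[same[dists[same].argmin()]], X_train[diff[dists[diff].argmin()]]

similar_ex, counterfactual_ex = nearest_examples(model, X, X[0])
```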

Delving into some specific black box model components may clarify how the decision process is performed. Model Internals explanations focus on the inner workings of the model, specifically on some of the more interpretable internal components. For instance, we can have a better understanding of the decision-making process of a Convolutional Neural Network (CNN) model by observing the extracted “features” via the learned filters.

Another typical characteristic of a human being is thinking about real-world phenomena in terms of high-level concepts. For instance, if we think about cats and how they are distinct from other animals, we can identify a set of concepts such as four-legged body, fur-covered coat, and distinctive pointed ears. Thus, Concept-based outputs can be considered as the most human-interpretable ones since the explanations are aligned with a human mental model.

We can further split concept-based explanation methods into model-internal and model-external semantics. Concepts can either be derived automatically from the model and data, which we refer to as internal semantics, or obtained from external domain knowledge sources, designated as external semantics. The main difference between the two is the cost of labeling and the quality of the extracted concepts.

In summary, there are many approaches to obtaining explanations in ML; the main takeaway is that the choice of method depends on the context of the problem. Most importantly, it hinges on the end persona who is interacting with the AI system, as well as their tasks and goals. Choosing the wrong explanation type might lead to confusion and hinder the user’s trust in the overall system.

Explainable AI Toolkit at Feedzai

When it comes to risk prevention, getting decisions right is crucial. Mistakenly flagging legitimate transactions as fraudulent can cause friction for innocent people, while missing a crafty fraud attempt leads to financial losses.

That’s why we employ explainability — it is a handy tool to help us navigate the gray areas of suspicion. In this section, we’ll explore the explanation tools Feedzai offers to make risk prevention more accurate and reliable.

ML Engineer Tools for Debugging and Understanding Models

Explanation methods serve as indispensable instruments in the toolkit of ML Engineers. Their primary objective is to validate and debug the underlying mechanics of ML models, refine algorithms, and thus achieve the best performance.

Tree Interpreter

Tree Interpreter is a simple yet powerful technique that allows ML Engineers to assess the impact of the features used in their models. It is a built-in tool, compatible with tree-based models trained within or imported into Feedzai’s platform. Additionally, it is an in-model, feature-based explanation type that can be both global and local.

This method uses Information Gain (IG) to calculate the importance of each feature. IG is a measure computed during training to select the best decision splits within a decision tree; features whose splits yield higher IG receive a higher Feature Importance.

This type of explanation identifies which features have the most influence on the model’s global predictions. This can be extrapolated to show the relevance of different factors of transactions (e.g., features derived from the amount are usually important). This also assists in diagnosing potential issues during the training phase (e.g., identifying redundant features) and improving the model’s performance.
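
For intuition, here is a hedged sketch of the same kind of importance using scikit-learn’s impurity-based importances on invented data; these are conceptually close to the IG-based importances described above, though not Feedzai’s implementation:

```python
# Global, in-model feature importances derived from the splits of a tree ensemble.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.3 * X[:, 2] > 0).astype(int)
feature_names = ["amount", "tx_count_24h", "mcc_risk", "hour"]

model = GradientBoostingClassifier().fit(X, y)

# Importance of each feature, accumulated from the gain at its splits.
for name, imp in sorted(zip(feature_names, model.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```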

Ablation Study and Shuffle Feature Importance

Example of the visualization of the Shuffle Feature Importance within Feedzai Risk Management platform

An ablation study helps us dig into the importance of individual features in a model’s performance. It’s a step-by-step process where we remove one feature at a time and retrain the ML model without it. The goal is to understand the feature’s role in changing the model’s performance.
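
A minimal sketch of an ablation loop, on invented data (an illustration, not Feedzai’s implementation):

```python
# Ablation study: drop one feature at a time, retrain, and compare the metric.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
y = (X[:, 0] + 0.3 * X[:, 2] > 0).astype(int)
names = ["amount", "tx_count_24h", "mcc_risk", "hour"]
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

def auc_of(train_X, val_X):
    clf = GradientBoostingClassifier().fit(train_X, y_tr)
    return roc_auc_score(y_val, clf.predict_proba(val_X)[:, 1])

baseline = auc_of(X_tr, X_val)
for i, name in enumerate(names):
    # Retrain without feature i; the drop in AUC estimates its contribution.
    auc = auc_of(np.delete(X_tr, i, axis=1), np.delete(X_val, i, axis=1))
    print(f"{name}: baseline {baseline:.3f} -> without it {auc:.3f}")
```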

Similarly, Shuffle Feature Importance also looks at how a specific feature affects a performance metric, but in a different way. Instead of removing features, it randomly shuffles each feature’s values on a particular data sample and evaluates the model’s performance. By comparing the shuffled performance with the original metric, we can assess the feature’s importance. This allows for a quicker evaluation than an ablation study, as it only requires training the model once. However, it can still be time-consuming on large datasets with numerous features, as every single feature’s values have to be shuffled.
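
For comparison, a hedged sketch of the shuffling approach using scikit-learn’s permutation importance, which implements the same idea on a held-out sample (again on placeholder data):

```python
# Shuffle Feature Importance: train once, then shuffle each feature on a
# validation sample and measure the drop in the metric.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
y = (X[:, 0] + 0.3 * X[:, 2] > 0).astype(int)
names = ["amount", "tx_count_24h", "mcc_risk", "hour"]
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier().fit(X_tr, y_tr)   # trained only once, unlike ablation

result = permutation_importance(model, X_val, y_val, scoring="roc_auc",
                                n_repeats=5, random_state=0)
for name, drop in zip(names, result.importances_mean):
    print(f"{name}: mean AUC drop {drop:.3f}")
```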

Both of these methods produce post hoc, feature-based, and global explanations.

TreeSHAP — LightGBM and Feedzai FairGBM interpretability

Before diving into TreeSHAP, we introduce the SHapley Additive exPlanations (SHAP) framework, which is based on the concept of Shapley values from cooperative game theory.

Shapley values assign a contribution to each player in a cooperative game. In the context of ML, each feature of the model is considered a player. These values help us allocate the contribution of each feature to the prediction, providing a more nuanced and interpretable understanding.
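
For intuition only, the brute-force sketch below spells out what a Shapley value is: each feature’s average marginal contribution over all coalitions of features, where a feature “absent” from a coalition is filled in with a background mean (one common convention, not the only one). It is exponential in the number of features, which is exactly why optimized estimators exist:

```python
# Toy exact Shapley values for one prediction. Purely illustrative; real SHAP
# implementations use far more efficient estimators.
from itertools import combinations
from math import factorial
import numpy as np

def shapley_values(predict, x, background):
    n = len(x)
    base = background.mean(axis=0)

    def value(coalition):
        # Model output with only the coalition's features "present".
        z = base.copy()
        z[list(coalition)] = x[list(coalition)]
        return predict(z.reshape(1, -1))[0]

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (value(S + (i,)) - value(S))
    return phi

# Usage (hypothetical model): phi = shapley_values(lambda a: model.predict_proba(a)[:, 1], X[0], X)
```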

The SHAP framework has several implementations tailored to specific model types, including Deep Learning and tree-based models, as well as a model-agnostic approach.

Grounded in the SHAP framework, TreeSHAP retains all of SHAP’s core tenets with optimizations aimed at tree-based algorithms. It harnesses the inherent tree structure to enhance both scalability and computational efficiency when explaining models such as LightGBM, Random Forests, or Feedzai’s FairGBM.

This methodology offers a distinctive vantage point over other feature importance methods, as it calculates Shapley values for each feature. These directly translate to the contribution of a given feature to the score.

This method produces post hoc, feature-based explanations while offering flexibility to produce both local and global explanations. Additionally, it allows the visualization of any positive and negative correlations between the score and observed feature values.
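
A hedged sketch of what this looks like in practice, on placeholder data and a vanilla LightGBM model (FairGBM models, being gradient-boosted trees as well, can be explained the same way):

```python
# TreeSHAP on a LightGBM classifier: local Shapley values per row, aggregated
# into a global view. Data and features are invented for illustration.
import numpy as np
import lightgbm as lgb
import shap

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
y = (X[:, 0] + 0.3 * X[:, 2] > 0).astype(int)

model = lgb.LGBMClassifier(n_estimators=100).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)        # local: one contribution per feature per row
if isinstance(shap_values, list):             # some shap versions return one array per class
    shap_values = shap_values[1]

# Global view: mean absolute Shapley value per feature.
print(np.abs(shap_values).mean(axis=0))
```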

TimeSHAP — Analyzing RNNs over Time

TimeSHAP explanations

TimeSHAP is an adaptation of the SHAP framework specifically designed to facilitate the explanation of models that receive a sequence as an input, rather than isolated events, such as Recurrent Neural Networks (RNNs). This algorithm increases the space of explanations from features only to features and events.

It is a post hoc, feature and example-based, local, and global explanation method. It offers flexibility for analyzing the importance of the features, the events, or the combination of both. This provides a higher degree of explainability to sequence deep learning models, which are often regarded as hard to debug and interpret.
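
To give a feel for why event-level attributions matter, here is a toy occlusion-based illustration on an untrained sequence model; it is emphatically not the TimeSHAP algorithm or API, which computes proper Shapley values over events and features rather than this naive one-at-a-time masking:

```python
# Occlude one past event at a time in a client's transaction sequence and watch
# how a sequence model's score moves. Model weights are random; for illustration only.
import torch
import torch.nn as nn

class TinyRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(input_size=4, hidden_size=8, batch_first=True)
        self.head = nn.Linear(8, 1)

    def forward(self, x):
        _, h = self.rnn(x)                     # final hidden state summarizes the sequence
        return torch.sigmoid(self.head(h[-1]))

torch.manual_seed(0)
model = TinyRNN().eval()
seq = torch.randn(1, 6, 4)                     # 1 client, 6 past events, 4 features each
base_score = model(seq).item()

for t in range(seq.shape[1]):
    occluded = seq.clone()
    occluded[0, t] = 0.0                       # "remove" event t
    print(f"event {t}: score change {model(occluded).item() - base_score:+.4f}")
```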

Read our in-depth blogpost to learn more about TimeSHAP!

Making Informed Decisions For Risk Analysts

The Risk Detection field is always evolving, since fraudsters and other criminals uncover new ways to circumvent AI systems. The intervention of a human decision-maker, such as a Risk Analyst, is usually essential to disambiguate edge cases. Explanations help Risk Analysts make better decisions, quickly review suspicious transactions, and ultimately protect people from fraudsters.

Whitebox explanations

Whitebox explanations available in Case Manager, lightly anonymized

Whitebox Explanations (WBE) is a tool that provides insights into the predictions of ML models in Case Manager, the Feedzai product that risk analysts use to review cases and alerts produced by AI systems. When analysts investigate a case, they have access to the case information together with the ML model’s risk score.

WBE allows analysts to understand why a transaction received a certain score by showing the patterns that most affect the ML score. These patterns are presented as natural-language descriptions of the features, which analysts can easily interpret. Thus, WBE surfaces the different reasons behind a transaction alert and offers insight into why the ML model assigns higher or lower scores to certain transactions.
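
As a purely hypothetical sketch of the general idea (not Feedzai’s WBE implementation), one can imagine mapping the top-contributing features of an alert to short natural-language statements via templates:

```python
# Hypothetical template-based rendering of top feature contributions as plain language.
# Feature names, templates, and contribution values are all invented.
TEMPLATES = {
    "amount_vs_avg_30d": "The amount is {value:.1f}x this card's 30-day average.",
    "tx_count_1h": "The card made {value:.0f} transactions in the last hour.",
    "new_merchant": "This is the first transaction at this merchant.",
}

def render_explanation(contributions, values, top_k=3):
    """contributions: {feature: contribution to the score}; values: {feature: raw value}."""
    top = sorted(contributions, key=lambda f: -abs(contributions[f]))[:top_k]
    return [TEMPLATES[f].format(value=values[f]) for f in top if f in TEMPLATES]

print(render_explanation(
    {"amount_vs_avg_30d": 0.21, "tx_count_1h": 0.13, "new_merchant": 0.04},
    {"amount_vs_avg_30d": 5.2, "tx_count_1h": 7, "new_merchant": 1},
))
```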

Concept-based explanations

Concept-based explanations utility for non-technical personas

Risk analysts often do not have a deep technical background in ML. But these professionals excel at spotting fraud patterns and distinguishing between fraudulent and legitimate activities.

A more effective approach is to leverage risk analysts’ expertise and reasoning skills by highlighting the suspicious or genuine behaviors related to a model’s decision. In other words, it is about explaining the reasons, or concepts, behind the model’s predictions.

Thus, concept-based explanations allow risk analysts to quickly verify the necessary information that supports or contradicts these identified concepts. Instead of iterating through all available data or delving into intricate low-level explanations like feature contributions, they can concentrate on specific information. This streamlined process enhances the performance of the human + AI system.

Conclusion

In a high-risk domain such as Fraud Detection, where making wrong decisions can have severe consequences, AI Explainability is crucial. It helps build trust in the system and make more informed decisions.

When discussing explanations and methods, there is no one-size-fits-all approach. The most important questions are: who are the personas that will use the explanations, and what are their goals?

The decision starts with defining the end personas and their explanation requirements. After that, we should decide when we want to have explainability. For instance, it is possible to incorporate explainability when building an ML model (in-model explainability) or produce explanations for a model that is already trained (post-model).

Finally, given the end persona requirements defined at the beginning (for instance, Diversity and Human-Interpretability), we can decide how to present the explanations. If the persona has deep ML knowledge, the most important features, or even some internal model mechanisms, can serve as explanations. For high-level decision-makers, counterfactual or concept-based explanations are a better fit.

At Feedzai, we’re working hard on tools that explain ML models and provide valuable insights about their decision process, taking into account the personas’ requirements. We want to give people the info they need to make smart decisions in situations where it really matters.
