The Lifecycle View of Trustworthy AI

Olivia Buzek
IBM Data Science in Practice
5 min read · Oct 22, 2021

Co-written with Kush Varshney

Photo by Suzanne D. Williams on Unsplash

As artificial intelligence (AI) increasingly powers critical and high-risk enterprise workflows, developers of AI systems must ensure that the decisions AI makes for people can be trusted. Our team at IBM has previously described the five pillars of trust: fairness, robustness, explainability, privacy, and transparency. To help developers create trusted AI solutions, IBM Research has released multiple open-source toolkits in this space based around the pillars — AI Fairness 360, Adversarial Robustness 360, AI Explainability 360, AI Privacy 360, Uncertainty Quantification 360, Causal Inference 360, and AI FactSheets 360.

Since the first release of these tools, the team has learned a lot about designing essential tools for AI developers. In the process, we’ve realized that pillars may not be the most intuitive way for developers to pick up new tools. Today, we’ll talk through the next evolution of trust tools, where we start to design around the AI lifecycle.

Matching AI Developers’ Mental Models

Let’s look at this problem through the lens of a common example in trustworthy AI: automating mortgage approvals.

If you’re a developer building a mortgage approval model or an executive who has to approve the system’s behavior, you’re probably not thinking about the five pillars of trustworthy AI when you approach your problem space. Executives typically think Big Picture: “how can I ensure our customers trust our mortgage approval process?” Developers, meanwhile, just want to know, “how can I measure whether we are approving the right mortgages?”

Across industries and problem spaces, people tend not to think in terms of pillars of trust individually, but holistically, without differentiating much between fairness, privacy, robustness, explainability, and transparency. AI developers want to quickly connect requirements around safety and compliance to the exact steps they need to take. Typically, the developers who need trustworthy AI tools are already well-versed in framing their problems around the AI lifecycle. When those developers approach using trust tools, they want to know what extra operations to perform when preprocessing data, and what options they have to replace their classifier with a trust-informed one. As toolmakers, we should build our tools around the AI lifecycle so we can build on the concepts our users already know well.

After experimenting with IBM’s trust toolkits in the wild, we’ve found that the toolkits don’t yet achieve this level of usability. Users of trust tools encounter two fundamental barriers. First, mapping their problems onto the pillars isn’t natural for people who haven’t been deep in the trustworthy AI domain. Second, even once the trust problem is broken down with a pillar approach, crafting a complete solution from the tools available can be confusing.

Problem 1: Using Pillars to Understand Users’ Trust Concerns

Photo by Tierra Mallorca on Unsplash

The developer building our mortgage-approval model knows that the company’s users want the answer to a simple question: why didn’t my mortgage get approved? But that question doesn’t map cleanly onto the pillars. Breaking it down surfaces several more specific questions, each of which maps onto a different pillar (not every pillar applies to every problem):

  • Fairness — Why did my friend get approved, but I didn’t?
  • Explainability — What do I need to do differently to get approved next time?
  • Privacy — Is my data safe with this company?

Breaking down end-user concerns in this way requires a deep understanding of the trust domain, more than even experienced machine learning developers and data scientists typically have. Yet when our mortgage-approval developer goes to select a tool for her task, she’s faced with tools that assume she already has this knowledge.

As builders of trust tools, we know if we want our tools to make a difference for our clients, we need to design them from a more holistic perspective. That perspective should include all of the major pillars in one tool, with more guidance about where to begin.

Problem 2: Combining Trust Methods Creates Conflicts

Say you’re the data scientist responsible for picking a machine learning algorithm to predict whether a mortgage should be approved. You know you want to make your model-building process trust-aware, so you go to check out what’s recommended for your problem.

What you find is a whole host of logistic regression variants, each living in a different toolkit organized around a single pillar.

Three different toolkits; three different and conflicting approaches. You can only select one model to predict which mortgages to approve, so you may conclude that you have to choose between a fair model, an explainable model, and a model that protects user privacy. Worse, if you’ve already trained a model using standard AI development tools, replacing your model with any trust-informed model may not seem worth the potential tradeoff in accuracy.
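
To make the dilemma concrete, here is a deliberately hypothetical sketch. The commented imports below are placeholders, not real module paths from the 360 toolkits; the point is simply that a scikit-learn-style pipeline has one slot for the approval model, so picking one pillar’s drop-in classifier means the other pillars go unaddressed at that step.

```python
# Hypothetical illustration of the conflict: the commented imports stand in for the
# pillar-specific estimators that separate toolkits expose (these names are made up).
#
#   from fairness_toolkit import FairLogisticRegression             # fairness pillar
#   from explainability_toolkit import RuleBasedLogisticRegression  # explainability pillar
#   from privacy_toolkit import DPLogisticRegression                # privacy pillar
#
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# The pipeline has exactly one model slot; only one of the candidates can fill it.
approval_pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("approve", LogisticRegression()),  # swap in at most one trust-aware variant here
])
```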

Navigating Trust Tools

Photo by Azzedine Rouichi on Unsplash

Because the trust methods are distributed across the pillars, these conflicts aren’t obvious to the end user. If we instead arrange the methods along the AI lifecycle, it becomes easier to see trust modifications that address all of the relevant pillars at once.

  • For fairness, you might focus on data rebalancing approaches, attaching weights to the training data points so that traditionally unprivileged groups in the dataset are given increased importance by the learning algorithm. Reweighting samples like this is a common trust operation that shows up across the pillars.
  • For explainability, you might explore surrogate model approaches, in which you train an additional, more interpretable model at training or prediction time to approximate your original model.
  • For privacy, you might apply data transforms that generate an artificial version of the training data, matching the distribution of the original records without reproducing any of them.

Since these methods address different parts of the lifecycle, they can be used together without conflict, as the sketch below illustrates. By framing our approach around the entire machine learning lifecycle, and giving users a clearer view of the essential operations that make up a trust solution, we can build tools that support complete solutions for trust.
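
As a rough illustration of that composition (and not the API of any of the 360 toolkits), the sketch below strings the three operations together on toy mortgage-style data: a naive per-group Gaussian synthesizer stands in for a privacy-preserving data transform, reweighing supplies fairness weights at preprocessing time, and a shallow decision tree fit to the trained model’s predictions acts as a global surrogate explainer. The column names, the synthesizer, and the thresholds are all assumptions made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy mortgage-style data: a protected attribute, two numeric features, a label.
df = pd.DataFrame({
    "group": rng.integers(0, 2, 1000),        # protected attribute (0 = unprivileged)
    "income": rng.normal(60, 15, 1000),
    "debt_ratio": rng.uniform(0, 1, 1000),
})
df["approved"] = ((df["income"] > 55) & (df["debt_ratio"] < 0.6)).astype(int)

# Privacy (pre-processing): swap the real records for synthetic ones drawn from
# per-(group, label) Gaussians fit to the numeric features, so the released table
# matches the original distribution without repeating any actual record. Real
# privacy tooling would add formal guarantees; this is only a stand-in.
synth = df.copy()
for _, block in df.groupby(["group", "approved"]):
    for col in ("income", "debt_ratio"):
        synth.loc[block.index, col] = rng.normal(
            block[col].mean(), block[col].std(ddof=0), len(block))

# Fairness (pre-processing): reweighing, w(g, y) = P(g) * P(y) / P(g, y), which
# up-weights (group, label) combinations that are under-represented in the data.
g = synth["group"].to_numpy()
y = synth["approved"].to_numpy()
p_g = np.array([np.mean(g == v) for v in (0, 1)])
p_y = np.array([np.mean(y == v) for v in (0, 1)])
p_gy = np.array([[np.mean((g == gv) & (y == yv)) for yv in (0, 1)] for gv in (0, 1)])
weights = p_g[g] * p_y[y] / p_gy[g, y]

# Training: any standard classifier, fed the reweighted synthetic data.
X = synth[["income", "debt_ratio"]]
model = GradientBoostingClassifier(random_state=0).fit(X, y, sample_weight=weights)

# Explainability (post-hoc): a shallow surrogate tree fit to the model's predictions
# gives an interpretable global approximation of its decision logic.
surrogate = DecisionTreeClassifier(max_depth=3).fit(X, model.predict(X))
print("surrogate agreement:", (surrogate.predict(X) == model.predict(X)).mean())
```

Because each step attaches to a different point in the pipeline, none of them competes for the single model slot that caused the conflict in the previous section.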

Designing Trust Tools for Everyone

We’ve learned quite a bit since we first released the 360 toolkits organized around the trustworthy AI pillars. Now that we’ve exposed the challenges developers encounter with current trust tools, we can start reimagining tools that fit more naturally with client and user needs around trust. Stay tuned for the next installment of our journey, in which we’ll tell you more about how we’re progressing toward a solution architecture.

