Building Responsible AI for Credible Machine Learning
Dr. Scott Zoldi, Chief Analytics Officer of FICO, recently addressed a Responsible AI panel in São Paulo, Brazil. This article is an edited version of the transcript of his talk.
It seems that the tech world is in an arms race to build the most powerful, and perhaps most headline-grabbing, artificial intelligence (AI) systems. But to state the obvious, a big part of the problem in the AI arms race is that data scientists are not building models in ways that they themselves can understand. Models are treated as data science projects; larger, more complex models are deemed more interesting, and data scientists focus on predictive power as the ultimate performance metric. These data scientists are not focused on interpretability or explainability, and without those priorities, it is difficult to explain how AI models work with any level of certainty.
A lot of Explainable AI … isn’t
To make things worse, the AI industry has created explainable AI algorithms that are themselves not truly effective. Explainable AI has been around for more than two decades; it is not a new concept, but depending on which explainable AI algorithm is used, data scientists get different reasons for why the model produced the decision it did.
As a result, when data scientists build AI models they cannot explain, they apply an explainable AI algorithm to produce reasons. But because the models themselves are not interpretable, there is little certainty that those reasons correctly describe why the model produced the decision it did. This is not a very good way to explain a model.
Machine Learning injects uncertainty and bias
Machine learning (ML) is the fuel that feeds AI systems. But the problem is, it’s very difficult to explain how machine learning works.
Figure 1 shows a diagram of a neural network with three latent features. Each latent feature learns relationships across all of its input connections and applies a tanh() function to determine whether it is activated. If it is, the latent feature passes that information to the next layer. Latent features that saturate can signal behaviors which, in combination, can lead to “high credit risk” or “low credit risk” in the model’s decisioning.
The challenge is that there are many combinations of inputs that can cause a latent feature to saturate. We call these ‘modes of activation’; when a latent feature saturates, we do not know with certainty which relationship(s) actually drove the outcome, because the activation is typically multi-modal.
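The sketch below is a rough illustration of what multi-modal activation means; it is not FICO’s implementation, and the weights and input patterns are purely hypothetical. A single tanh latent feature saturates for two very different input patterns, so the activation value alone does not reveal which relationship fired.

```python
# Minimal sketch (hypothetical weights and inputs, not FICO's model):
# one tanh latent feature over three credit-style inputs, saturated by
# two very different input combinations ("modes of activation").
import numpy as np

def latent_feature(x, w, b):
    """One hidden unit: weighted sum of inputs squashed by tanh."""
    return np.tanh(np.dot(w, x) + b)

# Hypothetical learned weights for inputs:
# [utilization ratio, recent delinquencies, inquiries in last 6 months]
w = np.array([2.5, 1.8, 1.2])
b = -1.0

mode_a = np.array([0.95, 0.0, 1.0])   # high utilization, no delinquencies
mode_b = np.array([0.10, 2.0, 0.0])   # low utilization, repeated delinquencies

# Both patterns drive the same latent feature close to +1 (saturation),
# so the activation alone does not say which relationship fired.
print(latent_feature(mode_a, w, b))   # near +1
print(latent_feature(mode_b, w, b))   # near +1
```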
To resolve what is causing the latent features to fire, and which of multiple modes are expressed when that latent feature fires, we need new types of algorithms. At FICO we are focused on interpretable machine learning algorithms. We emphasize that all ML models need to be interpretable, not explainable.
Steps toward ML interpretability
FICO achieves interpretability by applying a different type of constraint when training the neural network, one that minimizes the connections. As a result, we minimize the number of modes that can activate each latent feature. Moreover, this is how FICO addresses bias in models: by understanding how the learned modes of activation differ across classes of customers. It is illegal in many parts of the world to address bias by looking at the outcome of a model, such as the scores of the Orange Group and the Purple Group in Figure 2, where the Orange Group will consistently get higher scores than the Purple Group.
Latent feature activation should be similar across classes. This does not reflect who is going to get more credit and who is going to get less credit. What it does say is that, for example, for latent feature 33, the relationship between the total mortgage balance and the number of collection events on amounts greater than $500 drives a different rate of activation in these two classes. We can remove that relationship by putting additional constraints into this interpretable model that disallow the relationship learned by latent feature 33. In this way, we expose inherently biased relationships between inputs and remove them from the solution, illustrating the paramount importance of using an interpretable neural network.
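The sketch below illustrates these two steps in a generic way; it is not FICO’s training code, and the data, penalty strength, and saturation threshold are all hypothetical. An L1 penalty on the hidden-layer weights limits how many inputs feed each latent feature, and a comparison of saturation rates across two customer groups flags latent features whose behavior differs between them.

```python
# Minimal sketch (hypothetical data and settings, not FICO's training code):
# sparsity-constrained tanh network plus a per-group saturation-rate check.
import torch
import torch.nn as nn

torch.manual_seed(0)

n_inputs, n_latent = 20, 8
model = nn.Sequential(nn.Linear(n_inputs, n_latent), nn.Tanh(),
                      nn.Linear(n_latent, 1))

# Synthetic stand-ins: inputs X, labels y, and a group indicator g.
X = torch.randn(1000, n_inputs)
y = torch.randint(0, 2, (1000, 1)).float()
g = torch.randint(0, 2, (1000,))          # 0 = "Orange Group", 1 = "Purple Group"

opt = torch.optim.Adam(model.parameters(), lr=1e-2)
bce = nn.BCEWithLogitsLoss()
l1_strength = 1e-2                        # strength of the sparsity constraint

for _ in range(200):
    opt.zero_grad()
    loss = bce(model(X), y)
    loss = loss + l1_strength * model[0].weight.abs().sum()   # fewer connections
    loss.backward()
    opt.step()

# Compare how often each latent feature saturates (|tanh| > 0.9) per group.
with torch.no_grad():
    h = torch.tanh(model[0](X))
    saturated = (h.abs() > 0.9).float()
    rate_orange = saturated[g == 0].mean(dim=0)
    rate_purple = saturated[g == 1].mean(dim=0)
    # Latent features with large gaps are candidates for extra constraints
    # that disallow the biased relationship they have learned.
    print((rate_orange - rate_purple).abs())
```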
Choosing a transparent architecture
We can understand which of these latent features fire and what drives them to fire. We can also indicate what relationships have shown bias in our data and disallow them in the solution. This is critical from a Responsible AI perspective because we, as data scientists, can choose an architecture that is transparent. We can choose an architecture that exposes bias, and we can remove that bias by applying additional constraints.
This approach is central to how we at FICO view responsible AI usage; it requires this level of visibility. Some people will say, “The bigger model has to be better. It learns more things about the data.” The truth is that more complicated models which are not interpretable generally do not perform any better, particularly out of time. They have learned noise in the modeling data set, they have learned biases, and they are generally not going to be robust out of time once we start to use the model. So a key part of using AI responsibly is not only building a model that is interpretable and ethical, but also showing that it remains so in the production environment.
Auditable AI is essential
Today, most organizations do not have a record of how an AI model was built; they have to ask a data scientist. The business owners generally do not understand how the model works, and they simply blindly trust the AI. Therefore we need, as a part of Responsible AI and AI governance, to define a corporate model development standard for how all models will be developed. With this standard there is an approved set of algorithms for processing data, an approved set of algorithms to build the model, and an approved set of algorithms for how to achieve an interpretable model and Ethical AI.
The reality is, if you have 100 data scientists and lack a single corporate model development standard, you will end up with 100 different ways that data scientists build models. This approach is not governed and is downright dangerous. Auditability involves developing a blueprint of the model development process an organization will use, and then demonstrating that the data science team has followed it.
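As a toy illustration of what codifying such a standard can look like, the sketch below encodes approved sets of algorithms as data and checks a project against them; the stage names and algorithm names are hypothetical, not FICO’s actual standard.

```python
# Minimal sketch (hypothetical names, not FICO's actual standard):
# a model development standard encoded as data, with a simple check
# that flags any algorithm a project uses outside the approved sets.
APPROVED = {
    "data_processing": {"winsorize", "target_encode", "impute_median"},
    "model_training": {"interpretable_nn", "scorecard", "constrained_gbm"},
    "interpretability": {"sparse_latent_features", "monotonicity_constraints"},
}

def check_project(project_choices):
    """Return a list of (stage, algorithm) pairs that violate the standard."""
    violations = []
    for stage, algorithms in project_choices.items():
        for algo in algorithms:
            if algo not in APPROVED.get(stage, set()):
                violations.append((stage, algo))
    return violations

# A hypothetical project that uses one non-approved training algorithm.
project = {
    "data_processing": {"impute_median"},
    "model_training": {"unconstrained_deep_net"},
}
print(check_project(project))   # [('model_training', 'unconstrained_deep_net')]
```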
Codifying the process with blockchain technology
At FICO the model development governance standard involves two major components:
- Codification of the model development process
- An audit trail for operational use of the model.
First, every model development entails a list of requirements and objectives. It defines which work is to be done in different development sprints, the success criteria, and how progress is monitored. It identifies the named resources working on the model requirements, as well as the approvals confirming how those requirements were met.
To demonstrate that these requirements are taken seriously, we put model development management on a blockchain. FICO uses blockchain technology because it provides three critical pieces of information:
- An immutable record of the decisions made throughout the model development process
- Whether or not different aspects of the development process were followed
- What the resulting outcomes look like in production, what to monitor, and when the model may be failing Responsible AI guardrails in operation.
By using blockchain for model development management, we know what drives a model and how it works. We also know what would cause the model to not work well in production and how to remediate it.
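The sketch below illustrates the underlying idea of an immutable record; it is a simplified hash chain, not FICO’s patented system, and the recorded decisions are hypothetical examples. Each development decision is appended together with the hash of the previous entry, so any later change to a past entry breaks verification.

```python
# Minimal sketch (an illustration only, not FICO's patented system): a
# hash-chained ledger of model-development decisions; altering any past
# entry invalidates every hash that follows it.
import hashlib
import json
import time

def add_entry(chain, author, decision):
    """Append a signed-off development decision, chained to the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    record = {"author": author, "decision": decision,
              "timestamp": time.time(), "prev_hash": prev_hash}
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    chain.append(record)
    return chain

def verify(chain):
    """Recompute every hash; return False if any entry was altered."""
    for i, record in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        body = {k: v for k, v in record.items() if k != "hash"}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if record["prev_hash"] != expected_prev or record["hash"] != recomputed:
            return False
    return True

chain = []
add_entry(chain, "data_scientist_1", "Approved variable list for sprint 1.")   # hypothetical decisions
add_entry(chain, "model_governance", "Constraint added to disallow a biased latent-feature relationship.")
print(verify(chain))   # True; editing any past entry makes this False
```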
The blockchain shows both successes and failures in working to meet the requirements, all the way to the point where we reach the final model. This drives accountability; data scientists take their jobs very seriously when they have signed their names to the immutable blockchain on work products associated with model development. The quality of the models goes up as the seriousness and the impact of building these models become clear.
This is what model governance is all about: as the model is developed, it is built to a model development standard that enforces the company's best practices and shows they were adhered to. It creates a record that can be presented to regulators and, moreover, it specifies the conditions under which the model is valid for use in production. Ultimately, this is what makes machine learning and artificial intelligence credible, and AI responsible.
Dr. Scott Zoldi is chief analytics officer at FICO, responsible for artificial intelligence (AI) and analytic innovation across FICO’s product and technology solutions. While at FICO, he has authored more than 120 analytic patents, with 80 granted and 47 pending. Scott is an industry leader at the forefront of Responsible AI and an outspoken proponent of AI governance and regulation. His groundbreaking work in AI model development governance, including a patented use of blockchain technology for this application, has helped propel Scott to AI visionary status, with recent awards including a Future Thinking Award at Corinium Global’s Business of Data Gala. Scott serves on the boards of directors of Software San Diego and the San Diego Cyber Center of Excellence. He received his Ph.D. in theoretical and computational physics from Duke University.
About the project
The text above is part of the project “Governance in Artificial Intelligence (AI): a framework for the creation of an Ethics Committee for AI projects.” Based on a systematic review of the existing literature and on debates with the main stakeholders in this ecosystem, the research aims to: (i) develop a framework for forming an AI ethics committee, defining its objectives, models, attributions, responsibilities and powers, structure, and members; (ii) produce academic articles on the topic; and (iii) present the results at an academic event to widely disseminate the main findings. Follow the project on this channel and on our social networks to keep up with the discussions.
The project is carried out by CEPI FGV Direito SP, coordinated by Professor Marina Feferbaum and led by Alexandre Zavaglia and Guilherme Forma Klafke, with a research team composed of Deíse Camargo Maito and Lucas Maldonado Diz Latini. The research is supported by the companies B3 — A Bolsa do Brasil, DASA, FICO, and Hub Mandic.
How to cite this article:
CEPI. Building Responsible AI for Credible Machine Learning. Medium, 2023. Available at: https://medium.com/o-centro-de-ensino-e-pesquisa-em-inova%C3%A7%C3%A3o-est%C3%A1/building-responsible-ai-for-credible-machine-learning-98d6db33babc
The opinions expressed in this article are the sole responsibility of the authors and do not necessarily reflect the institutional position of CEPI and/or FGV and/or the partner institutions.