Explainable AI: a strategic tool for enterprises

A viewpoint on using explainability to guide AI systems design upfront

Byoungchan (Ben) Eum
CodeX
Jul 24, 2021


Photo by Vlado Paunovic on Unsplash

I guess many of you will agree that XAI (eXplainable AI) has been one of the most important topics in the artificial intelligence field over the last several years, especially from the perspective of AI adoption in enterprises.

Understanding and evaluating how an AI system examines input data and produces its output is very important because it helps AI systems earn ‘universality’ and penetrate our daily lives and business processes successfully. XAI will play a pivotal role in building trust and establishing a mechanism of mutual learning, ultimately nurturing natural collaboration between humans and machines.

For several years now, a variety of stakeholders in academia and industry have been preaching the importance of XAI, and not only big tech companies such as IBM, Google, and Microsoft but also startups in this niche market such as Truera, Fiddler Labs, and Kyndi are talking about and selling XAI solutions.

In this article, I’d like to revisit the fundamental meaning of XAI, the common goals companies are applying XAI toward at the moment, and a strategic viewpoint we need to adopt to leverage XAI for competitive advantage in the enterprise.

Below is the table of contents:

  • The complexity of the ‘why’ question
  • The essential meaning of XAI, and 6 practical needs
  • The limitation in current mainstream discussion on XAI
  • Beyond ‘universality’: XAI as a strategic tool

On May 7th, 2016, a forty-year-old man named Joshua Brown was driving his Tesla Model S sedan in Florida, USA. The Model S was in ‘autopilot’ mode when it collided with a tractor-trailer that was crossing its path on US Highway 27A. Joshua Brown died in the accident, and NHTSA opened an investigation right after Tesla reported the crash.

According to the report, while the Model S was in ‘autopilot’ mode on the highway, a tractor-trailer with its side painted white was making a left turn across its path. Ideally, either the driver or the autopilot system should have applied the brakes, but neither did. The driver, Joshua Brown, was fatally injured when the windshield of the Model S, traveling at 120 km/h, struck the underside of the trailer. In its preliminary report, NHTSA noted that the sky was very clear at the time of the accident, making it difficult to distinguish the white-painted trailer from the background sky, and that as a result neither the Model S nor the driver recognized the trailer or activated the braking system.

It was the first death in Tesla’s ‘autopilot’ mode of operation, and it attracted a great deal of attention from the media and the general public. It is probably a natural reaction to wonder why this happened, and everyone is interested in what to do to prevent similar incidents in the future, who is responsible if they happen, and most of all, how these technologies should be improved.

The Complexity of the ‘Why’ Question

The question of ‘why’ is very complex and difficult to answer. Maybe it’s fair to say the question of ‘why’ has many different contexts in itself.

From an HCI (Human-Computer Interaction) designer’s perspective, the key question will be ‘why did the driver come to trust the autopilot system to the point where he felt he did not need to watch the road ahead, even when the system repeatedly warned him?’. Perhaps the designer would redesign the warning system of the vehicle software or create a new mechanism that makes users watch the road (in fact, Tesla modified the autopilot system to operate under stricter standards after this accident).

A transportation systems engineer would be interested in improving the visibility of various road signs. The truck designer or traffic regulator would ask whether a different installation of a median strip or guardrail could have prevented the vehicle from being crushed under the trailer and, consequently, the driver’s death.

In the case of an artificial intelligence researcher, the question must be ‘why did the autopilot system fail to recognize a trailer truck that was visible to the human eye?’. After the accident, Tesla said, “Neither the autopilot system nor the driver himself could distinguish the white-colored side of the trailer truck from the sky, which was very bright at the time. That’s why the braking system didn’t engage in this situation.” Later, Tesla CEO Elon Musk said, “To prevent incorrect brake operation, the radar system has been set to ignore unnecessary road objects or signals.”

We still don’t know exactly what technology Tesla uses to design these systems, or whether this explanation is really accurate. All of the questions we’ve looked at above are questions that need to be addressed under the theme of AI explainability.

If we accept these different angles of explanation, the next question will be how to improve the computer vision model: by training it with more data to detect white trucks against a bright sky, or by changing the model architecture itself somehow. Perhaps it is necessary to improve the radar system to prevent such incidents without generating false positive signals.

Fundamental Meaning of XAI: ‘Learning’ Different Viewpoints

As you can see in the paragraphs above, the explainability of artificial intelligence is often discussed in the context of identifying why the unexpected or abnormal behavior of an artificial intelligence system occurred, and of course this is a very important and legitimate reason for XAI to exist.

However, I believe the meaning of XAI at the fundamental level is 1) to help us humans check and understand how an artificial intelligence system ‘recognizes’ objects and phenomena in somewhat different ways, and 2) to help us ‘learn’ new ways of understanding the world ourselves, eventually reaching beyond our traditional viewpoints. Hence, XAI is one of the important pieces we need to incorporate into any artificial intelligence system so we can complete the ‘continuous feedback loop’ of mutual learning in both directions.

As I wrote in my other article (‘Memorable Match between AlphaGo vs Lee Sedol’), if Lee Sedol, one of the greatest Go players in the world, gained a new perspective on creativity in the game of Go from AlphaGo, there must be huge potential for us to learn from the various artificial intelligence systems we encounter on a daily basis.

XAI should be treated as one of the key tools for making intelligent collaboration between humans and machines possible, with the goal of extending the knowledge and experience we’ve built over a long period of time.

Practical Reasons for AI Explainability

AI explainability is used to investigate a model’s abnormal behavior or failure, as we saw above in the autonomous vehicle accident, but we can apply XAI techniques for many other purposes.

There are lots of articles and papers describing the practical reasons why we need XAI, and I think most of the reasons fall into one of the six categories I mention below:

1. Generalization

One of the basic best practices in the training process of artificial intelligence models is to separate the data into different sets. It is widely known that two AI models can show the same level of performance on the ‘validation dataset’ yet differ significantly on the ‘test dataset’. In other words, model performance at the ‘validation phase’ does not function as a valid indicator of how the AI model will perform in the ‘real world’.

If a model works well on training data but not on real-world data, we say the model ‘does not generalize’. If we can explain what part of the data the model focuses on during its calculation, the model developer will be able to choose the model that is most likely to generalize among several models that have the same performance on the validation dataset.
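To make the splitting practice above concrete, here is a minimal scikit-learn sketch (not from the original article) that holds out separate validation and test sets and compares their scores; a large gap between the two is one warning sign that the validation score is not a reliable proxy for real-world performance.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real problem.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# 60% train, 20% validation, 20% test.
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_hold, y_hold, test_size=0.5, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("validation accuracy:", model.score(X_val, y_val))
print("test accuracy:      ", model.score(X_test, y_test))
# The raw scores alone cannot tell you *why* a gap exists (e.g. a spurious
# feature the model relies on); that diagnosis is where explanations help.
```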

[Ribeiro 2016] Raw data and explanation of a bad model’s prediction in the “Husky vs Wolf” task

Let me give you a well-known example. Imagine a computer vision model trained to distinguish between “husky” and “wolf” images. During development, the model showed high performance in identifying images of wolves. But when deployed in the real environment, the model repeatedly identified images of huskies as ‘wolf’, even though the images were clearly huskies to the human eye.

Using an explainability technique, we can observe which areas of the image the AI model looks at to decide whether it is an image of a wolf, and we find that the model strongly associates the ‘snow’ in the image with its conclusion that the image is of a wolf. This association happened because, coincidentally, many of the ‘wolf’ pictures used in the training process had backgrounds that included snow.

If this factor had been identified during the training process, the developer would have realized that the training dataset was not properly collected and would have predicted that the model would not be likely to “generalize” to the real-world environment.
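The husky-vs-wolf explanation above comes from LIME [Ribeiro 2016]. Below is a minimal sketch of how such a superpixel-based explanation can be produced with the open-source lime package; the `predict_proba` function and the placeholder image are hypothetical stand-ins, not artifacts of the original study.

```python
import numpy as np
from lime import lime_image

def predict_proba(images: np.ndarray) -> np.ndarray:
    """Stand-in for a husky/wolf classifier: batch of HxWx3 images -> [p(husky), p(wolf)].

    In practice this would wrap the real model's forward pass; a dummy keeps
    the sketch self-contained and runnable.
    """
    scores = np.random.rand(len(images), 2)
    return scores / scores.sum(axis=1, keepdims=True)

image = np.random.rand(224, 224, 3)  # placeholder for an actual husky photo

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, predict_proba, top_labels=2, hide_color=0, num_samples=1000
)

# Highlight the superpixels that push the prediction toward the top label;
# in the Ribeiro 2016 study these turned out to be the snowy background.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
```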

In some cases, we encounter explanations that go ‘against’ our intuitions. It is important for the stakeholders, not only the developers but also the users of AI systems, to investigate the data or the phenomenon themselves and to apply explainability techniques so they can determine whether the phenomenon is actually significant or should be rejected.

2. Regulatory Compliance, Accountability, and Fairness

For industries in a strongly regulated environment, such as financial services, it is often mandatory to provide an “explanation” of a prediction made by an artificial intelligence model. In the United States, for example, a customer applying for a loan is guaranteed the right to ask the lender to explain why the application was rejected, under the Equal Credit Opportunity Act (ECOA). Therefore, introducing AI services into these areas of business requires the use of ‘explainable’ artificial intelligence models.

As the artificial intelligence industry is still in its infancy, the regulatory environment is immature but rapidly changing. The European Union (EU) enacted the General Data Protection Regulation (GDPR), which enables EU citizens to ‘request significant information about the logic associated with any automated decision making’. In many cases, this is understood as a ‘right to explanation’, so for any AI-based decision that can affect EU citizens, it is highly likely that explainability will be required as well.

David Hansson’s tweet on the bias in the algorithm for Apple Card applications

An incident in 2019 is a good example. Apple collaborated with Goldman Sachs and Mastercard to release a physical credit card, the Apple Card. It was soon pointed out that the underlying artificial intelligence model showed gender-related social biases when setting users’ credit limits.

In early November 2019, a few months after the Apple Card was released, ‘Ruby on Rails’ creator David Hansson tweeted that Apple’s algorithm gave his wife a credit limit only one-twentieth of his, even though they had submitted the same tax data and had lived together for a very long time.

Many people, including Apple co-founder Steve Wozniak, reported similar experiences and demanded that the credit limit calculation algorithm for Apple Card subscribers be corrected. The New York State Department of Financial Services also started to investigate whether Apple and Goldman Sachs’ credit limit setting practices violate New York state law.

It is not a simple matter to determine how to approach and resolve these issues. Simply ‘eliminating’ the problematic feature, just because the model shows ‘discriminatory’ aspects from the general public’s perspective, might not be the optimal solution. The specific contexts in which the algorithms operate need to be considered in order to deeply understand, and appropriately manage, the tension between public interest and private autonomy (e.g., strategic choice).

If something goes wrong in a very important decision-making process, a way of determining responsibility is also required. If the problem lies in an inaccurate prediction made by the AI model, it is possible to determine the responsible party only if we understand the cause of the error. You can easily guess that this is a very complex area that leads to social and legal discussions beyond the realm of artificial intelligence system development. Nevertheless, artificial intelligence systems deployed in mission-critical business environments must be developed with this need for ‘accountability’ in mind. If we can explain the reason for a wrong prediction, the source of responsibility can be identified so that we can plan and execute a solution to prevent the same problem from occurring in the future.

Artificial intelligence models learn our own biases and prejudices embedded in the dataset. Plenty of examples are out there: voice recognition systems such as Siri or Alexa misunderstanding African-American voices [Re: Knight 2017], or an image recognition system making a strong correlation between women and kitchen spaces [Re: Simonite 2017].

We should not blindly trust an artificial intelligence system that can learn bias. If not properly controlled, these AI systems can magnify and reproduce the biases we already have, and consequently contribute to further exacerbating socio-economic problems. Similar to the problem of identifying where responsibility lies, explainability can play a key role in identifying biases that may exist within an AI model.

3. Debugging & Enhancing AI Models

While artificial intelligence models can be very powerful tools, not only is the training process tough, but it is sometimes not easy to get them to generalize to the extent we want. If the model remains a black box, finding the cause of an error and improving the model is difficult and time-consuming. To decide whether to increase the number of neural network layers in a model architecture or to obtain more training data for a particular category, the model designer has to examine failure cases, develop a hypothesis, and validate it through experiments including modifications and retraining.

Considering that some of the latest AI models require days, weeks, or more to train, and that the development of AI models is itself an “iterative” process, it is easy to see that this debugging process is a very expensive task. Explainability techniques, when applied properly, can provide clues as to why models make bad predictions, and enable model designers to identify the underlying problems that really need to be addressed faster, or with the right priority.

Several explainability techniques are commonly used in the debugging process of artificial intelligence models, including:

  • ‘A method to extract the input elements that contribute to the model’s output [Ribeiro 2016, Shrikumar 2017, Hara 2018]’ can help us remove features, or adjust their weights, when they get too much attention (a minimal sketch follows below this list). It also helps you determine what ‘bias’ is embedded in your data.
  • With ‘a method to find the training samples that have the greatest impact on a specific prediction of an artificial intelligence model [Koh 2017]’, we can infer issues in the dataset from those samples, or get hints about what complementary data needs to be collected in the future.
  • ‘A method to discover and visualize the characteristics of each element that constitutes an artificial intelligence model (e.g., each layer and neuron in a deep neural network) during its operation [Olah 2018]’ helps us determine whether that element picks up significant concepts or features from its input values. For example, the DeepEyes visual analytics system allows a detailed analysis of individual layers of deep convolutional neural networks to find and remove ‘dead filters’ that are always active or rarely active.
Explainability: DeepEyes visual analytics system
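As a concrete illustration of the first technique in the list above (input attribution), here is a minimal vanilla-gradient saliency sketch in PyTorch. It is one of the simplest attribution methods, not the specific method of any of the cited papers, and `model` is assumed to be any trained, differentiable classifier.

```python
import torch

def gradient_saliency(model: torch.nn.Module, x: torch.Tensor, target_class: int) -> torch.Tensor:
    """Return |d(score of target_class) / d(input)| as a crude attribution map.

    Large values mark the input elements (pixels, features) that most strongly
    influence the model's score for the chosen class.
    """
    model.eval()
    x = x.clone().detach().requires_grad_(True)
    score = model(x.unsqueeze(0))[0, target_class]
    score.backward()
    return x.grad.detach().abs()
```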

4. Human-in-the-loop Design

The concept of ‘human-in-the-loop’ in artificial intelligence generally refers to a structure in which algorithms provide information such as implications and recommendations to ‘human workers’, who then make judgments and take action based on that information. Effective deployment of these ‘human-in-the-loop’ systems naturally requires explainability techniques that allow human workers to understand the algorithm and anticipate how it will react while interacting with it.

While it is important to introduce the concept of “human-in-the-loop” into all AI systems, its importance seems especially well accepted in AI-assisted design tools. These tools are, in many cases, based on generative models used to design specific objects (e.g., engines, ships, vehicles), which you can create by adjusting various variables in a virtual world, without the physical work required to build real objects.

Simulation study results for the Elbo Chair, an earlier project using Autodesk generative design technology to identify the best chair shape

The image above shows one of the screens from Autodesk’s research concepts for AI-based design tools, which lets the system design various chairs and identify the best chair shape while the user adjusts variables in the latent space provided by the generative model. You can move the sliders shown on the right side of the illustration to change the different elements required for the chair design.
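To make the slider idea concrete, here is a small, purely hypothetical sketch (not Autodesk’s actual API): each slider sets one dimension of a latent vector, and a generative decoder, stubbed out here with a dummy linear map, turns the adjusted vector into a candidate design.

```python
import numpy as np

LATENT_DIM = 8  # assumed number of design sliders exposed to the user

rng = np.random.default_rng(0)
W = rng.normal(size=(LATENT_DIM, 64 * 3))  # dummy decoder weights

def decode(z: np.ndarray) -> np.ndarray:
    """Stand-in for the generative model's decoder (e.g. a VAE or GAN generator):
    maps a latent vector to a chair geometry, here 64 control points in 3D."""
    return (z @ W).reshape(64, 3)

def apply_sliders(z_base: np.ndarray, slider_values: dict) -> np.ndarray:
    """Return a copy of the latent vector with the designer's slider settings applied."""
    z = z_base.copy()
    for dim, value in slider_values.items():
        z[dim] = value
    return z

z0 = np.zeros(LATENT_DIM)                        # the starting design
candidate = decode(apply_sliders(z0, {2: 0.7}))  # nudge one design dimension
```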

This approach is not limited to designing ‘physical’ objects; it can also be applied to problems from traditional operations research (OR), for example scheduling the hourly assignments of hospital staff by skill set. In this case, the scheduler enters various constraints into the system to optimize the nurses’ schedules and routes, adjusts the resulting work paths accordingly, or selects the best option from the various candidates recommended by the system.

Some might think that, for systems looking for such an optimal solution, you can define a quantitative, mathematical desirability function and let a system that operates on a generative model find the ‘optimal’ design values. This would mean there is no need for ‘human intervention’ in the design process, or that such intervention is only necessary after the design of the objective function has been completed.

However, this approach is likely to be unrealistic (and in many cases impossible or undesirable) for the majority of interesting optimization problems.

Let’s look at the example of vehicle design.

First of all, the vehicle design process involves a variety of aesthetic and marketing considerations and decision points that are hard to simply leave to the system. For example, there is a vehicle design term known as the California Rake (also called the California Tilt or Cowboy Rake): a design that helps sell the vehicle by giving it an image in which the front sits very slightly lower than the rear. While attempts can of course be made to quantify this element and incorporate it into the objective function under a name like ‘degree of tilt in the vehicle line’, trends in these aesthetic elements vary with age, generation, and so on, and need to be balanced against other design objectives. In this case, a ‘human-in-the-loop’ design is recommended, in which a weight is given to the feature and a human monitors the correlation between this element and the others and makes the final design decision.

In addition, we need to consider external constraints beyond the product itself; these constraints often come from the complexity of the manufacturing process. Imagine a situation where assembly workers in the factory have to weld certain parts of a vehicle, designed as ‘optimal’ by an AI system, at a very difficult and unnatural angle. There must be a feedback process to change the design of the vehicle to reflect the assembly workers’ opinions.

Ultimately, the ‘optimal’ design calculated solely by AI-assisted design tools might not actually be optimal, and thus should not be blindly accepted without any intervention or control. The process and its interim results should be readily available to the people involved, along with a toolset that helps designers make progress throughout the design process. So how does the designer interact with the algorithm and adjust its output? In the vehicle design example we’ve looked at, it is asking too much to deploy sliders that don’t even show what they control and then hope that the designer can produce good design results by moving them around. We need to provide ‘intuitive’ tools that let designers easily understand what each slider does and, more broadly, develop a mental model of the logic the design tool uses to change the design output, so that they can better anticipate the sequence of actions needed to achieve the best design results.

Fortunately, interaction design and human-computer interaction (HCI) are fairly mature fields with a lot of research behind them, so you can find or create appropriate guidelines within the context of artificial intelligence systems. Don Norman, director of The Design Lab at the University of California, San Diego, summarizes seven key principles of design in his masterpiece, “The Design of Everyday Things”, as below:

  • Discoverability: It should be easy to understand the current state of the system and what actions are possible at a given point, location, and time.
  • Feedback: Information about the new state that results from a particular action should be provided sufficiently and in context.
  • Conceptual Model: The structure and functions of the system as a whole should be easy to understand intuitively.
  • Affordances: The desired user behavior should arise naturally from the service/product usage scenario.
  • Signifiers: Design elements should be deployed in ways that strengthen affordances and help users recognize the required actions naturally.
  • Mapping: Users should be able to understand the relationship between two different elements without special explanations or help.
  • Constraints: Appropriate guides for action should be presented to minimize unnecessary behavior.

All seven principles are closely linked to explainability. An important element of a ‘good explanation’, perhaps the most important, is to help users understand the ‘conceptual model’ of the system. ‘Mapping’ is directly related to ‘interpretable representation’ (refer to https://www.youtube.com/watch?v=N8ClViZqJTQ) or ‘disentangled representation’ (refer to https://towardsdatascience.com/this-google-experiment-destroyed-some-of-the-assumptions-of-representation-learning-f430334602a9), and the ‘explanation’ itself is a process of exchanging ‘feedback’ using multiple ‘signifiers’.

5. Protection against Adversarial Attacks

In the context of an artificial intelligence system, an ‘adversarial attack’ is the manipulation of a model’s inputs in a way that is imperceptible to humans, thereby invalidating or misleading the model’s predictions.

Examples of wrong predictions caused by the introduction of noise that is imperceptible to humans

Above is an example of how an adversarial attack works. A CNN model that performs well in identifying images of several dog breeds as ‘dog’ suddenly misclassifies the dogs as fish when a little noise is added to the input images. As you can see, the images before and after the noise is added are not confusing to human eyes.
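The article does not specify which attack produced the figure above, but the Fast Gradient Sign Method (FGSM) is one of the simplest ways to craft such imperceptible perturbations. Here is a minimal PyTorch sketch; `model` is assumed to be any trained image classifier with inputs scaled to [0, 1].

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model: torch.nn.Module, image: torch.Tensor, label: torch.Tensor,
                epsilon: float = 0.01) -> torch.Tensor:
    """Craft an adversarial image by nudging every pixel a tiny step in the
    direction that increases the classification loss (Fast Gradient Sign Method)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image.unsqueeze(0)), label.unsqueeze(0))
    loss.backward()
    perturbed = image + epsilon * image.grad.sign()  # small, sign-based noise
    return perturbed.clamp(0.0, 1.0).detach()        # keep a valid pixel range
```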

‘Adversarial defense’, on the contrary, aims to protect and strengthen the model against such attacks or manipulations. Adversarial machine learning has been one of the most active areas of research over the past several years, because such robust models are needed to deploy artificial intelligence in real-world environments.

XAI and adversarial machine learning are deeply related. Explainability techniques can be used to craft training samples for attacks (Understanding Black-box Predictions via Influence Functions) or to detect such data for defense (Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples). It is also well known that several methods used to make a model more robust, in turn, result in a ‘more interpretable’ model (Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing Their Input Gradients).

Some even argue that the fundamental reason artificial intelligence models can so easily make false predictions from just a little noise is ‘non-robust features’: features derived from the distribution of the input data that are highly predictive but unstable and incomprehensible to humans (Adversarial Examples Are Not Bugs, They Are Features). In other words, to make a robust model, you need to build robust feature representations that are more aligned with human understanding and perception than a merely mechanical representation.

6. New Knowledge Discovery based on Hypothesis

The task of explaining the prediction results of artificial intelligence models is often understood as interpreting the ‘thought process’ underlying the model by comparing it with prior tendencies or patterns based on domain knowledge and common sense.

However, in some areas of science, such as biology and physics, where empirical approaches are valued because prior knowledge is limited, the attempt at “explanation” itself plays an important role in creating new hypotheses. In the pharmaceutical industry, we evaluate new hypotheses through basic experiments in laboratories and subsequent clinical trials, and if a hypothesis is proven, it becomes a new scientific discovery [Strum 2016, Schütt 2017]. So in these areas of science, explainability becomes an even more important requirement for applying artificial intelligence.

In the area of reinforcement learning, on the other hand, a description of the policy a model has learned is considered something more than just a potential hypothesis and closer to a real ‘theory’. This is because we can assume the agent has already made an implicit experimental assessment of the hypothesis while interacting with the simulated environment.

Limitation of Mainstream XAI Discussion: Viewpoint on XAI as a Tool towards ‘Universal AI Adoption’

While most of the practical reasons behind the discussions on XAI fall within the six categories I mentioned earlier, all of them are typically used with the same goal in mind: to use XAI as a tool to support the safe and widespread adoption of artificial intelligence systems.

But then, is that the final goal of XAI?

‘Universality’ in the context of AI means the minimum conditions and infrastructure for AI to be accepted into businesses, societies, and our daily lives as an element to cooperate with based on trust. Using XAI as a tool to gain ‘universality’ is certainly significant, because AI is still in the early stage of adoption, especially in the context of the enterprise technology stack.

But XAI has a huge potential as something else beyond a mere tool to secure safe and universal adoption of AI. Let’s take a look at the early stage development of the automotive industry to gain some insight.

1924 Ford Model T brochure (National Automotive History Collection)

No one can deny that Ford’s integrated production process, represented by the Model T, played a very big role in popularizing automobiles and getting everyone on the road. When Model T production began, Ford’s production method was not much different from before, still relying on the manual labor of the operator. However, the process was restructured starting with the operation of the Highland Park plant in 1910, and in 1913 an integrated production process using conveyor belts emerged. The efficiency and speed of production increased quickly, and prices got lower and lower along with the increasing volume of parts purchases, with the price per unit falling from $900 in 1910 to $260 in 1925. With this ‘affordable’ price as a weapon, the Model T opened the era of automobile popularization, becoming the first car in history to reach 15 million units produced.

But then, after ‘access’ to cars became ‘universal’ to the general public through standardized automobiles and affordable prices, consumers quickly lost interest in car purchases and sales growth plummeted. To make matters worse, competitors’ moves to adopt integrated production processes in their own factories led to fierce competition, and Ford soon handed over the top market share to GM’s Chevrolet.

Here, Alfred Sloan, who led GM from 1923, drove the next generation of growth in the automotive industry with a new management philosophy, known today as ‘Sloanism’. Alfred Sloan is one of the legendary leaders in business history, who not only designed and introduced a modern decentralized corporate structure at GM but also laid the cornerstone of today’s marketing strategy of segmentation and differentiation with the slogan “A car for every purse and purpose”.

Harley Earl and the concept car ‘Firebird II’

After taking office as president, Sloan grew GM by pursuing differentiation in price, color, and design. He was enthusiastic about ‘design-oriented vehicle development’ and worked very closely with Harley Earl, who was in charge of GM’s design division. Sloan’s partnership with Earl is often compared to the relationship between Apple’s Steve Jobs and Jonathan Ive. Together, Sloan and Earl undeniably played a key role in opening ‘the era of car styling’, elevating the status of the ‘art and color’ department, previously dismissed by others as the ‘hairshop’, so that it could lead the engineering and sales departments.

In fact, after the advent of the Model T, the basic structure and technology of mass-production vehicles did not change much until World War II, except for some technologies that enhanced driving convenience. Alfred Sloan strongly believed that the fundamental business of an automobile company lies not in ‘making cars’ but in ‘making profits’, and he put a lot of thought and effort into using design as the centerpiece of differentiation while leveraging GM’s technologies, which were on par with the best of its competitors.

What we know as ‘planned obsolescence’, a policy of planning or designing a product with an artificially limited useful life or a purposely frail design, so that it becomes obsolete after a certain predetermined period of time, at which point it gradually degrades, suddenly ceases to function, or is perceived as unfashionable, came from Sloan’s philosophy, and it is the origin of the ‘facelift’ we take for granted today.

The reason I’m recounting the growth history of the automotive industry, which seems at first glance to have nothing to do with ‘the role of XAI’, is that after new technologies are introduced to the market and become ‘accessible’, they must deal with the problem of ‘differentiation’: how to use the technology as a competitive weapon in the context of a specific industry and company. From that point of view, beyond the general reasons why XAI is needed, we need a strategic approach to ‘how to leverage XAI to strengthen the capabilities of the enterprise and differentiate itself from its competitors’.

Beyond ‘Universality’: Explainable AI as a Tool for Strategic Differentiation

Approaching XAI as a ‘strategic tool to gain competitive differentiation’ is meaningful from the two perspectives below:

Justification of Investment required for XAI Research and Adoption

In principle, putting an additional layer of ‘explainability’ onto a ‘seemingly working’ artificial intelligence model, or changing the model itself to be ‘explainable’, requires a considerable amount of investment from a corporate decision-making perspective.

3 major reasons why explainability is hard

Typically, you might have to deal with problems such as ‘performance degradation’ as a result of updating an existing AI model to be explainable. This ‘trade-off’ consideration is, in many cases, inevitable, because for a machine learning model to be used to its full potential, the end-users or key stakeholders need to be able to understand how the model works and where its operational boundaries lie.

(Note: While the majority of machine learning models we use today are believed to exhibit a trade-off between model performance and explainability, it is unclear whether this trade-off is inherent to ‘machine learning’ technology itself. While an MIT paper released in September 2019 suggests that there is an inherent trade-off between robustness and accuracy in machine learning models, some of the latest papers argue otherwise, namely that the trade-off is not inherent to machine learning technology as a whole.)

Therefore, positioning XAI as a ‘strategic investment’ and ‘a task that can contribute to the competitive advantage and performance of the company’ is very significant for justifying and actively introducing XAI, rather than having it recognized only as something the company must adopt due to external pressure (e.g., in response to regulatory policies or customer complaints).

Implementation of Machine Learning Systems Optimized for the Differentiated Goals of a Specific Industry and Company

Just ‘checking’ and ‘understanding’ why the current artificial intelligence model outputs certain values is not enough to maximize the business impact of an artificial intelligence system. Typically, you end up using XAI techniques in a siloed, cookie-cutter fashion, when you actually need to view the use case at a broader level, incorporating internal and external stakeholders, and apply XAI techniques that fulfill the specific requirements for improving the business value of the artificial intelligence system.

If you apply XAI techniques with the business objectives in mind, XAI can be a great tool to leverage (not a burden), helping you achieve not only ‘trust’ but also ‘business performance’ at the same time.

Let’s have a look at a case study from Element AI with its client:

Aisin Seiki case study — Explainability

Toyota, the Japanese automotive giant, has a number of primary suppliers. One of them is Aisin Seiki, a very large supplier of vehicle parts such as powertrains, chassis, and doors.

In 2017, Aisin Seiki formed a team of data scientists, machine learning engineers, and researchers to establish a vision-based deep learning system (using U-Net) to determine whether the welds on its products are ‘normal’ or ‘faulty’. Although the model performed well in the lab, with an accuracy of over 90%, Aisin Seiki’s client Toyota considered false negatives (FN: parts recognized as ‘normal’ by Aisin Seiki and delivered to Toyota, but later found to be ‘faulty’ at the Toyota site) an important issue, making it difficult for Toyota to trust Aisin Seiki’s artificial intelligence system and to optimize the working capital tied up in parts from Aisin Seiki. Thus, Aisin Seiki was tasked with dramatically reducing the number of false negatives and establishing a system for continuous quality improvement.

Through a joint research project between Aisin Seiki and Element AI, we reviewed a number of XAI techniques and the processes for incorporating them into the organization’s quality inspection system.

What is noteworthy here, again, is that the ‘review’ and ‘selection’ of XAI techniques was based specifically on the strategic requirements to build ‘trust’ between Toyota and Aisin Seiki and to realize a ‘working capital reduction’ impact on both companies’ books.

The joint team concluded that the combination of ‘influence function’ and ‘decision uncertainty’ techniques was the appropriate toolset to meet the requirements given to the team, and built an additional layer of explainability onto the existing artificial intelligence model. As a result, the joint team was able to improve Aisin Seiki’s welded-parts quality management system in a way that contributed to the ‘strategic mandates’ of both Aisin Seiki and Toyota.
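The article does not describe how the ‘decision uncertainty’ technique was implemented in this project, so the following is only a hedged sketch of one common way to estimate it: Monte Carlo dropout, where the predictive entropy over repeated stochastic forward passes flags the welds the model is unsure about so they can be routed to manual inspection instead of being auto-approved.

```python
import torch

@torch.no_grad()
def mc_dropout_uncertainty(model: torch.nn.Module, image: torch.Tensor,
                           n_samples: int = 30):
    """Estimate the predictive mean and entropy of a weld classifier via MC dropout.

    Assumes `model` is a trained 'normal vs faulty' classifier that contains
    dropout layers; keeping dropout active at inference time yields a spread of
    predictions whose entropy reflects the model's decision uncertainty.
    """
    model.train()  # keeps dropout active (note: this also affects BatchNorm layers)
    probs = torch.stack([
        torch.softmax(model(image.unsqueeze(0)), dim=1).squeeze(0)
        for _ in range(n_samples)
    ])
    mean = probs.mean(dim=0)
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum()
    return mean, entropy

# Welds whose entropy exceeds a chosen threshold can be routed to a human
# inspector instead of being auto-approved, directly targeting false negatives.
```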

XAI: a system for continuous upgrade of deep learning products in the context of business performance

Closing Remarks

As you all know, lots of companies have run, and are still running, many experiments with artificial intelligence technologies to improve business processes and to develop new products and services. I think it’s fair to say that the focus of these experiments, in many cases, was to make sure the ‘performance’ of the artificial intelligence model could reach an ‘acceptable’ level according to some ‘simple measure’.

In the future, the role of XAI as a tool for the ‘universal adoption of AI’ will only get bigger, helping us understand the ‘why’ and ‘how’ of artificial intelligence models’ inner workings.

Furthermore, the companies that recognize the potential of XAI as a tool for strategic differentiation, and combine their domain knowledge and capabilities with XAI techniques to drive business performance improvements, will be able to transform themselves into ‘true AI-first’ companies that stay ahead of their competitors in the artificial intelligence era.


Byoungchan (Ben) Eum
CodeX

Dedicated professional with expertise in new business and GTM, focusing on digital & artificial intelligence. VP of APAC Business at TWO Platforms, Inc.