Unboxing Complexity: Black Box Networks Demystified

Aastha Gupta
Published in GDSC VIT Vellore
11 min read · Dec 15, 2023

You may often hear people refer to neural networks as a ‘black box’, but what does that actually mean? Is it that you cannot fully understand the operations and functionality of a neural network? Or does it signify a lack of knowledge about how the weights are calculated? Or is it perhaps something entirely different (maybe something related to solar cookers or the black hole mystery)?

NEURAL NETWORKS

Deep learning is built on neural networks composed of multiple layers of neurons. It has become hugely popular in fields like image and speech recognition, natural language processing, and reinforcement learning. Large datasets, advanced hardware, and parallel computation have made deep learning more accessible, and open-source frameworks like TensorFlow and PyTorch have lowered the entry barrier for researchers. This has increased its value across industries such as healthcare, finance, and autonomous vehicles, further contributing to its popularity.

The architecture of neural networks draws inspiration from the human brain; at its core, it mimics the way the brain perceives information. You can compare it with a newborn baby learning to walk: neural networks stumble and refine their predictions, much like toddlers testing out those wobbly first steps. They absorb information from their surroundings, creating a mental roadmap.

Image taken from DevSkrol

The network architecture consists of nodes connected by weighted links, and these weights are fitted during the training process to produce a trained neural network. Data enters at the input layer, moves through hidden layers with activation functions, and predictions are made at the output layer. During training, backpropagation and optimization algorithms are used to minimize the difference between the predicted and actual values.

Once the neural network is trained, it can make predictions on new data. Input processing, weight updates, activation functions, loss computation, and iterative training are the crucial phases of the training process.
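To make these phases concrete, here is a minimal NumPy sketch of a single sigmoid neuron going through a forward pass, a loss computation, and iterative weight updates. The toy data and learning rate are made up; a real deep network repeats the same loop across many layers and far more parameters.

```python
import numpy as np

# Made-up toy data: 100 samples with 3 input features and a simple target.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))
y = (x[:, 0] + x[:, 1] > 0).astype(float)

w = np.zeros(3)   # weights of the connecting links
b = 0.0           # bias term
lr = 0.1          # learning rate (an assumption)

for step in range(500):
    z = x @ w + b                                # input processing: weighted sum
    p = 1.0 / (1.0 + np.exp(-z))                 # activation function (sigmoid)
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))  # loss computation
    grad_z = (p - y) / len(y)                    # backpropagated error signal
    w -= lr * (x.T @ grad_z)                     # weight update
    b -= lr * grad_z.sum()

print(f"final training loss: {loss:.3f}")
```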

A variety of neural networks exist, leading to many real-world applications for them.

But what causes a neural network to be considered a black box?

A black box can be thought of as a system or an object whose internal operation lacks transparency. Imagine the neural network as a magician with a hat; inputs go in and outputs come out, but the actual nitty-gritty of how it works is hidden inside the hat.

In general, straightforward models with simple architecture and a smaller number of parameters, like linear regression, random forest, or decision trees, are easily interpretable without additional explanation mechanisms. On the contrary, deep neural networks, featuring thousands or even millions of parameters, are often regarded as black boxes.

Image taken from ResearchGate

Each neuron performs a distinct task: neurons in the early layers extract features and preprocess the data, while the deeper ones transform the training samples from their original representation into a more relevant representation in the latent space. During learning, a neuron often has to compensate and adapt for the poor performance of other neurons. As a result, the neurons become interdependent and form a tangled web of connections and relationships.

Key factors shaping the mystery of why neural networks are black boxes

1. Complexity

A deep learning model often requires numerous layers in its neural network architecture. The interactions between these layers can result in a highly complex mechanism, which makes it hard for an external observer to interpret and comprehend the operations of a billion parameters, especially when dealing with extensive datasets.

2. Multivariate nature

Neural networks often operate in high dimensions and can handle complex relationships involving many variables or features simultaneously. As the dimensionality of the data increases, it becomes harder to understand how changes in the input features relate to changes in the output.

3. Non-linearity

Using activation functions like sigmoid and softmax introduces non-linearity into the model. This allows neural networks to learn complex relationships in the data, but it also increases complexity and makes the mapping from inputs to outputs harder to interpret (a short sketch after this list shows why the non-linearity is essential in the first place).

4. Adjustments in the neural network

Adjustments in a neural network, representing changes in parameters during training, contribute to its black-box nature due to the complex and non-linear interactions among neurons. The opacity of these adjustments diminishes interpretability by obscuring the specific roles of individual parameters and reducing human understanding of the model’s decision-making process.
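To see why the non-linearity in point 3 matters so much, here is a small sketch with made-up weight matrices: stacking two purely linear layers collapses into a single linear map, while inserting a sigmoid between them does not.

```python
import numpy as np

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))  # made-up layer weights
x = rng.normal(size=3)                                     # a made-up input

linear_stack = W2 @ (W1 @ x)                    # two linear layers...
collapsed = (W2 @ W1) @ x                       # ...are just one linear layer
print(np.allclose(linear_stack, collapsed))     # True

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
nonlinear_stack = W2 @ sigmoid(W1 @ x)          # the activation breaks the collapse
print(np.allclose(nonlinear_stack, collapsed))  # False
```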

Some real-world examples of the neural network black box in action

1. Medical Diagnosis

Neural networks and machine learning algorithms are often used in the healthcare industry for medical image analysis and diagnostic tasks. However, the neural networks do not explain or justify the results, as the complexities of the connections are too convoluted to be understood or explicitly confirmed within a practical timeframe. This creates a state of ambiguity and questions the reliability of the results.

One such instance occurred at Stanford Hospital in 2019, where a neural-network-based mortality prediction model, built on deep learning and electronic health record data, exhibited inaccuracies in its predictions. This raised concerns about its reliability and the potential implications for patient care.

2. Finance

Deep learning and other black-box models are used in finance to analyze investments over the years and relate them to trends in financial markets. However, there have been several instances that led to a black-box blowup. In 2019, Apple Card faced accusations of gender discrimination in its credit approval mechanism, which used a neural-network-based model built on financial and credit history data. There were reported cases of bias against female applicants, as the algorithm assigned lower credit limits to women than to men with comparable financial profiles. The lack of transparency in the model’s decision-making raised ethical concerns.

3. Recruitment and Hiring

Companies are now shifting to automated hiring systems for screening resumes and selecting candidates. These systems ease the recruitment process for the employer, but their lack of transparency can result in biased decision-making that favors certain demographic groups. In 2018, Amazon faced issues with its automated hiring system, which had been trained on historical data predominantly representing male candidates and exhibited bias against female applicants.

4. Automobile industry

In 2015, the Volkswagen emissions scandal became one of the biggest black box blowups. Engine control software in diesel vehicles detected when an emissions test was being run and altered the engine’s performance so the car would meet the standards, while behaving very differently on the road. Once the deception was uncovered, it resulted in over $30 billion in fines and settlements.

With growing concerns about the opaqueness of black box models, mechanisms and algorithms were developed to enhance the interpretability and explainability of the model.

Explainable AI

Decoding the complexities of a black box model hinges on two pivotal aspects: interpretability and explainability. Building straightforward, interpretable models helps us examine the internal workings of models and offers insights into their functionality. At the same time, integrating advanced technologies to enhance explainability becomes essential when dealing with more intricate scenarios.

Explainable AI is one of the most commonly used tools and plays a crucial role in decrypting models and enhancing transparency. LIME (Local Interpretable Model-Agnostic Explanations) and SHAP (SHapley Additive exPlanations) are two popular techniques in the field of Explainable AI.

Image taken from Medium

LIME (https://lime-ml.readthedocs.io/en/latest/) selects a specific instance and produces altered versions of it. These modifications are meant to simulate different variations of the input data. It then assigns weights to the modified instances based on their proximity or similarity to the original instance. More similar instances receive higher weights, and less similar ones receive lower weights.

A locally interpretable model, often a simple one like linear regression, is trained on the modified instances with their corresponding model predictions and weights. This model is trained to approximate the behavior of the complex black-box model in the vicinity of the selected instance, providing explanations for the prediction made by the black-box model for the selected instance. The coefficients of the interpretable model represent the importance of each feature in making the prediction.

The explanation generated by the locally interpretable model can be visualized in various ways, such as feature importance plots or textual explanations, to help users understand why the model made a particular prediction for the chosen instance.
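The procedure can be sketched in a few lines of Python. This is only the underlying idea, not the lime library itself; the black-box predict_fn, the Gaussian perturbation scale, and the kernel width are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_style_explanation(predict_fn, instance, num_samples=5000, kernel_width=0.75):
    """Sketch of a LIME-style local explanation for one instance.

    predict_fn should return a 1-D array of scores for a batch of inputs,
    e.g. the probability of the positive class.
    """
    rng = np.random.default_rng(0)
    # 1. Perturb the instance with Gaussian noise around its feature values.
    perturbed = instance + rng.normal(scale=1.0, size=(num_samples, instance.shape[0]))
    # 2. Weight each perturbation by its proximity to the original instance.
    distances = np.linalg.norm(perturbed - instance, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    # 3. Query the black-box model on the perturbed points.
    preds = predict_fn(perturbed)
    # 4. Fit a weighted linear surrogate; its coefficients are the local explanation.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbed, preds, sample_weight=weights)
    return surrogate.coef_
```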

Image taken from Towards Data Science

SHAP (https://shap.readthedocs.io/en/latest/) is another approach, based on cooperative game theory and the concept of Shapley values, providing a consistent and theoretically grounded method for attributing predictions to input features. It starts by defining a baseline prediction, often the average prediction of the model on the entire dataset. The goal is to understand how each feature contributes to moving from the baseline prediction to the model’s output for a specific instance.

For each feature and instance, SHAP performs permutation shuffling: it calculates the difference between the model’s prediction with the original feature values and the prediction with the feature values randomly permuted. Each feature’s contribution to this difference in predictions is its Shapley value. These values represent the fair distribution of the model’s prediction change among the individual features. SHAP then decomposes the model’s output for a specific prediction into contributions from each feature, giving a clear understanding of which features push the prediction higher and which ones pull it lower.

SHAP values have a useful property called the summation property: the sum of the SHAP values for all features plus the baseline prediction equals the model’s prediction for a specific instance. SHAP can be used to provide both global and local explanations. Global explanations highlight feature importance across the entire dataset, while local explanations focus on a specific instance. The SHAP values can be visualized in various ways, such as summary plots, waterfall plots, or force plots, to make them easily interpretable for users.
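The summation property is easy to verify on a toy example. The sketch below brute-forces Shapley values for a transparent two-feature model, treating a "removed" feature as one held at its baseline value; this is a simplification of what the SHAP library actually does, and the model, baseline, and instance are made up.

```python
import itertools
import math

def model(x1, x2):
    return 3 * x1 + 2 * x2          # a transparent stand-in for the black box

baseline = {"x1": 0.0, "x2": 0.0}   # average feature values on the dataset
instance = {"x1": 1.0, "x2": 4.0}   # the instance we want to explain

def value(coalition):
    # Features in the coalition take their instance values; the rest stay at baseline.
    x = {f: (instance[f] if f in coalition else baseline[f]) for f in baseline}
    return model(x["x1"], x["x2"])

features = list(instance)
shap_values = {}
for f in features:
    others = [g for g in features if g != f]
    contrib = 0.0
    for r in range(len(others) + 1):
        for subset in itertools.combinations(others, r):
            # Shapley weighting over all orderings in which f can join the coalition.
            weight = (math.factorial(len(subset))
                      * math.factorial(len(features) - len(subset) - 1)
                      / math.factorial(len(features)))
            contrib += weight * (value(set(subset) | {f}) - value(set(subset)))
    shap_values[f] = contrib

print(shap_values)                               # {'x1': 3.0, 'x2': 8.0}
print(value(set()) + sum(shap_values.values()))  # 11.0 == model(1, 4): the summation property
```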

IMPLEMENTATION

Notebook

We’ll be using a simple ANN model to predict the churn of bank customers. We are using a dataset that contains customer information, including credit score, geography, gender, age, tenure, balance, number of products, credit card status, activity status, estimated salary, and an indication of whether the customer exited the service.
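A minimal sketch of how such a dataset might be loaded and prepared is shown below. The file name Churn_Modelling.csv, the column names, the encoding choices, and the train/test split are assumptions based on the common Kaggle bank-churn dataset, not the exact notebook code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("Churn_Modelling.csv")
df = df.drop(columns=["RowNumber", "CustomerId", "Surname"])   # identifiers, not features
df["Gender"] = df["Gender"].map({"Male": 0, "Female": 1})      # simple label encoding
df["Geography"] = df["Geography"].astype("category").cat.codes # keeps the 10-feature layout

x = df.drop(columns=["Exited"])   # 10 input features
y = df["Exited"]                  # churn indicator
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
x_train_np = scaler.fit_transform(x_train)
x_test_np = scaler.transform(x_test)
```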

Model

We’ve used a Keras sequential model for binary classification. It is a simple neural network with one hidden layer of 20 units and ReLU activation, followed by a single-unit output layer with a sigmoid activation. Trained using the Adam optimizer and binary crossentropy loss, it predicts churn from a 10-feature input.
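A sketch of that model in Keras might look like the following; the number of epochs, batch size, and validation split are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(20, activation="relu", input_shape=(10,)),  # one hidden layer, 10 input features
    layers.Dense(1, activation="sigmoid"),                   # churn probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train_np, y_train, epochs=20, batch_size=32, validation_split=0.1)
```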

Prediction Function

The prediction function is used by LIME to generate local explanations.
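Since the model outputs only the probability of churn, a small wrapper (a sketch, assuming the model defined above) can return probabilities for both classes in the shape LIME’s classification mode expects.

```python
import numpy as np

def predict_fn(data):
    p = model.predict(data)        # probability of the positive class (churn)
    return np.hstack([1 - p, p])   # columns: [P(stayed), P(exited)]
```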

LIME installation

We install the lime library using !pip install lime, then import lime and its tabular module, lime_tabular.

We use the LIME library to create a LimeTabularExplainer instance, providing the training data (x_train), the mode of the model (classification or regression), the training labels (y_train), and the feature names. The prediction function (predict_fn) is supplied later, when we request an explanation.

We select a specific instance from the dataset that we want to explain, typically one for which we want to understand the model’s decision-making process. The explain_instance method of the explainer then generates a local explanation for the chosen instance, given the prediction function and the number of features to include in the explanation (num_features).
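Putting those pieces together, a sketch of the explainer and a single explanation might look like this; the class names and the choice of test instance are assumptions.

```python
from lime import lime_tabular

explainer = lime_tabular.LimeTabularExplainer(
    x_train_np,
    mode="classification",
    training_labels=y_train.values,
    feature_names=list(x.columns),
    class_names=["Stayed", "Exited"],
)

instance = x_test_np[0]   # the customer we want to explain
explanation = explainer.explain_instance(instance, predict_fn, num_features=5)
explanation.show_in_notebook()   # or explanation.as_list() for the textual rules
```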

The results include the prediction probabilities for the two classes, together with the feature-based conditions that influenced the prediction, such as the number of products and the customer’s activity status. The feature importance analysis emphasizes the significance of NumOfProducts and Gender in the model’s decision-making process. This output enhances the interpretability and transparency of the model’s predictions.

SHAP implementation

We install the SHAP library using !pip install shap and import shap in our notebook.

We create a SHAP Explainer. We are using DeepExplainer for deep learning models, but different explainers may be used for other model types.

The parameter data=x_train_np[:100] specifies the data on which the explainer will be based. SHAP values are computed against a background dataset, which serves as a reference for understanding the impact of each feature on the model’s predictions; the [:100] indicates that the explainer is built on the first 100 samples of the training data (x_train_np). Additivity, in the context of SHAP values, refers to the property that the contributions of individual features to a model’s prediction should add up to the total prediction; disabling this check (check_additivity=False) can be useful in cases where additivity is not strictly required.

shap.maskers.Independent specifies the type of masker to use. A masker is a method to represent the background data, and Independent is a common choice. It assumes that each feature is independent of the others, simplifying the computation of SHAP values.
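A sketch of this setup, assuming the model and arrays defined earlier, is shown below; the background and test sample sizes are illustrative, and the masker-based alternative is shown in a comment.

```python
import shap

background = x_train_np[:100]                 # reference data for the explainer
explainer = shap.DeepExplainer(model, background)

# Model-agnostic alternative using a masker instead of raw background data:
# masker = shap.maskers.Independent(x_train_np, max_samples=100)
# explainer = shap.Explainer(model.predict, masker)

shap_values = explainer.shap_values(x_test_np[:50], check_additivity=False)
```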

The output from SHAP (SHapley Additive exPlanations) includes visualizations such as summary plots and dependency plots, along with numerical summaries of feature importance. These outputs offer insights into the contribution of each feature to model predictions, facilitating the interpretation of machine learning models. The summary plot highlights the relative importance of features, while individual instance summaries provide detailed information for specific data points. Dependency plots illustrate how predicted outcomes change with variations in specific feature values. The output aims to enhance model interpretability by explaining the rationale behind individual predictions.
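A sketch of generating those plots, assuming the SHAP values computed above; the feature chosen for the dependence plot is an assumption.

```python
import numpy as np

# DeepExplainer may return a list with one array per output; reduce to a 2-D array.
vals = shap_values[0] if isinstance(shap_values, list) else np.squeeze(shap_values)

shap.summary_plot(vals, x_test_np[:50], feature_names=list(x.columns))
shap.dependence_plot("NumOfProducts", vals, x_test_np[:50], feature_names=list(x.columns))
```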

The Enigma of Scale in AI and the challenges ahead

With the growing demand for neural networks and the billions of parameters associated with them, the decision-making process becomes increasingly opaque, and the interpretability is reduced. The sheer complexity of these billion-parameter networks can overwhelm traditional Explainable AI techniques. As the models become more complex, the interpretation generated by Explainable AI methods may lose fidelity and fail to capture the true details of the model’s decision-making process. The granularity required might be compromised, leading to overly simplified or incomplete explanations. Simultaneously, the computational burden might hinder real-time interpretability, with a rise in scalability issues.

This challenge requires ongoing research efforts to strike a balance between the advantages offered by large-scale models and the imperative for interpretability. Attention mechanisms, layer-wise relevance propagation, and model distillation can be incorporated to enhance interpretability without sacrificing the benefits of complex neural architectures.

Conclusion

The Enigma of Scale gives a reality check about increasingly sophisticated AI models and the trade-off between the extent of scale and the degree of interpretability. The call for interpretability becomes paramount in contexts where human understanding of AI processes is crucial for trust, accountability, and ethical considerations. In response to these imperatives, continuous research initiatives have become essential to unravel the complexities inherent in billion-parameter networks. The pursuit of a deeper understanding of these intricacies not only facilitates the development of more interpretable AI systems but also paves the way for innovations that harness the benefits of scale without sacrificing transparency.

In summary, The Enigma of Scale is a thought-provoking exploration of the multifaceted challenges surrounding advanced AI models. It emphasizes the need for a careful and ongoing examination of the intricate balance between scale and interpretability, acknowledging the real-world demands for transparent, understandable, and ethically sound AI systems in critical domains.
