Designing Fair Human-Centric AI Systems

Nuri · Published in Nerd For Tech · Jun 6, 2021 · 8 min read

With the increasing number of AI models deployed across multiple sectors, there is a move towards designing ethical AI systems whose decisions can be explained. The motivation for explaining models is to prevent discrimination or bias that results in some groups of people receiving an unfavourable outcome because of attributes like their gender, race, age or other demographics. Models can also learn junk features and break down when applied to new, unseen datasets, so it is important to verify that a model is learning an outcome for the ‘right reasons’.

For example, Amazon had to scrap a recruitment engine that was filtering out female candidates (1). Though technology roles in most industries are still largely male dominated, choosing a male candidate over an equally qualified female candidate solely on the basis of gender is a form of AI bias. Another example is Apple’s credit card, which was found to discriminate against women by offering them lower credit limits (2).

Should AI models be explained?

The answer depends on the use case, the desired outcome and who will be impacted by the AI decision. It is also not always straightforward to explain AI models. Here are a few challenges in AI explainability.

  1. Models can be quite complex: Deep learning techniques are increasingly being leveraged to handle unstructured datasets like text, images and audio. When your dataset is not in a tabular format, interpreting what is inside a model can be quite hard. Even tabular datasets have so many combinations and permutations of features that models become unintelligible to humans. Furthermore, if we train a convolutional neural network to recognise objects, we may be able to visualise some of its inner layers, but interpreting them is still hard.
  2. A human-centered approach towards AI design is needed, where the purpose and outcome of the model is key: What does it mean to train a model with a human or end goal in mind? Misclassifying objects in a fun application that tells you whether a photo is a cat, a dog or a car does not impact any person, so explaining the model could be fun but is not essential. However, if an AI model receiving data from a CCTV camera fails to detect a knife or any other weapon, the consequences can be fatal. Similarly, assigning a credit score or excluding someone from a hiring process does have an impact on that person’s life and should be explained. Note that explaining a model is not the same as simply exposing all of its underlying contents.
  3. Fairness goals must be well understood when designing an AI system: How ‘fair’ the model is will be determined by the human outcome we are trying to attain. To work through examples, the IBM AI Fairness 360 toolkit provides some useful definitions of fairness (4). Are we trying to maximise individual fairness or group fairness? If maximising individual fairness, the AI model should ensure that similar individuals receive similar treatments and outcomes. If maximising group fairness, the goal could instead be to enhance the chances of success of underrepresented groups. There is a subtle difference between the two outcomes (a minimal group-fairness metric is sketched in code after this list).
  4. Model robustness and safety are necessary: Any new product that goes to market needs stress tests under different conditions to ensure it is safe. For AI models, this means training and testing them in different contexts and environments to ensure they don’t break, and having independent validators review them before deployment. Datasets can also be historically biased (the next section explains bias), so such cases should be controlled for when the outcome of the model has a significant impact on someone’s life.
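As a concrete illustration of group fairness, the sketch below computes two common group-fairness metrics, statistical parity difference and the disparate impact ratio, on a tiny made-up table. The column names and toy values are purely illustrative and are not taken from any dataset in this article; a library such as AI Fairness 360 offers these metrics out of the box, but plain pandas is enough to show the idea.

```python
import pandas as pd

# Toy data with hypothetical column names -- purely illustrative
df = pd.DataFrame({
    "gender":   ["male", "male", "female", "female", "male", "female"],
    "approved": [1, 1, 0, 1, 0, 0],
})

# Approval rate within each group
rates = df.groupby("gender")["approved"].mean()
print(rates)

# Statistical parity difference: P(approved | unprivileged) - P(approved | privileged)
spd = rates["female"] - rates["male"]

# Disparate impact ratio: values well below 1 suggest the unprivileged group
# is approved far less often than the privileged group
di = rates["female"] / rates["male"]

print(f"Statistical parity difference: {spd:.2f}")
print(f"Disparate impact ratio: {di:.2f}")
```

Individual fairness, by contrast, would mean checking that similar applicants receive similar predictions, which calls for pairwise comparisons rather than group-level rates.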

What are the different types of bias in AI models?

Kaggle has recently published an Introduction to AI Ethics course, going deeper into many of these questions (5,6).

Explaining different types of bias in AI models. See (7) for original paper

Historical bias: Occurs when the input data reflects existing inequities in the world, resulting in models that select one group more favourably than another. E.g. historically lower percentages of females than males in STEM careers.

Representation bias: Can happen if a class does not have sufficient instances in the training data. As described in the Kaggle tutorial (5), for example, over-65s have lower smartphone usage rates than younger groups, so models trained on smartphone data may only represent the preferences of younger users and exclude older individuals. In medical diagnosis, we may be looking to detect patients with a rare genetic condition using AI, but such instances would be very rare in the data, so the model would detect the healthy patients accurately and perhaps classify the diseased ones as healthy due to their lack of representation in the data.

Measurement bias: Occurs when proxy data is used because a quantity cannot be measured directly. For example, if a hospital bill alone is used as a proxy for the severity of a patient’s health condition, we may be excluding minority groups who have underlying conditions but do not trust, or cannot afford to pay, health bills.

Aggregation bias: Suppose we measure the accuracy of a model predicting who is likely to default on a loan, and only 5% of individuals default. A model that always predicts the majority class (‘no default’) is 95% accurate. Aggregating performance over both classes therefore hides how the model does on each class; per-class recall and precision matter.
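To make this concrete, here is a minimal sketch with synthetic labels (assuming roughly 5% of individuals default) showing how a model that always predicts the majority class scores about 95% accuracy while its recall on the defaulting class is zero:

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

rng = np.random.default_rng(0)

# Synthetic labels: roughly 5% of individuals default (label 1), the rest repay (label 0)
y_true = (rng.random(1000) < 0.05).astype(int)

# A "majority class" model that always predicts no default
y_pred = np.zeros_like(y_true)

print("Overall accuracy:        ", accuracy_score(y_true, y_pred))             # ~0.95
print("Recall on defaulters:    ", recall_score(y_true, y_pred, pos_label=1))  # 0.0
print("Recall on non-defaulters:", recall_score(y_true, y_pred, pos_label=0))  # 1.0
```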

Evaluation bias: A model may perform very well on known benchmark datasets but break down when applied to new, unseen data. E.g. a face detection system that identifies faces of a single ethnicity well but fails when exposed to a new ethnicity.

Deployment bias: An algorithm may work very well for a given use case but break down if used for a different purpose after deployment. For example, a model trained to predict who will buy a particular product may not be suitable for predicting who will default on a loan; even though there may be some overlap or correlation between the two use cases, models can break when deployed for an unintended purpose.

How does fairness apply to loan applications? (Exploring Analytics Vidhya’s Loan Dataset)

Given the limited space in this article, the full exploration of the dataset can be found here. In this example, we leverage a house loan dataset from a tutorial published by Analytics Vidhya. Though it is unclear whether this data is real or simulated, it can help us understand some examples of fairness and AI explainability.

Preview of Analytics Vidhya’s House Loan Dataset, posted on Kaggle

As we can see from the preview, the dataset contains many demographic attributes, including marital status, gender, education level and number of dependents. The Loan Status flag indicates whether the loan was granted to that individual. A closer exploration of the data shows that approximately 65% of loans are approved. In the figure below, we show a general profile of loan applicants.

General data exploration for Analytics Vidhya’s house loan dataset’s categorical inputs
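For readers who want to follow along, a minimal sketch of the initial exploration might look like the snippet below. It assumes the training CSV from the Analytics Vidhya tutorial has been downloaded locally as train.csv and that the target column is Loan_Status coded as Y/N; treat the file path and column names as assumptions about that dataset.

```python
import pandas as pd

# Assumes a local copy of the Analytics Vidhya training data;
# the path and the Y/N coding of Loan_Status are assumptions about that dataset
df = pd.read_csv("train.csv")

print(df.shape)
print(df.columns.tolist())

# Overall approval rate (roughly 65% in this dataset)
approval_rate = (df["Loan_Status"] == "Y").mean()
print(f"Approval rate: {approval_rate:.1%}")
```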

The data likely contains some representation bias, as historically fewer females than males apply for loans. Also, the majority of loan applicants are graduates, and most are from urban or semi-urban rather than rural areas. It is useful to overlay the loan outcome on these features to see if there are any skews in the data, i.e. to look at the proportion of approvals within each group. As shown below, there is a slightly higher approval rate within the graduate group and also within the semi-urban and urban groups. The reason could be a combination of historical and representation bias, but it could also be driven by other factors such as the loan amount, overall risk or the absence of a credit history for some applicants.

A slightly higher proportion of loan approvals is observed in the graduate group
More loan approvals come from applicants in urban and semi-urban areas than from rural areas
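The per-group approval rates behind the two charts above can be reproduced with a simple group-by, again assuming the column names from the Analytics Vidhya dataset (Education, Property_Area, Gender, Loan_Status):

```python
import pandas as pd

df = pd.read_csv("train.csv")  # assumed local copy, as in the snippet above
df["approved"] = (df["Loan_Status"] == "Y").astype(int)

# Proportion of approvals within each group
for col in ["Education", "Property_Area", "Gender"]:
    print(df.groupby(col)["approved"].mean(), "\n")
```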

Going into the details and correlations between features would be a good next step to understand the data better. However, taking a step back and thinking about it from an end user’s perspective: if a bank were to deploy an AI agent to decide who gets a loan and an applicant questioned the outcome, how would the bank explain the decision to the applicant?

In this example I trained a decision tree classifier to predict whether a given applicant would get a loan. It wasn’t the most accurate model I could have trained, but more to the point, if I showed the diagram below to explain its output, not even a bank employee could interpret it, let alone an end user.

Showing which features are underlying a model doesn’t mean it’s explainable…
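For reference, a rough sketch of how such a decision tree could be trained and drawn is shown below. The preprocessing is deliberately minimal (median imputation and one-hot encoding), the column names are assumptions about the Analytics Vidhya dataset, and the exact tree will differ from the one pictured above.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree

df = pd.read_csv("train.csv")  # assumed local copy of the Analytics Vidhya data

# Minimal preprocessing: drop the ID, impute numeric gaps, one-hot encode categoricals
X = df.drop(columns=["Loan_ID", "Loan_Status"])
num_cols = X.select_dtypes(include="number").columns
X[num_cols] = X[num_cols].fillna(X[num_cols].median())
X = pd.get_dummies(X, drop_first=True)
y = (df["Loan_Status"] == "Y").astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

tree = DecisionTreeClassifier(max_depth=4, random_state=42)
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))

# Even a shallow tree produces a diagram that is hard for a non-expert to read
plt.figure(figsize=(20, 8))
plot_tree(tree, feature_names=list(X.columns), class_names=["Rejected", "Approved"], filled=True)
plt.show()
```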

Even though the contents of this model are exposed, converting them into a consumable explanation that humans can understand still requires thought. Another way to visualise a model’s output is to look at SHAP values. As shown in the example below, models trained with protected attributes should be inspected thoroughly: in this case the machine learning model rejected the loan, and one of the key features behind the rejection was that the person was unmarried. There were probably more important factors, such as the loan amount or an income threshold, but if someone were to interpret this and say, “the model rejected your application because you’re not married”, it wouldn’t make sense, right? Thus, when designing AI-powered decision support systems, the user should always be at the forefront when determining the level of explanation the model needs.

SHAP value for an applicant that the model rejected for a loan. The true label in this case was an approval.
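A rough sketch of how SHAP values for a single applicant could be computed is shown below, continuing from the decision tree snippet above (so tree and X_test are assumed to exist). The shape of the values returned for classifiers differs between shap versions, which is why the indexing is handled conditionally; the same values can also be drawn as a force or waterfall plot, which is what the figure above shows.

```python
import shap

# Continuing from the decision tree sketch above (model `tree`, test features `X_test`)
explainer = shap.TreeExplainer(tree)
shap_values = explainer.shap_values(X_test)

i = 0  # a single applicant to explain

# Older shap versions return a list with one array per class, newer ones a 3D array;
# either way, take the contributions towards the "approved" class (index 1)
sv = shap_values[1][i] if isinstance(shap_values, list) else shap_values[i, :, 1]

# Rank the features pushing this prediction towards approval (+) or rejection (-)
contributions = sorted(zip(X_test.columns, sv), key=lambda t: abs(t[1]), reverse=True)
for feature, value in contributions[:5]:
    print(f"{feature:>25s}: {value:+.3f}")
```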

Conclusion

Though this is a toy example, and the prediction examined above was in fact a misclassification, it shows that ‘why’ is as important as ‘what’ when keeping an end user in mind during model design. In addition, model explainability does not always equate to understandability. Some use cases need model explanations and others do not, so it is important to understand which category a given use case falls into.

To conclude, maintaining robust AI governance over models and educating AI practitioners is necessary. If bias is the issue, bias-mitigation approaches such as those in IBM’s AI Fairness 360 toolkit provide ways to deal with imbalanced datasets that carry historical and representation bias. Secondly, coming up with more visually appealing and human-centric approaches to explaining models will be crucial for increasing trust in AI tools for decision support. Finally, the consequence or materiality of the use case, and who the end user impacted by a decision is, should be determined upfront, before deciding what level of explanation is needed.

References

(1) https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G

(2) https://www.bbc.com/news/business-50365609

(4) https://aif360.mybluemix.net/resources#glossary

(5) https://www.kaggle.com/var0101/human-centered-design-for-ai

(6) https://www.kaggle.com/alexisbcook/identifying-bias-in-ai

(7) https://arxiv.org/pdf/1901.10002.pdf

Loan Prediction Tutorial (Analytics Vidhya):

