Indirect Bias support in IBM Watson OpenScale

Manish Bhide · Published in Trusted AI · Dec 3, 2020

Fixing bias is easy: all I need to do is make sure that the model does not use features such as Gender, Ethnicity, etc. If the model is not aware of these attributes, then it cannot be biased on them.

At first glance the above argument looks reasonable; however, nothing could be further from the truth. Unfortunately, a lot of enterprises have learnt this the hard way. If you look at any of the recent issues related to bias in AI models, you will find that most of the models did not have ethnicity or gender as a feature, yet they still ended up exhibiting bias! This is because other features used by the model can be correlated with these attributes. E.g., Zip code and Income can have a strong correlation with Ethnicity. This might sound obvious, but correlation can creep in in unexpected and interesting ways, some of which are hard to detect. One such example is the feature “Has_car”, which indicates whether the person owns a car and which can be correlated with Ethnicity.
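To make this concrete, here is a minimal sketch (not part of OpenScale) of how such a hidden association could be quantified with Cramér's V; the toy data and the column names "Has_car" and "Ethnicity" are made up purely for illustration:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def cramers_v(series_a, series_b):
    """Cramér's V: association between two categorical columns (0 = none, 1 = perfect)."""
    contingency = pd.crosstab(series_a, series_b)
    chi2, _, _, _ = chi2_contingency(contingency, correction=False)
    n = contingency.to_numpy().sum()
    r, k = contingency.shape
    return np.sqrt((chi2 / n) / (min(r, k) - 1))

# Toy data: "Has_car" looks innocuous but perfectly tracks Ethnicity here.
df = pd.DataFrame({
    "Has_car":   ["yes", "yes", "no", "no", "yes", "no", "yes", "no"],
    "Ethnicity": ["A",   "A",   "B",  "B",  "A",   "B",  "A",   "B"],
})

print(cramers_v(df["Has_car"], df["Ethnicity"]))  # 1.0 -> a strong proxy for Ethnicity
```

A value close to 1.0 means the seemingly harmless feature acts as a proxy for the sensitive attribute, which is exactly how bias sneaks into models that never see the attribute directly.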

Indirect Bias Problem

The standard fairness metrics assume that fairness attributes such as Ethnicity, Gender, and Age are used as features of the model. However, many enterprises have an internal rule mandating that such attributes must not be used as model features. This raises an interesting problem: how do we measure fairness when the fairness attribute is not one of the features of the model? Detecting bias on attributes which are not used as features of the model is called Indirect Bias detection. In this blog post we outline how OpenScale supports this capability. Detection of Indirect Bias is a two-step process:

- Correlation Identification

- Indirect Bias detection

Correlation Identification: In the first step, the user provides OpenScale with the training data used to build the model, augmented with the fairness attribute values. E.g., consider a scenario where a user is building a credit risk model to predict whether a loan application is risky. When building the model, the data scientist uses training data which does not include the Gender of the applicant. For OpenScale to detect Indirect Bias, the data scientist needs to look up the Gender for each record in the training data and add it as an additional column. This augmented data is then provided to OpenScale. Please note that the model is still built without using Gender; OpenScale simply needs the Gender value for each record in the training data. Once this information is available, OpenScale automatically finds correlations between Gender and one or more features of the model.
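As a rough illustration of the augmentation step, the snippet below joins a separately sourced Gender column onto the training data before it is handed to OpenScale. The file names, column names, and join key are hypothetical, and the OpenScale configuration calls themselves are not shown:

```python
import pandas as pd

# Training data used to build the credit-risk model (contains no Gender column).
train_df = pd.read_csv("credit_risk_training.csv")       # model features + "Risk" label

# Gender looked up separately (e.g. from a CRM system), keyed by applicant id.
gender_df = pd.read_csv("applicant_gender.csv")          # columns: "applicant_id", "Gender"

# Add Gender as an extra column; the model itself is still trained without it.
augmented_df = train_df.merge(gender_df, on="applicant_id", how="left")

# This augmented file is what gets supplied to OpenScale so it can search for
# correlations between Gender and the actual features of the model.
augmented_df.to_csv("credit_risk_training_with_gender.csv", index=False)
```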

Indirect Bias Detection: The second step is the actual indirect bias computation. Recall that OpenScale needs access to the payload data, i.e., the inputs and outputs of the model at runtime. The credit risk model mentioned earlier does not use Gender as one of its features, so the Gender value will not be part of the model payload. For OpenScale to detect Indirect Bias, the user needs to send this additional Gender column along with the payload data. OpenScale then knows the Gender for each record scored by the model (even though the model did not use Gender when making the prediction). In addition, OpenScale has the correlations it found between Gender and one or more features of the model. With this information, it computes the fairness of the model using the same perturbation-based technique outlined in our earlier blog post.
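The sketch below illustrates the idea behind this step using a simplified metric: the disparate impact of favourable outcomes by Gender on the scored payload. It is not OpenScale's perturbation-based computation, and the file name, column names, and group labels are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical payload log: model inputs and predictions, plus the extra Gender
# column sent alongside the payload (the model itself never saw Gender).
payload_df = pd.read_csv("scored_payload_with_gender.csv")   # includes "prediction", "Gender"

favourable = "No Risk"   # outcome treated as favourable in the credit-risk example

# Rate of favourable outcomes per Gender group.
rates = (
    payload_df.assign(favourable=payload_df["prediction"] == favourable)
              .groupby("Gender")["favourable"]
              .mean()
)

# Disparate impact: favourable rate for the monitored group divided by the rate
# for the reference group. Values well below 1.0 suggest potential bias.
disparate_impact = rates["Female"] / rates["Male"]
print(rates)
print(disparate_impact)
```

On top of this kind of group-level comparison, OpenScale's perturbation-based technique additionally flips the values of the correlated features to check whether the model's predictions change, which is what turns the correlation information from step one into a bias measurement.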

In summary, OpenScale uses an innovative correlation-identification technique to monitor and detect bias in models which do not use the fairness attribute as one of their features. This technique improves on Counterfactual Fairness in that OpenScale finds the correlations between the features automatically, something that must be done manually by a domain expert in Counterfactual Fairness based approaches.
