Towards a Fair Classifier

Eliminating Discrimination Patterns in a Naive-Bayes Classifier

Harshit Pandey
Towards NeSy
5 min read · Mar 16, 2023


This blog introduces the paper Learning Fair Naive Bayes Classifiers by Discovering and Eliminating Discrimination Patterns (AAAI 2020). Feel free to check it out directly; the full citation is in the References below.


Introduction

Why is fairness crucial to ML tools?

With the increasing involvement of Machine Learning and AI in our day-to-day lives comes the big task of understanding the bias and discrimination that seep into these models through data. To cite a few examples: racial bias in policing (Mohler et al. 2018), recidivism prediction (Chouldechova 2017), gender bias in hiring (Lahoti et al. 2018), and credit rating (Henderson et al. 2015). The list is long and growing rapidly as we continue to integrate AI into more decision-making avenues.

Fig 1: The bias in Northpointe's tool for recidivism prediction (from Larson et al., ProPublica, 2016)

Fairness literature has come up with several solutions to limit differential treatment of similar individuals and to quantify a classifier's discriminatory nature through statistical measures. Some of the prominent approaches include statistical (demographic) parity, equalized odds, and individual fairness.

Most of the explored approaches assume that all feature values are present at prediction time (a complete assignment), which is seldom true in real-world scenarios. Such an assumption can miss instances of biased behaviour that appear only under partial assignments of the variables, and thus leaves gaps when evaluating models for bias and fairness.

This is the sweet spot where Discrimination Patterns (Choi et al. 2020) come in. The paper introduces a new notion of fairness that asks whether the model exhibits patterns in which the decision for an individual, characterized by (partial) feature observations, changes significantly when one or more sensitive attributes are disclosed.

What are Discrimination Patterns?

Let d be an assignment to the decision variable, x an assignment to some of the sensitive attributes, and y an assignment to some of the remaining attributes (characterizing a group of similar individuals). The Degree of Discrimination of the pattern (x, y) is defined as:

ΔP,d(x, y) = P(d | xy) − P(d | y)

The joint assignment of (x,y) forms a Discrimination pattern if
|ΔP,d(x, y)| > δ, where δ is a chosen fairness threshold.
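
To make the definition concrete, here is a minimal Python sketch (the function names and toy numbers are my own) of computing ΔP,d(x, y) for a naive Bayes model over binary variables. Partial evidence is handled naturally: under the naive Bayes assumption, unobserved features simply marginalize out.

```python
import numpy as np

def degree_of_discrimination(prior, cpts, x, y):
    """Compute Delta_{P,d}(x, y) = P(d=1 | x, y) - P(d=1 | y)
    for a naive Bayes model over binary variables.

    prior: P(d) as an array [P(d=0), P(d=1)]
    cpts:  dict feature -> 2x2 array where cpts[f][d, v] = P(f=v | d)
    x, y:  partial assignments (dicts) to sensitive / other features
    """
    def posterior(evidence):
        # Under naive Bayes, unobserved features marginalize out, so
        # P(d | e) is proportional to P(d) * prod_i P(e_i | d).
        joint = prior.copy()
        for feat, val in evidence.items():
            joint = joint * cpts[feat][:, val]
        return joint[1] / joint.sum()  # P(d=1 | evidence)

    return posterior({**x, **y}) - posterior(y)

# Hypothetical toy model: one sensitive feature S, one other feature F.
prior = np.array([0.4, 0.6])                      # P(d)
cpts = {"S": np.array([[0.7, 0.3], [0.4, 0.6]]),  # rows index d, columns the value
        "F": np.array([[0.5, 0.5], [0.2, 0.8]])}
print(degree_of_discrimination(prior, cpts, x={"S": 1}, y={"F": 0}))
```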

For more details on Discrimination Patterns, please read my previous blog:
Discrimination Patterns — A fine-grained notion to quantify fairness.

Main Idea

Finding Discrimination Patterns in a Probabilistic Classifier

Although the notion of δ-fairness can be applied to any distribution, discovering all discrimination patterns is a hard problem: a given distribution can have exponentially many of them.

Naive-Bayes Classifiers, with their strong independence assumption, allow for a tractable algorithm to discover and eliminate discrimination patterns.

The first method for finding all discrimination patterns is to naively traverse every possible pattern and check whether it is a discrimination pattern. This is still a task of exponential complexity, so a branch-and-bound technique is used to prune the search and converge faster.

An upper-bound function, which bounds the degree of discrimination achievable by observing more features beyond xy (while excluding a set of features E), is used to implement the branch-and-bound search: whenever the bound for a subtree falls below δ, that subtree can be skipped. Choi et al. give a linear-time algorithm to compute this upper bound for a given x, y.
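
The search itself can be sketched as a depth-first traversal with pruning. The skeleton below is my own illustration (reusing degree_of_discrimination from the earlier snippet); the upper_bound shown is a trivial placeholder, whereas the paper's actual contribution is a tight bound computable in linear time.

```python
def upper_bound(prior, cpts, x, y, remaining):
    # Placeholder: 1.0 is always a valid (but useless) upper bound on
    # |Delta|. Substitute the linear-time bound from Choi et al. to get
    # effective pruning.
    return 1.0

def find_discrimination_patterns(prior, cpts, sensitives, others, delta):
    """Branch-and-bound search for all patterns (x, y) with
    |Delta_{P,d}(x, y)| > delta, assuming binary features."""
    patterns = []

    def search(x, y, remaining):
        # A pattern needs at least one sensitive attribute observed.
        if x and abs(degree_of_discrimination(prior, cpts, x, y)) > delta:
            patterns.append((dict(x), dict(y)))
        # Prune: stop if no extension of (x, y) can exceed the threshold.
        if not remaining or upper_bound(prior, cpts, x, y, remaining) <= delta:
            return
        feat, rest = remaining[0], remaining[1:]
        search(x, y, rest)   # branch 1: leave feat unobserved
        for val in (0, 1):   # branches 2-3: observe feat with each value
            if feat in sensitives:
                search({**x, feat: val}, y, rest)
            else:
                search(x, {**y, feat: val}, rest)

    search({}, {}, list(sensitives) + list(others))
    return patterns
```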

How important is a particular discrimination pattern?

Although all discrimination patterns are worth paying attention to, the ones with a higher presence in the dataset should be given higher priority, as they are more likely to be observed at test time. To achieve this, we rank the discrimination patterns using the Divergence Score: the minimum Kullback-Leibler divergence between the current distribution P and a hypothetical distribution Q that is fair on the pattern xy and differs from P only on the assignments that satisfy the pattern (namely d,xy and ¬d,xy).

Formally, let P be a distribution over D ∪ Z, where Z are the features, S ⊆ Z the sensitive features, and x and y joint instantiations to subsets X ⊆ S and Y ⊆ Z \ X, respectively. The divergence score of pattern xy is:

Div(xy) = min over Q of KL(P ‖ Q), subject to |ΔQ,d(x, y)| ≤ δ and Q(z) = P(z) for every complete assignment z that does not satisfy xy.
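
As a rough sketch of how this score can be evaluated: if Q is only allowed to move the conditional P(d | xy) while keeping P(xy) and everything outside the pattern fixed, the KL divergence collapses to P(xy) times a binary KL, which is minimized by projecting P(d | xy) onto the fair band around P(d | y). The snippet below illustrates this simplification; it is my own reading of the definition, not the paper's exact derivation.

```python
import numpy as np

def binary_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    p, q = np.clip(p, eps, 1 - eps), np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def divergence_score(p_xy, p_d_given_xy, p_d_given_y, delta):
    # The closest fair value of Q(d | xy) is the projection of
    # P(d | xy) onto the band [P(d|y) - delta, P(d|y) + delta].
    target = np.clip(p_d_given_xy, p_d_given_y - delta, p_d_given_y + delta)
    # Q differs from P only on assignments satisfying xy, so
    # KL(P || Q) reduces to P(xy) * KL(P(d | xy) || Q(d | xy)).
    return p_xy * binary_kl(p_d_given_xy, target)
```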

Parameter learning for a fair classifier
Using this ranked set of discrimination patterns, we find the top-k patterns, which are then used during the parameter-learning phase of the classifier to eliminate the high-risk discrimination patterns. An algorithm for constrained parameter learning is discussed in Khosravi et al. 2019.

For a naive Bayes model with parameters θ, it can be shown that |ΔPθ,d(x, y)| ≤ δ for a threshold δ ∈ [0, 1] iff the following holds:

−δ · Pθ(xy) · Pθ(y) ≤ Pθ(d, xy) · Pθ(y) − Pθ(d, y) · Pθ(xy) ≤ δ · Pθ(xy) · Pθ(y)

This follows by clearing the denominators in ΔPθ,d(x, y) = Pθ(d | xy) − Pθ(d | y); since each of these probabilities is a polynomial in θ, the fairness condition becomes a pair of polynomial constraints that can be added to the learning problem.
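
One practical payoff of this form is that the fairness check can be written without divisions, keeping it polynomial in the parameters. A tiny illustration (names are mine):

```python
def fairness_constraint_satisfied(p_d_xy, p_xy, p_d_y, p_y, delta):
    """Check |P(d | xy) - P(d | y)| <= delta with denominators cleared,
    so the expression stays polynomial in the model parameters."""
    return abs(p_d_xy * p_y - p_d_y * p_xy) <= delta * p_xy * p_y
```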
The main theme of the algorithm is to iterate between parameter learning and constraint extraction, gradually adding fairness constraints to the optimization. The parameter-learning component is as described above, with a constraint added for each discrimination pattern extracted so far; constraint extraction uses the top-k pattern miner. At each iteration, we learn the maximum-likelihood parameters subject to the current fairness constraints, then use the updated parameters to find k more patterns and add them to the constraint set for the next iteration. This process repeats until the pattern miner finds no more discrimination patterns.
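
A minimal sketch of this loop, assuming the caller supplies a constrained maximum-likelihood learner and a top-k pattern miner (both function names are mine; the actual learner is the one from Khosravi et al. 2019, and the miner is the branch-and-bound search above):

```python
def learn_fair_parameters(fit_mle, mine_top_k, delta, k):
    """Iterate between constrained parameter learning and constraint
    extraction (a cutting-plane scheme).

    fit_mle(constraints)        -> max-likelihood parameters subject to
                                   the fairness constraints so far.
    mine_top_k(theta, delta, k) -> up to k discrimination patterns of the
                                   model with parameters theta.
    """
    constraints = []
    while True:
        theta = fit_mle(constraints)
        patterns = mine_top_k(theta, delta, k)
        if not patterns:
            return theta  # no patterns remain: the model is delta-fair
        constraints.extend(patterns)  # each pattern yields one constraint
```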

Although the algorithm has exponential worst-case complexity, it has been shown empirically that the search converges in far fewer iterations.

Results from Choi et al.

Log-likelihood and the number of remaining discrimination patterns after each iteration of learning on the COMPAS dataset, with δ = 0.1

Choi et al. compare an unconstrained maximum-likelihood model with a model in which the sensitive variables are made independent of the decision variable and the remaining parameters are learned using the max-likelihood criterion ("independent"). These two models lie at opposite ends of the fairness-accuracy trade-off. The δ-fair model falls between these extremes, balancing approximate fairness and prediction power.

Log-likelihood of models learned without fairness constraints, with the δ-fair learner (δ = 0.1), and by making sensitive variables independent from the decision variable.

References

Choi, Y., Farnadi, G., Babaki, B. and Van den Broeck, G. 2020. Learning Fair Naive Bayes Classifiers by Discovering and Eliminating Discrimination Patterns. Proceedings of the AAAI Conference on Artificial Intelligence. 34, 06 (Apr. 2020), 10077–10084. DOI:https://doi.org/10.1609/aaai.v34i06.6565.

Pasha Khosravi, Yitao Liang, YooJung Choi, and Guy Van Den Broeck. 2019. What to expect of classifiers? reasoning about logistic regression with missing features. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI’19). AAAI Press, 2716–2724.
