Adversarial Gender Debiasing (AGD): An Efficient Gender Debiasing Technique

Shalvi Desai · Published in Clique Community · Jul 10, 2020

Determining and removing gender bias in face recognition models, for the construction of transparent and robust models that encourage fair and accountable predictions.

Table Of Contents

  1. Introduction
  2. Plausible reasons for gender bias
  3. Correlation-Based PCA
  4. Adversarial Gender Debiasing (AGD)
  5. Performance Analysis
  6. Effect of Triplet Probabilistic Embedding (TPE)
  7. Conclusion
  8. References

1. Introduction

Since the introduction of deep convolutional neural networks, the accuracy of face recognition models has shot up significantly, and the applications of this technology have grown exponentially. Still, we find some major issues in the current technology: several research works have found that face recognition models show bias towards a particular race, gender, or ethnicity. Gender is often the most important facial feature when classifying images, so it has a big impact on the performance of the algorithm. We propose an approach that removes gender bias while reasonably maintaining the accuracy of the model.

2. Plausible reasons for gender bias

To mitigate this issue, let's dive deeper for a better and fairer understanding. Gender bias here means that the model achieves higher accuracy for one gender than for the other, even though both are trained and classified by the same model. Let us analyse some plausible reasons for gender bias:

2.1 Imbalanced or skewed dataset

2.2 Implicitly encoding gender information

2.1 Imbalanced or skewed dataset

We consider two networks, Network A and Network B, each a ResNet-101 trained on a different dataset, and compare their performance before and after balancing the training data.

Network A: ResNet-101 trained on MS1MV2 with Additive Angular Margin (ArcFace) loss [6]. There are 59,563 males and 22,499 females in this dataset.

Network B: ResNet-101 trained on a mixture of UMDFaces [2], UMDFaces-Videos [3], and MS-Celeb-1M [4], with crystal loss [5]. There are 39,712 males and 18,308 females in this dataset.

Initially, we compare male-male and female-female pairs on the original datasets. Next, we extract training examples with different ratios of female and male labels for Network B and compare the results: first a 50% male / 50% female split, then a 10% male / 90% female split. (A sketch of producing such splits appears below.)
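This is a minimal sketch, assuming each training record carries a gender label; the record format and function name are illustrative, not code from the paper:

```python
import random

def subsample_by_gender(records, female_frac, total, seed=0):
    """Subsample training records to a target gender ratio.

    `records` is assumed to be a list of dicts with a 'gender' key
    ('M' or 'F'); `female_frac` is the desired fraction of female samples.
    """
    rng = random.Random(seed)
    females = [r for r in records if r["gender"] == "F"]
    males = [r for r in records if r["gender"] == "M"]
    n_female = int(total * female_frac)
    return rng.sample(females, n_female) + rng.sample(males, total - n_female)

# e.g. the 90% female / 10% male split discussed below:
# train_set = subsample_by_gender(all_records, female_frac=0.9, total=20000)
```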

We obtain the following results:

[Figure: performance statistics for male-male and female-female verification]

From figure (a) we can observe that male-male face verification is more accurate than female-female verification. In figure (b), there is still a large gap between the solid and dotted red lines, which clearly indicates that balancing the dataset does not help improve performance at all. However, when we use a ratio of 90% female and 10% male, we see a significant improvement in the performance of the algorithm.
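To quantify the gap between the two curves, one can compute verification accuracy separately for male-male and female-female pairs. A minimal sketch, assuming L2-normalized embeddings and an illustrative similarity threshold:

```python
import numpy as np

def pair_accuracy(emb_a, emb_b, is_same, threshold=0.5):
    """Verification accuracy over pairs of L2-normalized embeddings.

    emb_a, emb_b: (N, d) arrays, one face pair per row.
    is_same: (N,) boolean array, True for genuine (same-identity) pairs.
    """
    sims = np.sum(emb_a * emb_b, axis=1)   # cosine similarity
    return float(np.mean((sims > threshold) == is_same))

# the gender gap is then the difference between the two subgroups, e.g.:
# gap = pair_accuracy(mm_a, mm_b, mm_same) - pair_accuracy(ff_a, ff_b, ff_same)
```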

2.2 Implicitly encoded gender information

The second plausible reason is that gender information is implicitly passed to the model during training. To solve this issue, we try to make the model agnostic to gender: before classification, we remove the features that describe the gender of the subject. For this we have two approaches:

  • Correlation-Based PCA
  • Adversarial Gender Debiasing (AGD)

3. Correlation-Based PCA

Our main goal is to remove gender-specific features from the dataset, and this is a very naive approach to that end. We first compute the eigenspace of the features, then isolate the eigenvectors that encode gender-specific information, and finally transform the features using only the remaining subspace.
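A minimal NumPy sketch of this idea follows; the correlation threshold and variable names are illustrative assumptions rather than values from the paper:

```python
import numpy as np

def remove_gender_eigenvectors(features, genders, corr_threshold=0.3):
    """Drop eigenvectors that correlate with gender, keep the rest.

    features: (N, d) identity features; genders: (N,) array of 0/1 labels.
    Eigenvectors whose per-sample coordinates correlate with gender above
    `corr_threshold` in absolute value are discarded.
    """
    centered = features - features.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    coords = centered @ eigvecs               # PCA coordinates per sample
    corrs = np.array([np.corrcoef(coords[:, i], genders)[0, 1]
                      for i in range(coords.shape[1])])
    keep = np.abs(corrs) < corr_threshold     # low gender correlation only
    return centered @ eigvecs[:, keep]        # project onto the remaining subspace
```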

4. Adversarial Gender Debiasing (AGD)

[Figure: skeletal structure of the AGD model]

Let me first introduce the elements of our algorithm. We pass our dataset through a pre-trained neural network, and its output, denoted f_in, is fed to our model M, which consists of a single PReLU layer. The output of M, denoted f_out, is sent to the ensemble B, which consists of K gender-predicting models (GPMs).

Our aim is to train M in such a way that whenever it passes the feature f_out to ensemble B, the predicted gender probability comes out to 0.5, indicating that the ensemble cannot discriminate gender from the feature. Such features are said to be completely de-biased: when they are passed to the classifier, it verifies faces without any gender bias, and our motive is fulfilled.

Now let us discuss how to train the model using feedback from the ensemble and the classifier. We define a loss function for each segment. L_br is the total loss for our model M, according to which its weights are updated; it combines the loss of the classifier and the de-biasing loss of the ensemble. L_class is the loss of classifier C, and L_deb is the loss of ensemble B.

L_br(φ_C, φ_M, φ_B) = L_class(φ_C, φ_M) + λ · L_deb(φ_M, φ_B)

The basic idea is to penalize M with respect to the strongest GPM, the one it was not able to fool, and to keep updating against L_deb until the GPMs output a probability of 0.5. As the equation shows, L_class back-propagates through the weights of the classifier and of the model, while L_deb back-propagates through the weights of ensemble B and of the model; the total loss L_br combines the two. A sketch of one such training step is given below.
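To make this concrete, here is a minimal PyTorch-style sketch of one AGD update, under explicit assumptions: the feature dimension, number of identities, GPM architecture (a small MLP each), and the exact form of L_deb (cross-entropy against a uniform 0.5 target for the strongest GPM) are illustrative choices rather than the paper's exact settings, and the alternate phase that trains the ensemble B as ordinary gender classifiers is omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM, NUM_IDS, K = 512, 10000, 5   # assumed sizes, for illustration

# Model M: a single PReLU layer producing the de-biased feature f_out
model_M = nn.Sequential(nn.Linear(FEAT_DIM, FEAT_DIM), nn.PReLU())
# Classifier C over identity labels
classifier_C = nn.Linear(FEAT_DIM, NUM_IDS)
# Ensemble B: K gender-predicting models (small MLPs are an assumption)
ensemble_B = nn.ModuleList([
    nn.Sequential(nn.Linear(FEAT_DIM, 128), nn.ReLU(), nn.Linear(128, 1))
    for _ in range(K)
])

opt = torch.optim.Adam(
    list(model_M.parameters()) + list(classifier_C.parameters()), lr=1e-4)

def agd_step(f_in, id_labels, lam=1.0):
    """One update of M and C with L_br = L_class + lambda * L_deb."""
    f_out = model_M(f_in)
    l_class = F.cross_entropy(classifier_C(f_out), id_labels)

    # gender probability predicted by each GPM in the ensemble
    probs = [torch.sigmoid(gpm(f_out)) for gpm in ensemble_B]
    # the strongest GPM is the one M failed to fool,
    # i.e. whose prediction is farthest from 0.5 on this batch
    conf = [float((p - 0.5).abs().mean()) for p in probs]
    p_star = probs[conf.index(max(conf))]
    # cross-entropy against a uniform target: minimal when p_star == 0.5
    l_deb = -0.5 * (torch.log(p_star + 1e-8)
                    + torch.log(1 - p_star + 1e-8)).mean()

    loss = l_class + lam * l_deb
    opt.zero_grad()
    loss.backward()
    opt.step()
    return float(loss)
```

Note that only M and C are updated here; in practice the GPMs in B would be periodically retrained so they remain strong adversaries.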

After applying this algorithm, the results are as follows:

[Figure: performance statistics after applying AGD]

The above results are computed for Network A and Network B described earlier, and we can see how our technique reduces the bias of the model.

5. Performance Analysis

Removing gender predictability from the model has a repercussion: the overall performance of the model decreases. Referring to the stats in the image above, we can see that the performance of Network B degrades far more than that of Network A. To determine the cause, we computed an eigenspace for all the features of Network B using PCA and then calculated the gender correlation of every eigenvector. We found a very high gender correlation for these eigenvectors, which signifies a stronger entanglement of identity and gender in Network B than in Network A. Thus Network B is affected more than Network A when the network is made agnostic to gender (i.e. after applying AGD), since its verification depends heavily on gender-encoded information.
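One way to make this diagnosis concrete is to reuse the correlation computation from Section 3 and average the absolute gender correlation over all eigenvectors. The aggregate score below is an illustrative choice, not a metric defined in the paper:

```python
import numpy as np

def gender_entanglement_score(features, genders):
    """Mean absolute gender correlation over the feature eigenspace."""
    centered = features - features.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    coords = centered @ eigvecs
    corrs = [abs(np.corrcoef(coords[:, i], genders)[0, 1])
             for i in range(coords.shape[1])]
    return float(np.mean(corrs))

# a higher score for Network B than for Network A would indicate that
# identity and gender are more entangled in Network B's features
```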

6. Effect of Triplet Probabilistic Embedding (TPE)

[Figure: improved performance of Network B after applying TPE to its features]

The identity features in Network B are not used directly in the model; instead they undergo Triplet Probabilistic Embedding (TPE), which learns discriminative, low-dimensional representations of the inputs. This dramatically improves the performance of the model. However, along with the increase in overall verification performance, the bias of the model also increases compared to the previous approach.
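For intuition, the sketch below captures the gist of a TPE-style embedding in PyTorch: a linear projection trained so that, for each (anchor, positive, negative) triplet, the probability of the anchor being closer to the positive than to the negative is maximized. The dimensions and normalization are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TPE(nn.Module):
    """Linear low-dimensional embedding trained with a triplet probability loss."""

    def __init__(self, in_dim=512, out_dim=128):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x):
        return F.normalize(self.proj(x), dim=-1)

def tpe_loss(model, anchor, positive, negative):
    """-log P(anchor is closer to positive than to negative)."""
    a, p, n = model(anchor), model(positive), model(negative)
    s_ap = (a * p).sum(dim=-1)   # anchor-positive similarity
    s_an = (a * n).sum(dim=-1)   # anchor-negative similarity
    # P = softmax over the two similarities = sigmoid(s_ap - s_an),
    # so -log P = softplus(s_an - s_ap)
    return F.softplus(s_an - s_ap).mean()
```

After training, the projected features would replace the raw identity features for verification.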

7. Conclusion

In my opinion, face recognition models can be protected from gender bias without significantly harming their overall performance. One should also try to de-bias the impact of other dominant attributes, such as race or ethnicity, using different de-biasing losses and combinations of various algorithms and techniques, to make the model more robust and generic.

8. References

[1] Prithviraj Dhar, Joshua Gleason, Hossein Souri, Carlos D. Castillo, Rama Chellappa. An adversarial learning algorithm for mitigating gender bias in face recognition, 2020.

[2] Bansal, A., A. Nanduri, C. D. Castillo, et al. UMDFaces: An annotated face dataset for training deep networks. In 2017 IEEE International Joint Conference on Biometrics (IJCB), pages 464–473. IEEE, 2017.

[3] Bansal, A., C. D. Castillo, R. Ranjan, et al. The do's and don'ts for CNN-based face verification. In Proceedings of the IEEE International Conference on Computer Vision, pages 2545–2554. 2017.

[4] Guo, Y., L. Zhang, Y. Hu, et al. MS-Celeb-1M: A dataset and benchmark for large-scale face recognition. In European Conference on Computer Vision, pages 87–102. Springer, 2016.

[5] Ranjan, R., A. Bansal, J. Zheng, et al. A fast and accurate system for face detection, identification, and verification. IEEE Transactions on Biometrics, Behavior, and Identity Science, 1(2):82–96, 2019.

[6] Deng, J., J. Guo, N. Xue, et al. ArcFace: Additive angular margin loss for deep face recognition. In CVPR, 2019.

