Awesome AI papers: Make machines less sexist

Rowen Lee
Published in Nurture.AI
4 min read · Apr 23, 2018

This article is part of a weekly series of AI paper summaries. Check out more at the nurture.ai medium publication or the official nurture.ai website.

[Figure: Model architecture]

Motivation of paper

One common problem in prediction tasks is bias amplification, where a bias present in the training data is learned by the model and then amplified at test time. This is especially concerning when the bias involves attributes like gender and race. The problem was highlighted in a 2017 paper, which found that models predict an overwhelming proportion of cooking scenes to involve women, even when images of men cooking are present. That paper proposed a solution requiring the gender distribution of the training dataset to match that of the test dataset, which is impractical because one might not have control over the test data.

Insight of paper

This paper focuses on gender bias in image captioning tasks where the input images show a person together with accompanying objects. The authors argue that image caption generators should predict gender-specific words like “man” or “woman” based on the person’s appearance, not on the objects that accompany the person in the image.

How insight was harnessed

The authors introduce the Equalizer model, which adds two complementary loss terms to an existing image captioning model called Show and Tell. The added terms are the Confident Loss (Conf) and the Appearance Confusion Loss (ACL). The former increases the model’s confidence when the gender of a person is visually obvious in an image. The latter encourages the model to be uncertain, and to opt for gender-neutral terms, when the person’s appearance is obscured.
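To make this concrete, here is a rough PyTorch-style sketch of what such a combined objective could look like. This is not the authors’ released code: the function name, tensor shapes, the way the person region is masked out, and the weighting factors are assumptions for illustration, and the exact formulations of Conf and ACL in the paper differ in the details.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of an Equalizer-style objective (names and shapes assumed).
# caption_logits: (T, V) decoder logits for the caption on the full image
# masked_logits:  (T, V) logits from the same decoder on an image whose person
#                 region has been masked out
# targets:        (T,) ground-truth word indices
# woman_idx, man_idx: vocabulary indices of the gendered words being equalized
def equalizer_loss(caption_logits, masked_logits, targets,
                   woman_idx, man_idx, alpha=1.0, beta=1.0, eps=1e-8):
    # Standard cross-entropy captioning loss on the full image.
    ce = F.cross_entropy(caption_logits, targets)

    probs = F.softmax(caption_logits, dim=-1)
    masked_probs = F.softmax(masked_logits, dim=-1)
    gendered = (targets == woman_idx) | (targets == man_idx)
    n_gendered = gendered.float().sum().clamp(min=1)

    # Appearance Confusion Loss: with the person masked out, the model should
    # not be able to tell "man" from "woman", so penalise any gap between the
    # two probabilities at gendered time steps.
    gap = (masked_probs[:, woman_idx] - masked_probs[:, man_idx]).abs()
    acl = (gap * gendered.float()).sum() / n_gendered

    # Confident Loss: on the full image, penalise probability mass placed on
    # the wrong gender word relative to the correct one, pushing the model to
    # be confident when visual evidence of gender is actually present.
    idx = torch.arange(len(targets))
    wrong = torch.where(targets == woman_idx,
                        torch.full_like(targets, man_idx),
                        torch.full_like(targets, woman_idx))
    p_correct = probs[idx, targets]
    p_wrong = probs[idx, wrong]
    conf = (p_wrong / (p_correct + eps)) * (-torch.log(p_correct + eps))
    conf = (conf * gendered.float()).sum() / n_gendered

    return ce + alpha * acl + beta * conf
```

The key design choice is that ACL only looks at the masked image, where the gender evidence has been removed, while Conf only looks at the full image, so the two terms pull in opposite directions without conflicting.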

Results

The models below are trained on images containing people and are compared by their error rates when describing men or women in test images.
(1) Equalizer with standard cross entropy loss, Conf and ACL;
(2) Equalizer without Conf;
(3) Equalizer without ACL;
(4) Fine-tuned Show and Tell model (baseline model);
(5) Baseline model trained on equal proportions of women and men images;
(6) Baseline model where the loss on gender words is up-weighted by a constant factor.

The first model achieved the lowest error rate during testing. Furthermore, its predicted gender ratio is most similar to the ground truth.

The authors also used the Grad-CAM technique to determine which parts of the image contributed most to the generation of gender-specific words. It showed that the Equalizer tends to focus on the person’s appearance rather than on contextual information when predicting gender, meaning the Equalizer is “right for the right reasons”.
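For readers curious how such attributions are produced, below is a minimal Grad-CAM sketch (not the authors’ implementation): the gradient of a gendered word’s score is taken with respect to a convolutional feature map, average-pooled into per-channel weights, and used to build a coarse heatmap of which image regions drove the prediction. The `model`, `conv_layer` and `target_score_fn` arguments are placeholders, not names from the paper.

```python
import torch.nn.functional as F

def grad_cam(model, conv_layer, image, target_score_fn):
    """Minimal Grad-CAM sketch: a heatmap of where the score returned by
    `target_score_fn` (e.g. the logit of the word "woman") is most sensitive.
    All arguments are placeholders for whatever captioning model is used."""
    activations, gradients = [], []

    # Capture the convolutional feature maps and their gradients.
    h1 = conv_layer.register_forward_hook(lambda m, i, o: activations.append(o))
    h2 = conv_layer.register_full_backward_hook(lambda m, gi, go: gradients.append(go[0]))

    score = target_score_fn(model(image))   # scalar score for the target word
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()

    A = activations[0]                            # (1, C, H, W) feature maps
    dA = gradients[0]                             # (1, C, H, W) their gradients
    weights = dA.mean(dim=(2, 3), keepdim=True)   # channel weights via global average pooling
    cam = F.relu((weights * A).sum(dim=1))        # weighted combination, then ReLU
    cam = F.interpolate(cam.unsqueeze(1), size=image.shape[-2:],
                        mode="bilinear", align_corners=False)
    return cam / (cam.max() + 1e-8)               # normalised (1, 1, H, W) heatmap
```

Overlaying this heatmap on the input image is what lets the authors check whether the evidence for a gender word comes from the person or from the surrounding objects.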

Industry Implications

Models like the Equalizer not only improve the accuracy of image captioning systems; they also help counteract gender stereotypes that humans subconsciously subscribe to. This ultimately contributes to creating a future that upholds diversity, inclusion and equality.

Questions left open

The authors claim that the Equalizer is general enough to be used in alternative frameworks. One question that follows is whether the Equalizer could be extended to eliminate ethnic bias. Also, if the Equalizer can be adapted to recommender systems, how should it be designed to avoid biases while recommending products? More importantly, would it be profitable to do so?

Thought Provoking Questions

  1. Research has shown that humans are naturally prone to racial and gender prejudices. Will we become less so if we successfully create models that are free from these prejudices?
  2. Datasets are easily subjected to biases. For example, a collection of images depicting programmers might have an overwhelming proportion of males. Should we strive to balance out gender proportions in these datasets, or create models that are less prone to gender biases?
  3. Is it acceptable to create models that infer a person’s gender based on the colour or type of the person’s clothing? Will your opinion change if making such inferences increases the accuracy of model predictions?
  4. The paper mentions that some training image descriptions, which were written by humans, already contain gender bias. For example, human annotators described images of a person in a snowboarding outfit as a man even though the person’s face was obscured. Could it be that human stereotypes are to blame for gender-biased models?

Read the full paper here.

Interested in reading more? Head over to nurture.ai to view more weekly paper summaries and discuss interesting questions left open by the paper here.

Rowen is a research fellow at Nurture.ai. She believes the barrier to understanding powerful knowledge is convoluted language and excessive jargon. Her aim is to break down difficult concepts into easily digestible pieces.
