How We’ve Taught Algorithms to See Identity

Constructing Race and Gender in Image Databases for Facial Analysis

Morgan Klaus Scheuerman
May 19, 2020


A portrait of a Black woman looking into the camera lens. An animated GIF switches between the labels for “Gender” and “Race.”
GIF by Morgan Klaus Scheuerman. Original image by Ricardo Velarde on Unsplash.

This blog post summarizes our paper about how race and gender are operationalized in image databases used for training and evaluating facial analysis systems. The paper will be presented at the 23rd ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), a top venue for social computing scholarship, and published in the journal Proceedings of the ACM (PACM). A free PDF version is located here.

Race and gender are a part of our everyday reality. We all experience the visible external features associated with race and gender when we interact with the world. As a result of these interactions, we all also hold our own affinities with race, gender, and a multitude of other identities. Race and gender have also always been incorporated into technical systems. From the U.S. Census to Facebook, we see technical representations of these identities everywhere. They have also now become commonly embedded into databases as features used to train machine-learning based algorithms. Perhaps the most salient example of this is in facial analysis technology, a computer vision method for analyzing information about human faces.

Within the last decade, facial analysis technology has quickly transitioned from a theoretical research problem to commercial reality. Facial recognition is already widely used, embedded in consumer technologies (e.g., social media, device locking) and employed by law enforcement. Facial classification is used to target marketing campaigns at specific demographics and to track physical consumer behavior inside stores. Historically, formally classifying race and gender has also resulted in harm: for example, differential access to healthcare for marginalized genders and the targeted extermination of racial groups.

The outputs produced by facial analysis systems are premised on their training and evaluation data: the images used to “teach” the system what a subject looks like. In our study, we analyze race and gender in training databases from a critical discursive perspective. Specifically, we investigate how race and gender are codified into 92 image databases, many of which are publicly accessible and widely used in computer vision research.

To do this, we analyzed how race and gender are represented in image databases and how those representations are derived. What categories are employed to represent race and gender? What sources are being used to derive race and gender labels? How are database authors describing the annotation procedures for race and gender categories?

Facial analysis systems rely on the underlying utility, reliability, and accuracy of the ground truth data provided. We observed very few instances where database authors documented how they determined information about race and gender in facial images.

A table showing the findings for both annotations and annotation descriptions for race and gender in our dataset.
The above table shows a count of sources and annotation descriptions in explicitly annotated databases. Only 2 databases contained both sources and annotation explanations for both race and gender.

We saw vast inconsistencies across racial categories and methods for determining the race of a person in image data, with some databases employing three racial categories (e.g., MS-Celeb-1M) and others employing seven (e.g., 10K US Adult Faces). Meanwhile, consistent with prior work, we found uniformly binary categories for gender.

At the same time, we found that the majority of image databases rarely contain underlying source material explaining how those identities are defined. Further, when images are annotated with race and gender information, database authors rarely describe the process of annotation, such as who is doing the annotating and what visual cues lead them to make specific judgments. This makes it difficult, if not impossible, for outsiders to understand how identity-based decisions are made when constructing databases.

We posit that this lack of engagement with ground truth race and gender data undermines the very purpose of image databases: to be usable, reliable, and accurate. Further, it ignores the sociohistorical reality of race and gender, including their weaponization in historical classification systems. We thus observed race and gender categories — categories that are socially and historically complex — portrayed as obvious, static, and apolitical.

In our paper, we discuss opportunities for improving the current state of image database construction and documentation. By encouraging database authors to embrace a deeper engagement with their own positionality in constructing and annotating race and gender, we can make databases more useful and transparent to machine learning practitioners. By actively acknowledging the sociohistorical context their databases stem from, we can attempt to avoid replicating the harms done by historical classification systems. By embracing more flexible forms of categorizing, like self-annotation of identity information, we can avoid overly simplistic binary categories. And by explicitly stating limitations of use, we can help protect marginalized groups that might otherwise be harmed by the deployment of facial analysis technologies.

Morgan Klaus Scheuerman, Kandrea Wade, Caitlin Lustig, and Jed R. Brubaker. 2020. How We’ve Taught Algorithms to See Identity: Constructing Race and Gender in Image Databases for Facial Analysis. Proc. ACM Hum.-Comput. Interact. 4, CSCW1, Article 58 (May 2020), 35 pages.