The Pitfalls of Learning Quickly: When Generalizing Turns into Stereotyping
Recently, I wrote a blog post titled, “Building Machines that Learn and Think Like People”, based on a paper of the same name. In it, I discussed some key skills lacking in current machine learning models that enable humans to learn quickly by generalizing acquired knowledge. Reading the paper and writing the post got me thinking about how some of the skills that empower us to learn quickly through generalization may also lead us to stereotype.
In this post, drawing from the literature in cognitive science, I will argue that compositionality, learning-to-learn, causality, and inference enable us to generalize previous knowledge and speed up how quickly we learn. I will use playing the game of Super Mario as an example of when this enables efficient and accurate learning. However, I will argue that generalization works in the game of Super Mario because the characters and objects within it are relatively simple. Following from this, I posit that successful generalization within the game of Super Mario can be seen as successful stereotyping. I finish this post with a real-world example of a man being stereotyped as a terrorist to demonstrate that when characters and objects become more complex, this form of generalization or stereotyping leads to inaccurate inferences.
The Utility of Generalization
Cognitive scientists have shown that by infancy, humans can both differentiate harmful, helpful, and neutral behavior, and differentiate animate objects from inanimate objects using low-level cues. Now, imagine that you are playing the game of Super Mario. As you play, you use these skills to quickly differentiate animate objects from inanimate objects. Further, based on how animate objects interact with you, e.g. whether they lead to reward or death, you learn to predict whether they are harmful, helpful, or neutral. But you don’t only learn that objects are harmful, helpful, or neutral.
You learn rich representations for the objects you encounter.
For example, as you encounter different characters (say Koopa Troopa, Iggy Koopa, Dry Bones, etc.), you learn representations for each that richly detail what they are made of and how they relate.
This is part of what allows you to apply knowledge of encountered characters to new, unseen characters. In order to learn these rich representations, you employ learning-to-learn, compositionality, causality, and inference.
Learning-to-learn, compositionality, causality, and inference
Learning-to-learn is the idea that we apply previously learned knowledge to new learning situations. For example, as you learn to represent the Koopa Troopa character, you might reuse your knowledge of concepts such as mouths, eyes, faces, and shells. To combine these parts to represent something more complex, you employ compositionality. Compositionality is the idea that an object can be broken down (potentially infinitely) into its parts and their relations. That is, you might break down Koopa Troopa into its face, its shell, and its beak, and then represent Koopa Troopa as some combination of these components and their relations.
Causality is the idea that you can model something with an abstraction of the real-world process that generated it. When you use compositionality and learning-to-learn to break down an object into its parts and relations, modeling the object as a creation of these parts can be seen as modeling it in a causal way. Continuing with our example, you might learn that Koopa Troopa characters are generated as compositions of more primitive parts such as eyes, beaks, and shells.
Causality is a very powerful modeling system, and can capture things beyond visual relations. For example, a causal model for species can capture genetic ancestry, i.e. which species came from which. As such, a causal model for characters in the game might capture that multiple characters come from the same general category — for example, “bad guys”.
Causality can also be combined with inference. When there is some causal (or generative) process, inference is the process of going in the “reverse” direction: from observed effects back to their likely causes. For example, as you learn the various causes of death in the game, you are creating a causal model for it (e.g., hitting Koopa Troopa, falling, etc. all lead to death). When you die and reason about the cause of your death, you are doing inference. Inference is powerful because it can accompany all the causal models you create, which, as shown by my examples above, can be quite diverse.
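To make the “reverse” direction concrete, here is a minimal sketch of inferring the cause of a death with Bayes' rule. The causes, the probabilities, and the function name `infer_cause` are all assumptions invented for illustration; none of them come from the actual game. The sketch also assumes the listed causes are mutually exclusive and are the only ways to die.

```python
# Hypothetical numbers (not measured from the game):
# how often each event happens on an attempt ...
PRIOR = {"hit Koopa Troopa": 0.30, "fell into pit": 0.10, "ran out of time": 0.05}
# ... and how often that event is fatal when it happens (the causal direction).
P_DEATH_GIVEN = {"hit Koopa Troopa": 0.8, "fell into pit": 1.0, "ran out of time": 1.0}


def infer_cause(prior, likelihood):
    """Go in the reverse direction: return P(cause | death) for each cause."""
    # Joint probability of each cause occurring and leading to death.
    joint = {cause: prior[cause] * likelihood[cause] for cause in prior}
    total = sum(joint.values())  # P(death) under the stated assumptions
    return {cause: p / total for cause, p in joint.items()}


posterior = infer_cause(PRIOR, P_DEATH_GIVEN)
most_likely = max(posterior, key=posterior.get)
print(most_likely, round(posterior[most_likely], 3))
```

The causal model runs from cause to death; the function only reweights those same numbers to answer the opposite question.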
You constantly experience objects and learn to classify them: first by some primary categories, such as animate or inanimate, and then by other categories that you’ve learned (e.g. enemy vs. friend). In order to categorize and represent complex objects, you break them down into their components. And you can use all of this (the components you see and how well they fit into causal models you know) to infer category membership for the objects you experience. Inferring an object’s category can then give you insight into which attributes you can predict for it. For example, you might see a variant of Koopa Troopa that has a red shell instead of a green shell, break it down into its parts, and infer that it belongs to the Koopa Troopa category. This would then allow you to predict attributes and actions for this new character based on your learned representation for the Koopa Troopa character.
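The red-shell example can be sketched as a tiny program. This is only one possible reading of the idea, not a model from the cognitive science literature: the part inventories, the attribute labels, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
# Hypothetical part inventories and attributes for two known categories.
KNOWN_CATEGORIES = {
    "Koopa Troopa": {
        "parts": {"beak", "eyes", "feet", "shell", "green"},
        "attributes": {"enemy", "walks"},
    },
    "Super Mushroom": {
        "parts": {"cap", "eyes", "stem"},
        "attributes": {"helpful", "slides"},
    },
}


def jaccard(a, b):
    """Share of parts two objects have in common (0 = none, 1 = identical)."""
    return len(a & b) / len(a | b)


def categorize(parts):
    """Pick the known category whose parts overlap most with the new object,
    then predict the new object's attributes from that category."""
    best = max(
        KNOWN_CATEGORIES,
        key=lambda name: jaccard(parts, KNOWN_CATEGORIES[name]["parts"]),
    )
    return best, KNOWN_CATEGORIES[best]["attributes"]


# A never-before-seen variant: same parts, but a red shell instead of green.
red_variant = {"beak", "eyes", "feet", "shell", "red"}
category, predicted = categorize(red_variant)
print(category, sorted(predicted))
```

The new character shares four of six distinct parts with Koopa Troopa and only one of seven with the mushroom, so it inherits the Koopa Troopa attributes without ever having been encountered before.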
As another example, consider experiencing Koopa Troopa, Lakitu, and Hammer Bro. As you learn to represent each, you might do so by building a causal model in which each is generated by parts and relations, and each comes from a more abstract character that contains a shell. Perhaps, since Bowser (a known boss) also has a shell, these are descendants of Bowser (much like genetic ancestry), and just like Bowser, each is harmful to you and is therefore given the attribute of “enemy”. You learn that the presence of a shell is a good predictor for whether an encountered character is an enemy. In the future, when you encounter Koopa Paratroopa, you infer that its attributes place it in the abstract “shell character” category and successfully predict that it is another enemy to avoid.
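How quickly a handful of encounters hardens into a rule can be shown with a small sketch. The observation list, the non-shelled character, and the use of add-one (Laplace) smoothing are assumptions chosen for illustration, not a claim about how people actually compute this.

```python
# Hypothetical encounters so far: (character, has_shell, is_enemy).
observations = [
    ("Koopa Troopa", True, True),
    ("Lakitu", True, True),
    ("Hammer Bro", True, True),
    ("Toad", False, False),
]


def p_enemy(observations, has_shell):
    """Estimate P(enemy | shell status) from the encounters seen so far,
    using add-one smoothing so three examples never yield full certainty."""
    outcomes = [enemy for _, shell, enemy in observations if shell == has_shell]
    return (sum(outcomes) + 1) / (len(outcomes) + 2)


# A new shelled character, e.g. Koopa Paratroopa, has never been observed.
p_shelled = p_enemy(observations, has_shell=True)
prediction = "enemy" if p_shelled > 0.5 else "not enemy"
print(round(p_shelled, 2), prediction)  # 0.8 enemy
```

Three examples are enough to push the estimate to 0.8, which is the whole appeal of this kind of learning and, as the next section argues, also its weakness.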
Generalization as Successful Stereotyping
Clearly, the ability to use compositionality and learning-to-learn to learn causal models, the ability to do inference, and the ability to use these things to generalize is powerful. However, as objects become more complex, creating accurate models for them increases in difficulty. When we look at how this form of generalization manifests for simple objects vs. complex objects, it starts to resemble stereotyping. Indeed, the story I’ve given of how you successfully generalized previous knowledge to predict behavior and attributes can be recast as a story of how you successfully accomplished this through stereotyping. In this example, stereotyping was clearly useful, but when does it begin to lose its utility?
In the story, you saw a few examples of enemies that had shells and quickly learned that shells were good predictors for whether a character was an enemy. Afterwards, when you saw new shelled characters, you immediately stereotyped them and predicted they were enemies with no more information than a few shared attributes. And in this unrealistically simple game of Super Mario, doing this was okay. Actually, it was beneficial! Unfortunately, characters in the real world, and especially humans, are far more complex. And while we realize this to a degree, at first encounter, too often our brains treat other human beings as crudely as they treat the simple characters in video games.
The Colombian ISIS Member: An Example of Incorrect Stereotyping
Recently, I saw a video of a Canadian man threatening an immigrant that he feared was a member of the ISIS terrorist group (news story). The immigrant had a large beard, a foreign accent, and a darker complexion. This is a somewhat canonical example of social stereotyping. What I find particularly interesting about this example is that the immigrant was neither Muslim nor in any way connected to the social group with which he was being associated: he was Colombian.
Now, let’s try to understand what potentially happened. To aid in this analysis, I’ve created this handy-dandy diagram. With learning-to-learn, you re-use previously learned concepts when learning new concepts. For example, the Canadian man might have learned concepts for the “Middle-Eastern” social group, for beards, for skin-complexion, etc., which he might have re-used as he learned concepts for members of the ISIS terrorist organization. He then might have combined these concepts compositionally to represent ISIS members as a composition of traits such as being Middle-Eastern, being Muslim, etc. To do this, he might have built causal models in which
- the Muslim religion leads men to grow large beards,
- being Middle Eastern leads to having a darker complexion and having an accent when speaking English,
- and ISIS produces Muslim men of Middle-Eastern descent.
With these causal models for the world, upon seeing a darker man with a beard and an accent, the Canadian man might have inferred that
- his dark complexion and foreign accent meant he was Middle Eastern,
- his large beard meant he was Muslim,
- and the combination of these things was sufficient to infer that he was a member of ISIS.
Clearly, the world is not so simple that all (or even most) dark-complexioned Muslim men with beards are members of ISIS, but this was enough information for him to make this inference and treat it as likely. (As a reminder, my analysis is speculative and hypothetical.)
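One way to see why such an inference fails is to apply Bayes' rule with a base rate. The sketch below is deliberately abstract: the function name, the parameters, and every number are invented purely for illustration and describe no real population or group.

```python
def posterior_membership(base_rate, p_traits_if_member, p_traits_if_not):
    """P(member of a rare group | observed traits), by Bayes' rule."""
    # Overall probability of seeing the traits, members and non-members combined.
    evidence = (
        base_rate * p_traits_if_member + (1 - base_rate) * p_traits_if_not
    )
    return base_rate * p_traits_if_member / evidence


# Made-up values: the group is extremely rare, the traits are common among
# its members, and a modest share of everyone else has the same traits.
p = posterior_membership(
    base_rate=1e-6, p_traits_if_member=0.9, p_traits_if_not=0.05
)
print(f"{p:.6f}")
```

Even when the traits fit the causal model almost perfectly, the posterior stays tiny because far more non-members share the same traits. Jumping from "fits my model" to "is likely a member" amounts to ignoring the base rate.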
Part of what makes us powerful learners but also susceptible to the pitfalls of stereotyping is that we have an amazing ability to learn a lot from very little. A small number of examples or encounters can play a large role in how we model a social group. Unfortunately, treating a whole group as though it resembled the few examples that informed our model of it can be inaccurate, because a small number of examples is unlikely to be representative of an entire, potentially highly diverse group.
Following this logic, I argue that the problem of stereotyping complex social groups based on a small number of examples is made considerably worse by television. For example, television offers biased portrayals of certain ethnic groups, both in the news and in fictional shows. There has been a lot of controversy surrounding the ways in which black and white men are portrayed differently when they have committed a crime. Additionally, there has been a lot of criticism over the stereotyped roles of different social groups: the black thug, the Latina maid, the Asian nerd, the emotional woman, etc.
Our brains might mistakenly use the examples they see on television as they build their models for social groups. They then might use these television-informed models when trying to infer the social membership of people they encounter in the real world. It would be unsurprising to me if this were the case, and I imagine such a problem would be exacerbated for social groups with which people have little real-world experience. In short, people might be doing inference using models learned predominantly through fictional (or potentially biased) sources on TV.
Stereotyping is a double-edged sword we wield.
It is powerful, allowing us to generalize from small amounts of information, but it can also lead to inaccurate predictions when dealing with complex concepts. However, it is my hope that being aware of this cognitive phenomenon can help us mitigate its negative impacts.
I want to note that in this blog post, some/many of my claims about how cognitive science relates to social stereotyping have not yet been verified by research. Thus, my goal here is only to start a dialogue on social stereotyping that is rooted in cognitive science. I welcome both supporting and contesting opinions, and hope that this can turn into a fruitful discussion. Please leave thoughts, comments, and suggestions below. Thank you for reading.