Computer Vision tells us how the presidential candidates really feel

Using expression recognition to analyze debate and town hall footage

Eric Hofesmann
Voxel51
8 min read · Oct 22, 2020


The presidential and vice-presidential debates over the last few weeks have been… interesting. I think we can all agree that they have been some of the most heated debates yet. The presidential debate quickly devolved into emotional arguments that you could clearly see on the faces of the candidates.

Image by Author: Using FiftyOne to visualize distributions of emotions from the debates

The goal of facial recognition systems is to find human faces in images or videos and then classify them by identity or by attributes such as sex and emotion. Recent years have seen a massive boom of interest in facial recognition technologies, especially in places like China, and the technology has recently been used to assist with automatic contact tracing and mask detection during the pandemic [1, 2].

As we watched the debate, we thought it would be interesting to use state-of-the-art computer vision technology to quantify each candidate’s emotions. Expression recognition [4] is a field of computer vision where a machine learning model is trained to classify the emotion shown on an image of a face: given a photo of a face, the model tells us whether the expression is, say, sad or angry. The brown box labeled “sad” or “angry” in the gif above is an output from an expression recognition model [3]. It’s pretty cool stuff!

More precisely, we used the facial expression recognition system FER [3], which uses a model similar to the one from this post [4]. We then used FiftyOne [5], the new open-source Python machine learning tool we developed at Voxel51, to visualize the outputs of the model.

Emotion Distributions

The best way to analyze what emotions each candidate wore on their face is to look at the distributions of detected expressions throughout the videos. To do that, we first need to construct a dataset of images from the debate videos, process them with the expression recognition model [3], and then aggregate and visualize the results in FiftyOne [5].

Dataset construction

We used the videos and subtitles from the following recordings.

For each full debate video, we sampled roughly one of every twenty frames, resulting in a few thousand images per person. We then cropped each frame to contain only the face, which helps the expression recognition software perform better. See the example below of a full frame and the cropped face of Joe Biden.

Cropping and preprocessing performed to create the dataset of images from the debate
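That preprocessing step can be sketched as follows. The sampling and crop-clipping logic is plain Python, while the face box itself would come from a face detector such as the MTCNN detector bundled with FER [3]; the function names and the margin value here are illustrative, not the exact script we ran:

```python
def clip_box(box, frame_w, frame_h, margin=20):
    """Expand an (x, y, w, h) face box by `margin` pixels and clip it
    to the frame bounds, returning (x0, y0, x1, y1) crop coordinates."""
    x, y, w, h = box
    x0, y0 = max(0, x - margin), max(0, y - margin)
    x1, y1 = min(frame_w, x + w + margin), min(frame_h, y + h + margin)
    return x0, y0, x1, y1

def sampled_frames(video_path, every_n=20):
    """Yield (index, frame) for roughly one of every `every_n` frames."""
    import cv2  # OpenCV [6]; imported lazily so clip_box has no dependency
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            yield idx, frame
        idx += 1
    cap.release()
```

Cropping is then `frame[y0:y1, x0:x1]` for each detected face box; the small margin keeps the whole face inside the crop.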

The expression recognition model was then run on every image in the dataset, and the top emotion detected in each image was tallied. Below are the distributions of emotion for each candidate, calculated in FiftyOne across all images of them. The following code is an example of how FiftyOne [5], FER [3], and OpenCV [6] were used to generate the results:
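A sketch of the tallying step is below. FER’s `detect_emotions` returns, for each detected face, a bounding box and a score per emotion; we keep the top-scoring label of the first face in each image. The helper names are our own, and the FER/OpenCV calls are shown as we understand those libraries’ APIs:

```python
from collections import Counter

def top_emotion(face_result):
    """Given one FER face result ({'box': ..., 'emotions': {...}}),
    return the highest-scoring emotion label."""
    emotions = face_result["emotions"]
    return max(emotions, key=emotions.get)

def tally_emotions(results):
    """Tally the top emotion of the first face found in each image."""
    counts = Counter()
    for faces in results:  # one entry per image
        if faces:          # skip images where no face was detected
            counts[top_emotion(faces[0])] += 1
    return counts

def classify_images(image_paths):
    """Run FER [3] over a list of cropped face images."""
    import cv2                  # OpenCV [6]
    from fer import FER         # pip install fer
    detector = FER(mtcnn=True)  # MTCNN face detector + emotion CNN
    return [detector.detect_emotions(cv2.imread(p)) for p in image_paths]
```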

Alternatively, you can download a zip of images from the debates from the following Google Drive link: https://drive.google.com/file/d/1Eg3eryS2OPJNjrLsWTU6hfU8ghWMFjsZ/view?usp=sharing

Then install FiftyOne and load the FiftyOne dataset with the following code:
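A minimal loading script might look like the sketch below, assuming the zip unpacks to a flat directory of images named `<candidate>_<frame>.jpg`; that naming convention and the `candidate` field are illustrative assumptions, and `fiftyone` is installed with `pip install fiftyone`:

```python
import os

def candidate_from_filename(path):
    """Parse the candidate name from a '<candidate>_<frame>.jpg' filename
    (a hypothetical naming convention for the unzipped images)."""
    return os.path.basename(path).split("_")[0]

def load_debate_dataset(images_dir="debate_faces/"):
    """Load the unzipped images into a FiftyOne dataset and tag each
    sample with the candidate it shows."""
    import fiftyone as fo  # pip install fiftyone
    dataset = fo.Dataset.from_images_dir(images_dir)
    for sample in dataset:
        sample["candidate"] = fo.Classification(
            label=candidate_from_filename(sample.filepath))
        sample.save()
    return dataset

# dataset = load_debate_dataset()
# session = fo.launch_app(dataset)  # browse the images in the FiftyOne App
```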

See this example for more details: https://github.com/voxel51/fiftyone-examples/blob/master/examples/emotion_recognition_presidential_debate.ipynb

Emotion distributions of candidates left to right: Biden, Trump, Harris, Pence

We also performed the same experiment on footage from the competing town halls on October 15th, the date when the second debate was supposed to take place. Only Biden and Trump hosted town halls and the two distributions are below.

Emotion distributions from the Biden (Left) and Trump (Right) town halls from October 15th, 2020.

Results and Biases

Examining the results from the distribution of emotions among all parties, we begin to see some trends.

Overall, the most common emotions in the debates were “sad” and “angry”, whereas the least common were “surprise” and “disgust”.

A possible reason for this is that the primary discussion points involve serious issues facing society today. People talking about topics like the pandemic and protests tend to look empathetic and stern, resulting in more “sad” and “angry” classifications.

More specifically, Biden’s most frequent emotion was recognized as “sad”, whereas both Trump and Pence showed similar amounts of “sad” and “angry”. One possible reason the model predicted Pence as “angry” more often is that he has naturally strong eyebrows, as you can see below.

Harris was the only candidate to have “happy” as one of her leading emotions. This is apparent in the video: she would regularly default to a smile while the other person was talking, something the other candidates rarely did.

Images from the competing October 15th town halls

The footage and statistics from the town halls match what was seen during the debates, in that the primary expression from Biden was “sad” and from Trump was “sad” and “angry”. However, other expressions like “neutral” and “happy” appeared significantly more often when the two candidates were not debating one another directly.

Why was Harris so much happier?

Multiple studies [7,8,9,10] have shown that there is often a “catch-22” for women in leadership positions where they are expected to be both determined and tough (traits often expected from good leaders) while also being caring and nurturing (societal gender norms for women). The result is that women like Senator Harris may be more likely to show happier, more comforting expressions than men in the exact same role.

To analyze this, we loaded Labeled Faces in the Wild [11], a dataset of roughly 13,000 images of faces taken “in the wild” under a variety of conditions. Running this dataset through the expression recognition model and loading the results into FiftyOne, we were able to look for correlations between different groups of people and their emotions in aggregate.

Examples of male and female faces from Labeled Faces in the Wild

By sorting between men and women, we see a similar trend in the distribution of emotions: on average, women show the “happy” emotion more often than men. There could be a few different reasons for this. The dataset could include more images of female celebrities than a diverse selection of male faces, and celebrities at public events are more likely to be photographed smiling than the average candid person. Or the model may have been trained on a dataset in which women smiled more often than men, biasing it toward classifying women as “happy”.
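The per-group comparison itself is just a conditional tally followed by normalization. A pure-Python sketch, where the group labels (e.g. “male”/“female”) come from whatever attribute annotations you attach to each image:

```python
from collections import Counter, defaultdict

def emotion_distributions_by_group(samples):
    """samples: iterable of (group, emotion) pairs, e.g. ('female', 'happy').
    Returns each group's emotion counts normalized to fractions."""
    counts = defaultdict(Counter)
    for group, emotion in samples:
        counts[group][emotion] += 1
    return {
        group: {emo: n / sum(c.values()) for emo, n in c.items()}
        for group, c in counts.items()
    }
```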

Left: Distribution of emotions for females, Right: Distribution of emotions for males

Searching Clips

How can we use this information? We created another dataset of video clips, with subtitles downloaded from the YouTube recordings of the debates. Loading these clips into FiftyOne and adding the emotion predictions from our model lets us search the debate both by what is being said and by how it is being said. For example, we can query the dataset for clips where Trump is discussing a given topic (for example “taxes”) with any of the 6 expressions (for example “happy”).
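With subtitles stored as a per-clip field, such a query is only a couple of match stages. Below is a sketch with both a pure-Python predicate and its FiftyOne view equivalent; the field names (`subtitle`, `emotion`, `speaker`) are our illustrative schema, not something FiftyOne prescribes:

```python
def clip_matches(clip, topic, emotion, speaker=None):
    """Pure-Python version of the query. `clip` is a dict like
    {'subtitle': ..., 'emotion': ..., 'speaker': ...} (illustrative schema)."""
    return (topic.lower() in clip["subtitle"].lower()
            and clip["emotion"] == emotion
            and (speaker is None or clip["speaker"] == speaker))

def find_clips(dataset, topic, emotion, speaker=None):
    """The same query expressed as a FiftyOne view over a loaded clip dataset."""
    from fiftyone import ViewField as F  # pip install fiftyone
    view = dataset.match(F("subtitle").contains_str(topic))
    view = view.match(F("emotion.label") == emotion)
    if speaker is not None:
        view = view.match(F("speaker") == speaker)
    return view

# e.g. clips of Trump discussing taxes while classified "happy":
# session = fo.launch_app(find_clips(dataset, "taxes", "happy", "trump"))
```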

There have been claims that Harris smiled while talking about the death of Kayla Mueller [12]. We can quickly find the relevant clips by searching for mentions of Mueller. Here we see that Harris was not smiling; rather, the sides of her mouth rose when she said the word “case”, which looks like a smile if you only see a single frame. Looking at the distribution of emotions from this portion of the debate, the occurrences of “happy” from Harris are much lower than in the rest of the debate, indicating a change in emotion during this tragic topic.

Example footage and emotion distribution during the portion of the debate related to Kayla Mueller.

Summary

Emotions were running hot during the presidential and vice-presidential debates with a few small discrepancies when compared to the town halls a few weeks later. We were able to use AI to detect the facial expressions from the debate and town hall footage to analyze the emotions of each candidate. “Sad” was the most prominent emotion throughout the debates and the primary emotion of Biden. Trump and Pence also had a fair amount of “angry” and Harris showed comparatively high levels of “happy”.

References

[1] Y. Huang, “How Digital Contact Tracing Slowed Covid-19 in East Asia”, Harvard Business Review (2020)

[2] BriefCam, “BriefCam announces video analytics innovation for contact tracing, physical distancing, occupancy management and face mask detection”, Security InfoWatch (2020)

[3] Facial Expression Recognition, https://github.com/justinshenk/fer

[4] J. Paul, “How I developed a C.N.N. that recognizes emotions and broke into the Kaggle top 10”, Medium (2018)

[5] Voxel51, FiftyOne (2020)

[6] G. Bradski, “The OpenCV Library.”, Dr. Dobb’s Journal of Software Tools. (2000)

[7] M. Heilman, et al., “Why are women penalized for success at male tasks?: The implied communality deficit”, Journal of Applied Psychology 92.1 (2007): 81

[8] R. Bongiorno, et al., “If you’re going to be a leader, at least act like it! Prejudice towards women who are tentative in leader roles”, British Journal of Social Psychology 53.2 (2014): 217–234

[9] W. Zheng, et al., “How Women Manage the Gendered Norms of Leadership”, Harvard Business Review (2018)

[10] J. McGregor, “Women leaders and the Goldilocks syndrome: Not too harsh, not too soft”, The Washington Post (2013)

[11] G. Huang, et al. Labeled faces in the wild: A database for studying face recognition in unconstrained environments (2007)

[12] Fact check: Kamala Harris was not smiling when discussing the death of Kayla Mueller, Reuters (2020)


Machine learning engineer at Voxel51, Masters in Computer Science from the University of Michigan. https://www.linkedin.com/in/eric-hofesmann/