YouTube’s A.I. was not neutral in the US presidential election

YouTube’s Artificial Intelligence (A.I.) recommends tens of billions of videos every single day, yielding billions of views. On the eve of the US Presidential Election, we gathered recommendation data on the two main candidates, and found that more than 80% of recommended videos were favorable to Trump, whether the initial query was “Trump” or “Clinton”. A large proportion of these recommendations were fake news. 
We propose two transparency metrics to elucidate the impact of A.I. on the propagation of political opinions and fake news.

Following the recommendations

To measure which candidate was recommended the most by YouTube’s A.I. during the US presidential election, we used the following methodology: we wrote a program that searched for “Trump” and “Clinton”, saved the first N search results, and followed the top N recommendations, N times. For N=5, this yielded a total of 842 unique videos when searching for “Trump” and 796 unique videos when searching for “Clinton.” The program was run on the eve of the presidential election. The code is available here.
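As a rough illustration only, and not the authors’ program, the crawl described above can be sketched as a breadth-first walk over the recommendation graph. Everything in this sketch is an assumption: the names `crawl_recommendations`, `search`, and `recommend` are hypothetical, the two callables stand in for whatever mechanism actually fetches search results and recommendations, and “N times” is read as N rounds of following the top N recommendations of every newly found video.

```python
from typing import Callable, Dict, List, Set


def crawl_recommendations(
    query: str,
    search: Callable[[str], List[str]],
    recommend: Callable[[str], List[str]],
    n: int = 5,
) -> Set[str]:
    """Return the unique video ids found by taking the first n search
    results, then following the top n recommendations for n rounds."""
    frontier = search(query)[:n]      # first n search results
    seen = set(frontier)
    for _ in range(n):                # n rounds of following recommendations
        next_frontier = []
        for video in frontier:
            for rec in recommend(video)[:n]:   # top n recommendations
                if rec not in seen:            # count each video once
                    seen.add(rec)
                    next_frontier.append(rec)
        frontier = next_frontier
    return seen


# Toy stand-ins for the real site (hypothetical data, not real videos).
TOY_SEARCH: Dict[str, List[str]] = {"trump": ["a", "b"]}
TOY_RECS: Dict[str, List[str]] = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": ["e"],
    "e": [],
}

videos = crawl_recommendations(
    "trump",
    search=lambda q: TOY_SEARCH.get(q, []),
    recommend=lambda v: TOY_RECS.get(v, []),
    n=2,
)
print(sorted(videos))  # ['a', 'b', 'c', 'd', 'e']
```

Passing the fetching logic in as callables keeps the traversal separate from any scraping details, so the same walk can be tested on a small hand-made graph like the one above.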

The diagrams below provide a view of videos recommended by YouTube’s A.I. in response to searches about “Trump,” and, respectively, “Clinton,” just before the election.

“Trump” recommendations: interactive version here

Labels have been manually generated and are not based on the YouTube classification scheme. Check the interactive version for further details on each video.

As expected, most recommendations from a “Trump” search led to pro-Trump videos, with few pro-Clinton videos. This is consistent with the classic “filter bubble” effect.

“Clinton” recommendations: interactive version here

Labels have been manually generated and are not based on the YouTube classification scheme. Check the interactive version for further details on each video.

Surprisingly, a “Clinton” search on the eve of the election led mostly to anti-Clinton videos. The pro-Clinton videos were viewed many times and had high ratings, but represented less than 20% of all recommended videos.
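Percentages like these come from counting the manual labels over the set of recommended videos. A minimal sketch of that arithmetic follows, assuming each video carries one label. The function name `label_shares`, the label strings, and the toy counts are all hypothetical and are not the data from this study.

```python
from collections import Counter
from typing import Dict, Iterable


def label_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of videos carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}                     # no videos, no shares to report
    return {label: count / total for label, count in counts.items()}


# Made-up labels for ten videos (illustrative only, not the study's data).
toy_labels = ["pro-Trump"] * 8 + ["pro-Clinton"] + ["neutral"]
shares = label_shares(toy_labels)
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```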

Why YouTube’s A.I. is “neutral” towards clicks, but partisan towards candidates

YouTube’s A.I. is optimized to maximize time spent online and clicks; the combination of those is called engagement. Hence, recommendations are aligned with engagement.

If the belief that “the earth is flat” makes users spend more time on YouTube than “the earth is round”, the recommendation A.I. will be more likely to suggest videos that advocate the former theory than the latter.

Searching for “is the earth flat or round?” and following recommendations five times, we find that more than 90% of recommended videos state that the earth is flat.

This “flat earth” example illustrates that recommendations can be aligned with any theory that generates engagement, regardless of facts or popular belief.

Alignment towards hatred

Maximizing engagement can create alignment towards values and behaviors that are particularly engaging for a small group of people at the expense of others, such as racism, homophobia, sexism, xenophobia, bullying, religious hatred, violence, or conspiracies. Tay.ai, the Twitter chatbot from Microsoft, showed support for all of these within 24 hours.

Tay learnt from online interactions that this tweet was likely to generate engagement. However, the algorithm was unable to assess the full consequences of such a statement. We believe that YouTube’s A.I. has similar issues.

Both A.I.s (Tay.ai and YouTube’s A.I.) receive their inputs from a population of human users, and are optimized to maximize engagement. One difference between the two is that Tay.ai’s (mis)alignment became public and obvious after a few tweets, and the A.I. was promptly discontinued due to ethical and moral concerns following its endorsement of extremist values. In contrast, understanding the alignment of YouTube’s A.I. requires the ability to “see the forest for the trees” by scraping the recommendation network and watching hundreds of recommended videos. This lack of transparency in video recommendations makes deviations from ethical and moral norms more difficult to detect.

We need transparency to make YouTube’s alignment easier to observe in real time.

YouTube Transparency

We propose that A.I. recommendation transparency can be increased by answering two questions:

  1. What does YouTube recommend on average?
    Giving users the option to see a random recommendation made on a given day, as can be done on Twitter and other sites, would enable the public to assess the general alignment of YouTube’s A.I.
  2. Does the A.I. favor specific videos?
    To answer this question, three variables are needed: the number of views, the number of views resulting from A.I. recommendations, and the total number of recommendations. (The number of views on a video is already public.)

Using these three variables we can compute:

a) A.I. boost = (views from recommendations) / views 
This metric shows the influence of the A.I. on the views of a given video.

b) A.I. efficiency = (views from recommendations) / recommendations 
This metric shows how successful the A.I. is at recommending a given video.
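The two ratios above can be computed as in the short sketch below. The `VideoStats` container, the function names, and the example numbers are hypothetical: only the total view count is public, so the other two inputs are invented purely to show the arithmetic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoStats:
    views: int                        # total views (already public)
    views_from_recommendations: int   # views that came from A.I. recommendations
    recommendations: int              # times the A.I. recommended the video


def ai_boost(stats: VideoStats) -> Optional[float]:
    """A.I. boost = (views from recommendations) / views."""
    if stats.views == 0:
        return None                   # undefined for a video with no views
    return stats.views_from_recommendations / stats.views


def ai_efficiency(stats: VideoStats) -> Optional[float]:
    """A.I. efficiency = (views from recommendations) / recommendations."""
    if stats.recommendations == 0:
        return None                   # undefined if never recommended
    return stats.views_from_recommendations / stats.recommendations


# Made-up numbers for one video (illustrative only).
example = VideoStats(views=1000, views_from_recommendations=700,
                     recommendations=14000)
print(f"A.I. boost: {ai_boost(example):.2f}")            # A.I. boost: 0.70
print(f"A.I. efficiency: {ai_efficiency(example):.2f}")  # A.I. efficiency: 0.05
```

In this invented case, 70% of the video’s views would have come from recommendations, and one recommendation in twenty would have led to a view.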

The combination of these metrics can help understand A.I. alignment, and its role in the recommendations of fake news. For instance:

Is fake news on YouTube spread primarily by humans or by the A.I.?

Several hundred videos recommended by the A.I. on the eve of the election conveyed fake news or misleading information. For instance:

Was this fake news viewed so many times because it was shared by humans on Facebook, Twitter, etc., or because it was pushed by the recommendation A.I.? Knowing the “A.I. boost” for these videos would answer this question.

To see the full list: 
The list of the 796 recommended videos from the “Clinton” search is here;
The list of the 842 recommended videos from the “Trump” search is here.

Conclusion

We have examined the network of YouTube A.I. recommendations and seen that YouTube’s A.I. mostly recommended pro-Trump videos before the election. We also found that recommendations favored the flat earth theory. Does YouTube’s A.I. always favor specific candidates or values? Are those candidates or values similar to the ones Tay.ai favored? Was fake news on YouTube spread primarily by the A.I. or by humans? We propose transparency metrics to answer these questions.

Guillaume Chaslot, ex-Microsoft and Google engineer, CEO at IntuitiveAI
Andreea Gorbatai, Assistant Professor, UC Berkeley