YouTube’s A.I. was divisive in the US presidential election
YouTube’s Artificial Intelligence (A.I.) recommends tens of billions of videos every single day, yielding billions of views. On the eve of the US Presidential Election, we gathered recommendation data on the two main candidates, and found that more than 80% of recommended videos were favorable to Trump, whether the initial query was “Trump” or “Clinton”. A large proportion of these recommendations were divisive or contained fake news.
We propose two transparency metrics to elucidate the impact of A.I. on the propagation of political opinions and fake news.
Following the recommendations
To measure which candidate was recommended the most by YouTube’s A.I. during the US presidential election, we used the following methodology: We wrote a program that searched for Trump and Clinton, saved the N first search results, and followed the N top recommendations, N times. For N=5, this yielded a total of 842 unique videos when searching for “Trump” and 796 unique videos when searching for “Clinton.” It was run on the eve of the presidential election. The code is available here.
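The crawl described above can be sketched as a breadth-first walk over the recommendation graph. This is a minimal illustration, not the original program: `toy_search` and `toy_recommend` are hypothetical stand-ins for the real YouTube calls, which would need actual scraping or API access.

```python
def crawl(seed_query, search, recommend, n=5, depth=5):
    """Save the n first search results, then follow the n top
    recommendations from each newly seen video, depth times."""
    frontier = list(search(seed_query)[:n])  # N first search results
    seen = set(frontier)
    for _ in range(depth):                   # follow recommendations N times
        next_frontier = []
        for video in frontier:
            for rec in recommend(video)[:n]:  # N top recommendations
                if rec not in seen:
                    seen.add(rec)
                    next_frontier.append(rec)
        frontier = next_frontier
    return seen  # all unique videos reached

# Toy stand-ins, just to show the shape of the crawl (hypothetical):
def toy_search(query):
    return [0, 1]

def toy_recommend(video_id):
    return [video_id * 10 + 1, video_id * 10 + 2]

unique_videos = crawl("Trump", toy_search, toy_recommend, n=2, depth=2)
```

The count of unique videos grows roughly geometrically with depth, which is why N=5 already reaches several hundred videos per query.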
The diagrams below provide a view of videos recommended by YouTube’s A.I. in response to searches about “Trump,” and, respectively, “Clinton,” just before the election.
“Trump” recommendations: interactive version here
Surprisingly, a “Clinton” search on the eve of the election led to mostly anti-Clinton videos. The pro-Clinton videos were viewed many times and had high ratings, but represented less than 20% of all recommended videos.
Why YouTube’s A.I. is “neutral” towards clicks, but partisan towards candidates
YouTube’s A.I. is optimized to maximize time spent online and clicks; the combination of those is called engagement. Hence, recommendations are aligned with engagement.
If the belief that “the earth is flat” makes users spend more time on YouTube than “the earth is round”, the recommendation A.I. will be more likely to suggest videos that advocate the former theory than the latter.
Searching for “is the earth flat or round?” and following recommendations five times, we find that more than 90% of recommended videos state that the earth is flat.
This “flat earth” example illustrates that recommendations can be aligned with any theory that generates engagement, regardless of facts or popular belief.
Alignment towards divisiveness
Maximizing engagement can create alignment towards values and behaviors that are particularly engaging for a small group of people at the cost of others, such as racism, homophobia, sexism, xenophobia, bullying, religious hatred, violence, or conspiracies. Tay.ai, the Twitter chatbot from Microsoft, showed support for all of these, within 24 hours.
Both A.I.s (Tay.ai and YouTube’s A.I.) receive their inputs from a population of human users, and are optimized to maximize engagement. One difference between the two is that Tay.ai’s (mis)alignment became public and obvious after a few tweets, and the A.I. was promptly discontinued due to ethical and moral concerns following its endorsement of extremist values. In contrast, understanding YouTube’s A.I. alignment requires the ability to “see the forest for the trees” by scraping the recommendation network and watching hundreds of recommended videos. This lack of transparency in video recommendations makes deviations from ethical and moral norms more difficult to detect.
We need transparency to make YouTube’s alignment easier to observe in real-time.
We propose that A.I. recommendation transparency can be increased by answering two questions:
- What does YouTube recommend on average?
Giving users the option to see a random recommendation made on a given day, as can be done on Twitter and other sites, would enable the public to assess YouTube’s A.I. general alignment.
- Does the A.I. favor specific videos?
To answer this question, three variables are needed: the number of views, the number of views resulting from A.I. recommendations, and the total number of recommendations. (The number of views on a video is already public.)
Using these three variables we can compute:
a) A.I. boost = (views from recommendations) / views
This metric shows the influence of the A.I. in the views of a given video.
b) A.I. efficiency = (views from recommendations) / recommendations
This metric shows how successful the A.I. is at recommending a given video.
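The two metrics above reduce to simple ratios. A minimal sketch, with hypothetical example numbers chosen purely for illustration:

```python
def ai_boost(views_from_recommendations, total_views):
    """Share of a video's views that came from A.I. recommendations."""
    return views_from_recommendations / total_views

def ai_efficiency(views_from_recommendations, total_recommendations):
    """How often a recommendation of this video turned into a view."""
    return views_from_recommendations / total_recommendations

# Hypothetical figures for a single video (not real data):
boost = ai_boost(800_000, 1_000_000)            # 0.8: most views are A.I.-driven
efficiency = ai_efficiency(800_000, 4_000_000)  # 0.2: 1 in 5 recommendations clicked
```

A high boost with a high efficiency would indicate a video the A.I. both pushes heavily and pushes successfully; a high boost with low efficiency would suggest the A.I. is recommending it far more often than users actually want it.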
The combination of these metrics can help understand A.I. alignment, and its role in the recommendations of fake news. For instance:
Is fake news on YouTube spread primarily by humans or by the A.I.?
Several hundred videos recommended by the A.I. on the eve of the election conveyed fake news or misleading information. For instance:
- BREAKING: VIDEO SHOWING BILL CLINTON RAPING 13 YR-OLD WILL PLUNGE RACE INTO CHAOS ANONYMOUS CLAIMS (2,274,526 views)
- YOKO ONO: “I HAD AN AFFAIR WITH HILLARY CLINTON IN THE ’70S” (1,457,089 views)
- “BREAKING: Michael Moore Admits Trump Is Right” (1,238,606 views)
- “PROOF NWO & ILLUMINATI ARE REAL & HISTORY IS A LIE” (1,227,853 views)
- “FBI Exposes Clinton Pedophile Satanic Network” (1,221,071 views)
- “ALERT SHOCKING! ANONYMOUS TO REVEAL BILL CLINTON PEDO VIDEO HILLARY FOR PRISON!” (1,118,067 views)
Were these fake-news videos viewed so many times because they were shared by humans on Facebook, Twitter, etc., or because they were pushed by the recommendation A.I.? Knowing the “A.I. boost” for these videos would answer this question.
We have examined the network of YouTube A.I. recommendations and seen that YouTube’s A.I. mostly recommended Trump before the election. We also found that recommendations favored the flat earth theory. Is YouTube’s A.I. always favoring specific candidates or values? Are those candidates or values similar to the ones Tay.ai favored? Were YouTube fake news shared primarily by the A.I. or by humans? We propose transparency metrics to answer these questions.
Guillaume Chaslot, ex-Microsoft and Google engineer, CEO at IntuitiveAI
Andreea Gorbatai, Assistant Professor, UC Berkeley
Edit: Analyzing the hundreds of recommendations led to observing numerous types of bias: asymmetry of recommendations towards candidates, fake news, anti-media resentment, etc. The most prevalent bias seemed to me to be that recommendations were divisive. The asymmetry towards one candidate might be explained by the fact that he was more divisive than his opponent.