Tay: The Racist Twitter Chatbot

Ali Baker
SI 410: Ethics and Information Technology
Feb 11, 2021

What if I told you that your tweets are currently training AIs all around the world? With so much data on social media platforms about how we humans act and behave, many big tech companies, such as Microsoft, have started projects that use that data to train their AI. This raises the question: is the data on these platforms representative of our society? Microsoft's launch of a Twitter chatbot serves as a great example of why we shouldn't rely on social media platforms to train our artificial intelligence systems.

In 2016, Microsoft launched a Twitter chatbot named "Tay". Microsoft's goal was to experiment with "conversational understanding" by analyzing tweets from users who interacted with the chatbot. This would allow Tay to build a personality based on the content people sent her. In an ideal world, this seems like a compelling way to explore the limits and potential of artificial intelligence. Sadly, the experiment quickly soured due to the nature of the internet.

Less than 5 hours after her deployment, Tay had turned into a racist and misogynistic chatbot. Why did this happen? Trolls across the internet began continuously and rapidly tweeting racist and abhorrent statements at Tay, and, by design, she learned from that content and became a racist chatbot.
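To see why this failure mode is almost mechanical, here is a deliberately simplified Python sketch of a chatbot that "learns" by storing whatever users send it, with no filtering. This is purely illustrative and is not Microsoft's actual system; the class and messages are hypothetical. It only shows how a learner with no content safeguards ends up reflecting its loudest inputs.

```python
import random

class NaiveChatbot:
    """Toy chatbot that 'learns' by storing every user message verbatim
    and replaying stored phrases later. There is no moderation step."""

    def __init__(self):
        self.learned_phrases = []

    def observe(self, user_message: str) -> None:
        # Every incoming message becomes potential future output,
        # regardless of its content.
        self.learned_phrases.append(user_message)

    def reply(self) -> str:
        if not self.learned_phrases:
            return "hello! teach me something"
        # The bot's "personality" is just a sample of whatever it was fed.
        return random.choice(self.learned_phrases)

bot = NaiveChatbot()
# If a coordinated group floods the bot with abusive messages...
for troll_message in ["offensive statement 1", "offensive statement 2"]:
    bot.observe(troll_message)

# ...then abusive content dominates what the bot says back.
print(bot.reply())
```

A real system is far more sophisticated than this sketch, but the underlying dynamic is the same: whoever supplies the most training input gets to shape the output.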

[Image: an example of Tay's racist tweets]

This highlights a major issue, not with the chatbot herself, but with the method by which the data used to train the AI is collected. Some major tech publications, such as The Verge, have asked, "how are we going to teach AI using public data without incorporating the worst traits of humanity?" The question they should be asking is: should we really collect data from social media platforms to train AI at all? There is no doubt that our activity on social media reflects some of our traits and characteristics, but Twitter, with its attention-driven business model, is not representative of all of society. As Boyd and Crawford point out in their paper on Big Data, "Twitter does not represent 'all people', and it is an error to assume 'people' and 'Twitter users' are synonymous: they are a very particular sub-set."

Claiming that this chatbot "mirrors" our society because it learns from our behavior on Twitter oversimplifies the problem. When collecting data to train our AI and models, we must be aware of what we are feeding into our systems. The Tay chatbot exemplifies this idea. Just because there is a lot of data on social media platforms available to use doesn't mean it is ethical or appropriate to use it in projects like these.

  • Boyd, D., & Crawford, K. (2012). "Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon." Information, Communication & Society, 15(5), 662–679.
  • Vincent, J. (2016, March 24). "Twitter Taught Microsoft's AI Chatbot to Be a Racist Asshole in Less than a Day." The Verge. www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist
