This post summarizes our research paper Linguistic Signals under Misinformation and Fact-Checking: Evidence from User Comments on Social Media. The paper will be presented at the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2018).
Misinformation, or more recently “fake news”, is a social ill that can spread wildly on social media platforms. The “success” of such epidemics is thanks to both the pathogen and the host. On one hand, misinformation often uses inflammatory and sensational language to manipulate people’s emotions and stoke the fires of partisanship. On the other hand, psychological and sociological studies tell us that people are fundamentally vulnerable to such misinformation. For example, the theory of “naive realism” suggests that people tend to believe that their perceptions of reality are accurate, and that views that disagree with their perceptions are uninformed, irrational, and biased; the theory of “confirmation bias” suggests that people prefer to accept information that confirms their existing beliefs.
Fact-checking, on the other hand, tries to use evidence to rebut misinformation. However, despite its noble intentions, people react to fact-checking differently. Some studies have found that fact-checking has corrective effects on people’s beliefs, while other studies have found that it has minimal impact and sometimes even “backfires” on its audience. In fact, the work of Snopes and PolitiFact has itself become politicized by those who view their work as biased, and this has led to attempts to discredit fact-check articles.
In our recent work, we looked at how people react to misinformation and fact-checking on social media. We collected over five thousand social media posts with over two million user comments from Facebook, Twitter, and YouTube, and associated these posts with fact-check articles from Snopes and PolitiFact for veracity scoring (i.e., from true to false). Then, we used natural language processing techniques to discover different word usage patterns in user comments on true and untrue posts, as well as before and after the associated fact-checking article was published.
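To make the idea of comparing word usage patterns concrete, here is a minimal sketch, not the pipeline used in the paper. It assumes a tiny hand-made lexicon (the category names and the `category_rates` helper are hypothetical, with words borrowed from the examples quoted in this post) and a few made-up comments, and it computes what share of tokens falls into each category for comments on true versus false posts.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: category -> words. The words come from examples
# quoted in this post; the category names are illustrative, not from ComLex.
LEXICON = {
    "swear": {"damn", "moron"},
    "monetary": {"money", "tax", "dollar"},
    "policy": {"bill", "budget", "policy"},
}

TOKEN_RE = re.compile(r"[a-z']+")


def category_rates(comments):
    """Return, for each category, the share of all tokens that belong to it."""
    counts = Counter()
    total = 0
    for comment in comments:
        for token in TOKEN_RE.findall(comment.lower()):
            total += 1
            for category, words in LEXICON.items():
                if token in words:
                    counts[category] += 1
    if total == 0:
        return {category: 0.0 for category in LEXICON}
    return {category: counts[category] / total for category in LEXICON}


# Made-up comments purely for illustration; not drawn from the dataset.
comments_on_true = ["The budget bill raises my tax", "Where does the money go"]
comments_on_false = ["What a moron", "Damn this is fake"]

rates_true = category_rates(comments_on_true)
rates_false = category_rates(comments_on_false)
for category in LEXICON:
    print(f"{category:>8}: true={rates_true[category]:.3f} "
          f"false={rates_false[category]:.3f}")
```

A real analysis would need a far larger lexicon, proper tokenization (including emojis), and a statistical test before reading anything into the difference between the two groups.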
People Get Touchy about Misinformation…
We found that people are more likely to get “touchy” when commenting on posts containing misinformation, as compared to truthful posts. Specifically:
- More swear words are used. We observe people posting comments containing a variety of swear words when the associated post contains misinformation, ranging from casual swears (“damn”) and abbreviated forms (“fu”), to belittling words (“moron”), to hate speech against minorities. This suggests that people are more easily angered when reading about and discussing stories containing misinformation.
- More emojis are used. People tend to use a wide variety of emojis when commenting on misinformation, such as anger (“😡”, “👎”, “💩”), laughter (“😂”, “🤣”, “😆”), sadness (“😭”, “😢”, “💔”), and hand gestures (“🙏”, “👏”, “👍”). Given the popularity of emojis, we view them as important proxies for people’s actual emotional state.
- Less likely to discuss concrete topics. Under truthful posts, people are more likely to use words in their comments that touch on concrete topics, issues, and policies. This includes economic plans (“bill”, “budget”, “policy”), monetary issues (“money”, “tax”, “dollar”), social issues (“people”, “public”, “worker”), government (“law”, “system”), and healthcare (“health”, “insurance”). However, these topics are less likely to emerge in comments on misinformation. This suggests that misinformation discourages reasoned, concrete conversation.
- Less objective and more subjective. We find that people are more likely to use superlatives (“dumbest thing I’ve seen today”) and less likely to use comparatives (“better”, “bigger”) when commenting on misinformation. The relationship between subjectivity and objectivity has long been studied within the context of people’s emotions in sociology.
These results confirm many of our intuitions about how people respond to truthful and untruthful information online. Social media posts written in good faith and based on facts do appear to engender commensurate comments that are, on average, concrete and reasoned. In contrast, misinformation brings out the worst in people, inflaming emotions and degenerating into shouting matches.
… and about the Truth Too
We also found that people are somewhat more likely to get “touchy” after misinformation is fact-checked. This manifests as an increased use of swear words in comments. The above figure shows three examples where people referred to fact-checking websites and used swear words to express their dissatisfaction. These “backfire” comments sometimes express doubt about the fact-checkers themselves because the commenters perceive them to be biased and unreliable sources.
In this work, we develop a novel lexicon, trained on fact-checked social media data, to understand the “linguistic signals” generated by social media users when confronted with misinformation. We have open-sourced our datasets and lexicon, named ComLex, as a resource for the research community.
One of the interesting aspects of our study is that we show that linguistic signals have predictive power with respect to identifying misinformation. We hope that this technique can help lead to better automated systems for identifying misinformation, which could then trigger human moderation, design interventions that present alternative sources of information, or even help direct the attention of fact-checkers towards emerging misinformation narratives.
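As a rough illustration of what “predictive power” could look like in practice, the sketch below fits a nearest-centroid classifier on two per-post features. Everything here is an assumption made for illustration: the features (swear-word rate and concrete-topic rate in a post's comments), the numbers, and the choice of classifier are stand-ins, not the model or data from the paper.

```python
import math

# Hypothetical per-post features: (swear-word rate, concrete-topic rate),
# computed from the comments under a post. All values are made up.
TRAIN = [
    ((0.08, 0.01), 1),  # 1 = post rated false by fact-checkers
    ((0.06, 0.02), 1),
    ((0.01, 0.07), 0),  # 0 = post rated true
    ((0.02, 0.09), 0),
]


def centroid(points):
    """Average a list of equal-length feature tuples, dimension by dimension."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fit(train):
    """Compute one centroid per label from (features, label) pairs."""
    by_label = {}
    for features, label in train:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(points) for label, points in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = fit(TRAIN)
print(predict(model, (0.09, 0.00)))  # heated, topic-free comment thread
print(predict(model, (0.00, 0.10)))  # calm, policy-focused comment thread
```

A score like this would only be a first-pass signal; as noted above, its natural use is to route posts to human moderators or fact-checkers rather than to label them automatically.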
The causal reasons underlying our observations about the linguistic behavior of social media users remain unclear. Is the audience for misinformation self-selected to include hyper-partisans? Or do algorithmic curation systems on the social media platforms promote misinformation to a susceptible audience, perhaps mistaking the ensuing heated discussions for positive engagement? Alternatively, is it the construction of misinformation itself, through the use of appeals to emotion and moral superiority, that provokes strong reactions from the audience? More work is needed to disentangle why people are susceptible to misinformation, or choose to engage with it in the first place.
Contact firstname.lastname@example.org with questions or comments on the study.
Citation: Shan Jiang and Christo Wilson. 2018. Linguistic Signals under Misinformation and Fact-Checking: Evidence from User Comments on Social Media. Proceedings of the ACM on Human-Computer Interaction, CSCW, Article 82 (November 2018), 23 pages. https://doi.org/10.1145/3274351