Bias Tracker: Understanding sentiment in the run-up to the Italian elections
At last year’s TechCamp Reconnect event in Warsaw, editor Michelangelo ‘Ugo’ Barbara asked me an interesting question: How might we use machine learning to identify whether articles are true or not? My response was that while automating the detection of truth is still quite far away, it’s fairly straightforward these days to identify sentiment, and that once we know the sentiment of an article, we can use it for a number of different things.
That idea led Ugo and me to start a prototype project to test our ideas further. We were soon joined by the outstanding developer Edouard Richard and Fabio Giglietto of the University of Urbino to create a tool we call the Bias Tracker, which has gone live today.
The Bias Tracker automatically detects sentiment related to an entity (a person, place or thing) and then tracks it over time. Using small bar charts called “sparklines,” we show how sentiment develops in articles that mention that entity. Users can click on the charts to view the individual Facebook posts along with their sentiment scores.
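To make the sparkline idea concrete, here is a minimal sketch of how per-entity sentiment could be averaged by day and drawn as a row of small bars. It is an illustration only, not the Bias Tracker's own code: the sample posts, the entity names, the single "net sentiment" value per post and the helper names `daily_series` and `sparkline` are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical sample data: (entity, publication date, net sentiment in [-1, 1]).
posts = [
    ("Entity A", date(2018, 2, 1), 0.6),
    ("Entity A", date(2018, 2, 1), 0.2),
    ("Entity A", date(2018, 2, 2), -0.5),
    ("Entity A", date(2018, 2, 3), 0.0),
    ("Entity B", date(2018, 2, 1), -0.8),
]

BARS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest


def daily_series(posts, entity):
    """Average the sentiment of all posts mentioning `entity`, one value per day."""
    by_day = defaultdict(list)
    for name, day, score in posts:
        if name == entity:
            by_day[day].append(score)
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]


def sparkline(values, lo=-1.0, hi=1.0):
    """Map each value in [lo, hi] onto one of the eight bar characters."""
    chars = []
    for value in values:
        clamped = max(lo, min(hi, value))
        index = round((clamped - lo) / (hi - lo) * (len(BARS) - 1))
        chars.append(BARS[index])
    return "".join(chars)


series = daily_series(posts, "Entity A")
print(sparkline([score for _, score in series]))
```

Averaging per day is only one possible choice. A weekly window or a running average would smooth the chart further if posts are sparse.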
The Bias Tracker has now analyzed more than 57,000 Facebook posts by Italian media organizations and other prominent pages in the run-up to the Italian general elections on March 3, and we have already found some interesting things, including clear illustrations of the partisanship of Italian media outlets. For example, posts on Giorgia Meloni, the leader of the right-wing Brothers of Italy party, are highly negative in the media outlets Il Fatto Quotidiano and Libero, but extremely positive in Linkiesta.
Our theory is that strong sentiment in a post is an indicator of highly partisan content, and that partisan content is a key component of what has been called “fake news,” even though many in the verification community have started calling it “digital propaganda.”
Claire Wardle and Hossein Derakhshan explain it well in their outstanding study, “Information Disorder: Toward an interdisciplinary framework for research and policymaking.” They write:
“The most ‘successful’ of problematic content is that which plays on people’s emotions, encouraging feelings of superiority, anger or fear. That’s because these factors drive resharing among people who want to connect with their online communities and ‘tribes’. When most social platforms are engineered for people to publicly ‘perform’ through likes, comments or shares, it’s easy to understand why emotional content travels so quickly and widely, even as we see an explosion in fact-checking and debunking organizations.”
In other words, detecting strong sentiment is key to understanding the virality of content.
What we’ve been able to do with the Bias Tracker is automate sentiment analysis by linking tools together in a chain, including C.J. Hutto’s VADER sentiment analysis library for Python and the OpenCalais service for entity extraction. Because sentiment analysis tools are not widely available for languages other than English, we first run the posts through Google Translate to translate them from Italian to English, and then run the automated translations through VADER. The automated translations are shown to the user below the Facebook posts, along with the numerical sentiment scores.
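The chain described here can be sketched as a small function that takes each step as a pluggable callable. This is a simplified illustration, not the project's code: the `AnalyzedPost` record, the `analyze_post` function and the three `fake_*` stand-ins are hypothetical, and running entity extraction on the original text rather than the translation is an assumption. In a real setup, the scoring step could be the `polarity_scores` method of VADER's `SentimentIntensityAnalyzer`, while the translation and entity steps would call the external services.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AnalyzedPost:
    original: str             # post text as published
    translation: str          # machine translation into English
    entities: List[str]       # entities found in the post
    scores: Dict[str, float]  # sentiment scores for the translation


def analyze_post(
    text: str,
    translate: Callable[[str], str],
    extract_entities: Callable[[str], List[str]],
    score_sentiment: Callable[[str], Dict[str, float]],
) -> AnalyzedPost:
    """Run one post through the chain: translate, extract entities, score."""
    translation = translate(text)
    return AnalyzedPost(
        original=text,
        translation=translation,
        entities=extract_entities(text),  # assumption: extraction on the original
        scores=score_sentiment(translation),
    )


# Stand-ins so the sketch runs without network access or extra packages.
def fake_translate(text: str) -> str:
    lookup = {"Un ottimo discorso di Rossi": "A great speech by Rossi"}
    return lookup.get(text, text)


def fake_entities(text: str) -> List[str]:
    # Naive rule: capitalised words other than the first word of the post.
    return [word for word in text.split()[1:] if word[0].isupper()]


def fake_sentiment(text: str) -> Dict[str, float]:
    # Tiny word lists returning the same keys VADER uses for its three scores.
    positive, negative = {"great", "good"}, {"bad", "terrible"}
    words = [word.lower() for word in text.split()]
    pos = sum(word in positive for word in words) / len(words)
    neg = sum(word in negative for word in words) / len(words)
    return {"pos": pos, "neu": 1.0 - pos - neg, "neg": neg}


result = analyze_post(
    "Un ottimo discorso di Rossi", fake_translate, fake_entities, fake_sentiment
)
print(result.translation, result.entities, result.scores)
```

Passing the steps in as functions keeps the chain easy to test and makes it simple to swap one service for another, for instance a different entity extractor.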
With VADER, sentiment is expressed as three numerical scores: positive, neutral and negative. The higher the positive or negative score, the stronger the sentiment. We then use these scores to summarize the post with a happy or sad face.
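As a rough sketch of how three scores can be reduced to a face, the function below compares the positive and negative scores. The rule itself, the `margin` threshold and the extra "neutral" outcome are assumptions made for this example. The exact rule the Bias Tracker uses is not described here.

```python
def summarize_face(scores, margin=0.05):
    """Reduce VADER-style scores to a label: 'happy', 'sad' or 'neutral'.

    `scores` is a dict with 'pos', 'neu' and 'neg' keys. `margin` is a
    hypothetical threshold: the gap between positive and negative must
    exceed it before the post counts as happy or sad.
    """
    pos, neg = scores["pos"], scores["neg"]
    if pos - neg > margin:
        return "happy"
    if neg - pos > margin:
        return "sad"
    return "neutral"


print(summarize_face({"pos": 0.4, "neu": 0.6, "neg": 0.0}))
print(summarize_face({"pos": 0.0, "neu": 0.5, "neg": 0.5}))
```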
The Bias Tracker is a prototype, but with more than 57,000 posts, 22 sources (mostly media organizations) and more than 14,000 entities, we think it works fairly well at illustrating what’s going on in Italian media, and at providing a quick overview of sentiment toward a large number of entities. We also provide a view of the top topics covered by media organizations, grouped from most active to least active. We think this is useful because automation is key to analyzing content at scale.
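A view like the top-topics one can be built with simple counting. The sketch below assumes that "most active" means the source with the most topic mentions, which is our reading for the purposes of the example. The outlet and topic names and the `top_topics_by_source` helper are invented.

```python
from collections import Counter, defaultdict

# Hypothetical (source, topic) pairs, one per post that mentions the topic.
mentions = [
    ("Outlet A", "Topic X"), ("Outlet A", "Topic X"), ("Outlet A", "Topic Y"),
    ("Outlet B", "Topic Y"),
    ("Outlet C", "Topic X"), ("Outlet C", "Topic Z"),
]


def top_topics_by_source(mentions, n=2):
    """Return sources from most to least active, each with its top `n` topics."""
    per_source = defaultdict(Counter)
    for source, topic in mentions:
        per_source[source][topic] += 1
    # Sort by total mentions (descending), then by name to break ties.
    ranked = sorted(
        per_source.items(),
        key=lambda item: (-sum(item[1].values()), item[0]),
    )
    return [(source, counts.most_common(n)) for source, counts in ranked]


overview = top_topics_by_source(mentions)
for source, topics in overview:
    print(source, topics)
```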
There are certainly areas to improve. Entity extraction from OpenCalais has been less than spectacular at picking up Italian entities, and ideally VADER would have an Italian dictionary rather than relying on Google Translate. But we have been pleasantly surprised at the overall gist of the translations: certainly there is room for improvement, but for our purposes they have been good enough. And there is always the user interface to improve, so that editors can manually adjust tags and entities.
That’s where you can come in. The Bias Tracker prototype is open source, available on GitHub, and designed with a GraphQL API. We welcome tests and suggestions as to how to improve the tool, and are interested in partnering with other news organizations and academic institutions to further develop tools and methods in this area.