A Quick Analysis of Italian 2018 General Election Candidates’ Tweets

Alessandro Scoccia Pappagallo
Published in Unkempt Thoughts
Sep 11, 2018

This analysis first appeared on January 10, 2018 on my personal GitHub Page. You can find all the code for this analysis here. The cover image has been unashamedly stolen from an article that appeared in the Financial Times.

Introduction

The following data come from termometropolitico.it and refer to the period from 18 December 2017 to 24 December 2017. Only parties at 5% or above have been included.

†Many would argue that the de facto leader of the party is Beppe Grillo; since Luigi Di Maio won the most recent primary elections, I decided to go with him.

††In the rest of the post I will refer to these politicians simply as candidates, even though Berlusconi may or may not be the actual prime minister candidate of his party.

The Analysis

Data Collection

The data were collected using the Twitter API (via tweepy). Specifically:

  • Used the Twitter API to get all the tweets posted by the candidates from January 1, 2017 to December 24, 2017 (both dates included). Retweets were ignored.
  • The code for data collection is available here.
  • For the duration of the analysis the tweets were stored in a local database to avoid re-querying the Twitter API multiple times (both raw and preprocessed data are available on GitHub).
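The date window and the retweet rule above can be sketched in a few lines. This is only an illustration, not the code linked in the post: it assumes each tweet is an object with a `created_at` datetime and, for retweets only, a `retweeted_status` attribute (the shape of the status objects tweepy returns). The helper names `keep_status` and `filter_statuses` and the sample tweets are made up for the example.

```python
from datetime import date, datetime
from types import SimpleNamespace

# Collection window from the post; both ends are included.
START = date(2017, 1, 1)
END = date(2017, 12, 24)


def is_retweet(status):
    # Assumption: retweets carry a `retweeted_status` attribute,
    # original tweets do not.
    return hasattr(status, "retweeted_status")


def keep_status(status, start=START, end=END):
    """True for an original tweet posted within [start, end]."""
    posted = status.created_at.date()
    return start <= posted <= end and not is_retweet(status)


def filter_statuses(statuses, start=START, end=END):
    """Keep only the tweets that satisfy keep_status, preserving order."""
    return [s for s in statuses if keep_status(s, start, end)]


# Hypothetical stand-ins for API results, just to exercise the filter.
sample = [
    SimpleNamespace(created_at=datetime(2017, 1, 1, 0, 5), text="first"),
    SimpleNamespace(created_at=datetime(2017, 12, 24, 23, 59), text="last"),
    SimpleNamespace(created_at=datetime(2017, 12, 25, 8, 0), text="too late"),
    SimpleNamespace(created_at=datetime(2017, 6, 1, 12, 0), text="RT",
                    retweeted_status=object()),
]
kept = [s.text for s in filter_statuses(sample)]
```

Comparing calendar dates rather than full timestamps keeps a tweet posted late on December 24 inside the window.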

Descriptive Analysis

For each candidate, a number of descriptive statistics were computed. The full code can be found here.

Note: because of the way Twitter works, average_links refers to both external links and images.
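As a rough illustration of how a statistic like average_links could be computed, the sketch below counts URLs in the tweet text. It assumes that attached images show up in the text as links too, which is one plausible reading of the note above. Only the name `average_links` comes from the post; `n_tweets`, `average_length`, the regular expression and the sample tweets are assumptions made for the example.

```python
import re
from statistics import mean

# Assumption: anything that looks like an http(s) URL counts as a link,
# whether it points to an external page or to an attached image.
URL_PATTERN = re.compile(r"https?://\S+")


def describe(tweets):
    """Return a few per-candidate statistics for a non-empty list of tweet texts."""
    return {
        "n_tweets": len(tweets),
        "average_length": mean(len(t) for t in tweets),
        "average_links": mean(len(URL_PATTERN.findall(t)) for t in tweets),
    }


# Made-up tweets, only to show the shape of the output.
example = [
    "Oggi a Roma https://t.co/abc",
    "Nessun link qui",
    "Foto e articolo https://t.co/x https://t.co/y",
]
stats = describe(example)
```

Note that `mean` raises an error on an empty list, so a candidate with no tweets in the window would need separate handling.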

Sentiment Analysis

I used the Polyglot package to compute some rough sentiment scores for each candidate’s tweets. I chose Polyglot as it’s one of the few packages to offer localization in Italian. Note that Polyglot only offers a polarity score (-1.0, 0.0 or +1.0) for words. Sentiment scores were then computed by averaging the polarity scores for each token.

Specifically:

  • Loaded the preprocessed data from TinyDB.
  • Computed polarity for each token via Polyglot.
  • For each tweet an average polarity was computed (ignoring tokens with polarity equal to zero).
  • Tweets with an average polarity (i.e. sentiment score) smaller than zero were labeled as negative, those with an average polarity equal to zero as neutral, and those with an average polarity higher than zero as positive.
  • The code is available here.
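The scoring and labeling steps above can be sketched as follows. To keep the example self-contained, a tiny hand-written lexicon stands in for Polyglot's word polarities; the lexicon, the function names and the sample tweets are all hypothetical. One choice is mine rather than the post's: a tweet with no polarized tokens gets a score of zero and is therefore neutral.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for Polyglot's per-word polarity
# (-1, 0 or +1). Words missing from the lexicon count as 0.
TOY_POLARITY = {"bene": 1, "ottimo": 1, "male": -1, "disastro": -1}


def tweet_score(tokens, polarity=TOY_POLARITY):
    """Average polarity over the tokens whose polarity is not zero."""
    scores = [polarity.get(tok.lower(), 0) for tok in tokens]
    non_zero = [s for s in scores if s != 0]
    if not non_zero:
        # Assumption: no polarized token means a neutral tweet.
        return 0.0
    return sum(non_zero) / len(non_zero)


def label(score):
    """Map a sentiment score to negative / neutral / positive."""
    if score < 0:
        return "negative"
    if score > 0:
        return "positive"
    return "neutral"


def label_shares(tokenized_tweets):
    """Share of tweets falling under each label."""
    names = ("negative", "neutral", "positive")
    counts = Counter(label(tweet_score(t)) for t in tokenized_tweets)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in names}
    return {name: counts[name] / total for name in names}


# Made-up, already tokenized tweets.
tweets = [
    ["Ottimo", "lavoro"],
    ["Un", "disastro", "totale"],
    ["Oggi", "a", "Roma"],
    ["bene", "ma", "male"],
]
shares = label_shares(tweets)
```

The last sample tweet shows a side effect of averaging: one positive and one negative word cancel out, so the tweet ends up neutral.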

The results are summarized below. Candidates are presented from far-left to far-right, with anti-establishment party M5S in the middle.

Right-wing parties seem to have a higher percentage of negative tweets. While this could be the result of a precise communication strategy, it’s also important to note that the government was left-wing in 2017, and thus it makes sense for right-wing parties to be more critical of the overall economic and political state of the country.

Keywords Analysis

For each candidate, a list of 25 keywords was computed by analysing their tweets and comparing them against the other five candidates.

Specifically:

  • Loaded the preprocessed data from TinyDB.
  • For each candidate, created a single string containing all of her or his tweets.
  • Computed the tf-idf matrix.
  • Selected the top 25 words with highest tf-idf score for each candidate.
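The steps above can be sketched in plain Python. This is a minimal version under stated assumptions, not the code used for the post: it treats each candidate's concatenated tweets as one document and uses the textbook weighting tf × log(N / df), whereas library implementations often apply smoothing and normalization and so produce different scores. The function names and the three-candidate toy corpus are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def top_keywords(docs, k=25):
    """docs maps candidate -> one string with all of their tweets.

    Returns candidate -> list of the k words with the highest tf-idf score.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)

    # Document frequency: in how many candidates' documents a word appears.
    df = Counter()
    for word_counts in counts.values():
        df.update(word_counts.keys())

    result = {}
    for name, word_counts in counts.items():
        total = sum(word_counts.values())
        scores = {
            word: (n / total) * math.log(n_docs / df[word])
            for word, n in word_counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[name] = ranked[:k]
    return result


# Toy corpus with made-up text, one document per hypothetical candidate.
docs = {
    "a": "mafia palermo mafia italia",
    "b": "immigrazione italia immigrazione renzi",
    "c": "lavoro italia renzi lavoro",
}
keywords = top_keywords(docs, k=1)
```

With this weighting, a word used by every candidate (here "italia") scores zero, so the keywords are the words that set a candidate apart from the others.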

The results are summarized below. Candidates are presented from far-left to far-right, with anti-establishment party M5S in the middle.

A few observations, in no particular order:

  • Many keywords in Grasso’s vocabulary refer to the Mafia (e.g., victims, 9may, Palermo, mafia, killed, etc.), which makes sense given that Grasso was for many years Prosecutor at the Court of Palermo;
  • Luigi Di Maio, Giorgia Meloni and Matteo Salvini all have renzi in their vocabulary (referring to Matteo Renzi, leader of PD). Interestingly, Berlusconi doesn’t;
  • Unsurprisingly, the two far-right parties both have many immigration-related keywords (e.g., immigrants, immigration, stoptheinvasion, italiansfirst, etc.);
  • Salvini is the only candidate to have his own name as a keyword. This may be simply the result of him having named his “sub-party” Noi con Salvini (en: Us With Salvini).

It’s also interesting to note that the candidates do not seem to be discussing the same issues from different points of view (e.g., immigration is good vs. immigration is bad) so much as talking about completely different topics. In other words, nobody seems to be offering a counter-narrative to those of the other candidates.
