Machine Learning Analysis of Ghana’s Presidential Election — Part 1

Initial analysis of the Twitter conversation before and during the election day using machine learning.

Philip Adzanoukpe
Extra Newsfeed
8 min read · Jan 9, 2017


Sentiment analysis deals with identifying and classifying opinions or sentiments expressed in source text.

In this age of modern technology, there is one resource that we have in abundance: a large amount of data. Social media generates a vast amount of sentiment-rich data in the form of tweets, status updates, blog posts, etc. Sentiment analysis of this user-generated data is very useful for gauging the opinion of the crowd.

In the second half of the twentieth century, machine learning evolved as a subfield of artificial intelligence that involved the development of self-learning algorithms to gain knowledge from data in order to make predictions. Instead of requiring humans to manually derive rules and build models from analysing large amounts of data, machine learning offers a more efficient alternative for capturing the knowledge in data, to gradually improve the performance of predictive models and make data-driven decisions. Not only is machine learning becoming increasingly important in computer science research but it also plays an ever greater role in our everyday life. Thanks to machine learning, we enjoy robust e-mail spam filters, convenient text and voice recognition software, reliable Web search engines, challenging chess players, and, hopefully soon, safe and efficient self-driving cars.

Machine learning is the strategy we’ll be using to analyze the sentiments from the tweets on Ghana’s 7th presidential election of 2016.

The objective is to gain insight into the tweets and the quality of the conversation over the course of the election and, beyond that, to develop an understanding of the influence of social media, and Twitter in particular, on elections in Ghana.

NB: The Twitter users come from a specific demographic: middle-class, tech-savvy or corporate users between the ages of 18 and 45. This is a good proportion of the urban semi-affluent and affluent voter cohorts, but further testing and evaluation on a larger percentage of the population is recommended.

Orientation

When conducting an analysis of an overall social network conversation on a specific subject, it is important to first get an understanding of the general “landscape”.

Data was collected from trending hashtags about the election and from hashtags used by the major candidates. Tweets were collected from October 2016, at the peak of the political campaigns, until 5 PM (GMT+0) on December 7, 2016, when the polls closed. In total, about 65,000 tweets were collected.

The tools used were:

The algorithms used for the analysis were logistic regression, for the sentiment analysis, and collapsed Gibbs sampling, for detecting emotions in tweets with the help of the NRC Word-Emotion Association Lexicon.

This analysis comes in two parts. This first part covers tweets from before and during the election; the second part will cover post-election tweets.

Breakdown of the process

  • Get data from Twitter from trending hashtags
  • Clean up the data (remove links, punctuation, symbols, etc.)
  • Exploration (Visualizations and analysis)
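The clean-up step can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function name `clean_tweet` and both regular expressions are assumptions, and a real pipeline may also handle mentions, emojis or casing differently.

```python
import re

# Assumed patterns: anything that looks like a URL, and any character that is
# not a letter, digit or whitespace (underscores are dropped as well).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
SYMBOL_RE = re.compile(r"[^\w\s]|_")

def clean_tweet(text: str) -> str:
    """Remove links, punctuation and symbols, then collapse whitespace."""
    text = URL_RE.sub(" ", text)      # drop links first so their slashes go too
    text = SYMBOL_RE.sub(" ", text)   # '#', '@', '!' and similar become spaces
    return " ".join(text.split())     # squeeze runs of whitespace

print(clean_tweet("Vote for #Change! https://t.co/abc123 @NAkufoAddo :)"))
```

Stripping the `#` but keeping the tag text means hashtags survive as ordinary terms, which matches how they show up in the word cloud.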

Observations made from Analysis

Trending Terms

A word cloud gives a quick visual summary of the terms that were tweeted most often in the sample of tweets that was collected. The larger a word appears in the word cloud, the more often it was tweeted.

Word cloud of top 500 terms in the tweets

From this word cloud we can see that some of the top terms used in the tweets were VoteforJMnumber3, JMToaso, IamVotingForChange, Mahama, VoteForChange and ChooseChange.

Looking further into this visualization, we can see that Change, which is associated with the NPP (the major opposition party and winner of the polls), is one of the dominant root words in the word cloud.
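The term counts behind a word cloud like this one can be approximated with a simple frequency table. The sketch below assumes the tweets have already been cleaned and splits them on whitespace; `top_terms` and the sample tweets are made up for illustration.

```python
from collections import Counter

def top_terms(tweets, n=500):
    """Return the n most frequent whitespace-separated terms as (term, count) pairs."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet.split())
    return counts.most_common(n)

# Invented sample, reusing term names mentioned in the article.
sample = [
    "JMToaso VoteforJMnumber3",
    "IamVotingForChange ChooseChange",
    "JMToaso Mahama",
]
print(top_terms(sample, n=1))
```

A word cloud library would then scale each term's font size by its count.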

Top Users

The top tweeters in this analysis were ranked in 3 categories: Tweet Count (number of tweets per user), Tweet Score (sum of retweets and favourites of each tweet) and Average Score per Tweet.
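As a rough sketch, the three rankings could be computed like this. The record layout (dicts with `user`, `retweets` and `favourites` keys) and the function names are assumptions, and a user's Tweet Score is taken here to be the sum of the scores of all their tweets.

```python
from collections import defaultdict

def rank_users(tweets):
    """Aggregate tweet count, total score and average score per user."""
    stats = defaultdict(lambda: {"count": 0, "score": 0})
    for t in tweets:
        s = stats[t["user"]]
        s["count"] += 1
        s["score"] += t["retweets"] + t["favourites"]  # score of one tweet
    for s in stats.values():
        s["avg_score"] = s["score"] / s["count"]
    return dict(stats)

def top_by(stats, metric, n=10):
    """Return up to n user names ordered by the given metric, highest first."""
    return sorted(stats, key=lambda user: stats[user][metric], reverse=True)[:n]

# Invented records with placeholder user names.
records = [
    {"user": "user_a", "retweets": 10, "favourites": 5},
    {"user": "user_a", "retweets": 0, "favourites": 1},
    {"user": "user_b", "retweets": 20, "favourites": 10},
]
stats = rank_users(records)
print(top_by(stats, "count"), top_by(stats, "score"), top_by(stats, "avg_score"))
```

Keeping the three metrics separate matters because a prolific account can lead on count while a single widely shared tweet leads on average score.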

Bar Charts of Top Tweeting Users in the Data

The first thing to notice in this chart is that the Twitter Ids TransformingGh, OfficialNDC and FirstLadyGhana, which are associated with the NDC, are in the top 10 users in all 3 categories measured. Note that both H.E. President Nana Addo Dankwa Akufo-Addo and Dr. Mahamudu Bawumia, Vice President of Ghana, also appeared in 2 of the categories.

This is one of the tweets from President Akufo-Addo on the 5th of December.

Top Hashtags

These hashtags were observed over a period around the election and they were selected as the trending hashtags that users, candidates and journalists were using to tweet about the election. In this analysis we determined the top hashtags based on the average score of the tweets in the hashtag.
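One way to compute an average score per hashtag is sketched below. It assumes each tweet is available as a `(text, score)` pair, with the score being retweets plus favourites, and that a tweet carrying several hashtags counts towards each of them. The hashtag names and scores in the example are placeholders.

```python
import re
from collections import defaultdict

HASHTAG_RE = re.compile(r"#(\w+)")

def hashtag_avg_scores(tweets):
    """Map each hashtag to the mean score of the tweets that contain it."""
    totals = defaultdict(lambda: [0, 0])  # hashtag -> [score sum, tweet count]
    for text, score in tweets:
        # set() so a hashtag repeated within one tweet is only counted once
        for tag in set(HASHTAG_RE.findall(text)):
            totals[tag][0] += score
            totals[tag][1] += 1
    return {tag: total / count for tag, (total, count) in totals.items()}

sample = [
    ("#TagA new roads", 40),
    ("#TagB now #TagB", 30),
    ("#TagB", 10),
]
print(hashtag_avg_scores(sample))
```

Ranking by average rather than total score keeps a hashtag with a handful of popular tweets from being buried by one that was simply used more often.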

Top Hashtags Based on Avg. Score of Tweets

We can see TransformingGhana and Vote4Change are the 2 top hashtags. The former is mainly associated with Ex-President, H.E. John Dramani Mahama and the tweets were tailored towards projects and developments during his term as President of the Republic of Ghana.

Top 10 tweets

This list contains the top tweets, sorted by the score of each tweet.

Top 10 tweets based on score

In this table, most of the top tweets are from around December 7, the day of the election. We can also see that tweets from the President (Twitter id NAkufoAddo) and the Vice President (Twitter id MBawumia) appear in the top 10.

Preview of the top tweet

Polarity: What was the sentiment displayed by tweeters?

Sentiment Analysis is the process of determining whether a piece of writing is positive, negative or neutral. It’s also known as opinion mining, deriving the opinion or attitude of a writer.

In machine learning, this involves using a supervised learning algorithm to build a classifier that will detect polarity of textual data and classify it as either positive, negative or neutral. Here the model was trained with review data from Amazon and Yelp using Logistic Regression.
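To make the idea concrete, here is a small bag-of-words logistic regression written in plain Python. It is only a sketch under several assumptions: the four training sentences are invented stand-ins for the Amazon and Yelp reviews, the classifier is binary, and the "neutral" label comes from an assumed band around a probability of 0.5 rather than from a third trained class. It should not be read as the model used in this analysis.

```python
import math

def tokenize(text):
    return text.lower().split()

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(texts, labels, epochs=200, lr=0.5):
    """Fit word weights and a bias by stochastic gradient ascent on the log-likelihood.

    labels: 1 = positive, 0 = negative. Returns (weights, bias).
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            tokens = tokenize(text)
            p = sigmoid(bias + sum(weights.get(tok, 0.0) for tok in tokens))
            error = y - p  # gradient of the log-likelihood w.r.t. the linear score
            bias += lr * error
            for tok in tokens:  # a repeated word is updated once per occurrence
                weights[tok] = weights.get(tok, 0.0) + lr * error
    return weights, bias

def predict_proba(text, weights, bias):
    """Probability that the text is positive."""
    return sigmoid(bias + sum(weights.get(tok, 0.0) for tok in tokenize(text)))

def polarity(text, weights, bias, neutral_band=0.1):
    """Label text; probabilities within neutral_band of 0.5 count as neutral (assumed rule)."""
    p = predict_proba(text, weights, bias)
    if abs(p - 0.5) <= neutral_band:
        return "neutral"
    return "positive" if p > 0.5 else "negative"

# Invented review-style training data.
train_texts = [
    "great food and friendly service",
    "excellent product works great",
    "terrible service and rude staff",
    "awful product broke quickly",
]
train_labels = [1, 1, 0, 0]
weights, bias = train_logistic_regression(train_texts, train_labels)
print(polarity("excellent great", weights, bias))
print(polarity("awful terrible", weights, bias))
```

Note that a model trained on product and restaurant reviews is being applied to political tweets, so words that never appear in the reviews contribute nothing to the score.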

As you might notice, a higher percentage of the tweets had a positive sentiment than a negative sentiment.

Sentiment Polarity of Tweets
Sentiment Polarity Of Tweets Grouped By Hashtag

Emotions displayed by the tweeters

Emotion can be expressed in many observable ways, such as facial expressions, gestures, speech and written text. Emotion Detection in text documents is essentially a content-based classification problem involving concepts from the domains of Natural Language Processing as well as Machine Learning.

Emotion is expressed as joy, sadness, anger, surprise, hate, fear and so on. Here the NRC Emotion Lexicon, a list of English words and their associations with eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust), was used together with the collapsed Gibbs sampling algorithm to detect the emotions in the tweets. This works by identifying the words that occur in a tweet and relating each one to its associated emotions in the lexicon.
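The lexicon side of this can be illustrated with a plain word-lookup counter. To be clear, this is a much simpler technique than collapsed Gibbs sampling and is not the method used in the analysis. The lexicon below is a toy stand-in whose entries are invented for the example and not copied from the NRC file.

```python
from collections import Counter

# Toy lexicon: word -> emotions. Entries are illustrative assumptions only.
TOY_LEXICON = {
    "win": {"joy", "anticipation"},
    "peace": {"joy", "trust"},
    "jobless": {"fear", "sadness"},
    "corruption": {"anger", "disgust"},
}

def emotion_counts(text, lexicon):
    """Count how many words in the text are associated with each emotion."""
    counts = Counter()
    for word in text.lower().split():
        for emotion in lexicon.get(word, ()):
            counts[emotion] += 1
    return counts

def dominant_emotions(text, lexicon):
    """Return the emotions with the highest count, or [] if no word matches."""
    counts = emotion_counts(text, lexicon)
    if not counts:
        return []
    best = max(counts.values())
    return sorted(e for e, c in counts.items() if c == best)

print(dominant_emotions("peace and a win for ghana", TOY_LEXICON))
```

A lookup like this treats every word independently, whereas a sampling-based model can also take the other words in the tweet into account when attributing an emotion.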

As seen, Trust, Surprise and Joy came out as the top emotions from the users on Twitter.

Emotion displayed on all Tweets

Next, we look further into the emotions displayed in the tweets associated with the 2 major parties, NDC and NPP.

Emotion displayed in tweets of Major Political Parties

This shows that Joy and Anticipation were the top emotions displayed in the tweets associated with NDC.

This is one of the tweets from the former First Lady, Lordina Mahama (NDC), that displayed Joy in the analysis.

The tweet above clearly shows the joy and faith the first lady had in the government led by her husband during his term as president, an opinion that was prevalent among members of the NDC, as shown by the analysis.

Another tweet from NDC showing anticipation from the analysis

This particular tweet brings into focus the anticipatory mood of most NDC supporters, who expected more infrastructure (schools, hospitals, roads, etc.). The inclusion of a couple of slang terms such as ‘onaapo’ (you won’t get) and ‘jmtoaso’ (JM continue) highlights this mindset, which was associated with most NDC tweets.

In contrast, Surprise and Fear were the emotions displayed in the tweets associated with the NPP.

This is one of the tweets from H.E. President Akufo-Addo (NPP) that displayed Surprise in the analysis.

NPP tweet displaying surprise emotion from the President.

The President’s tweet above shows surprise or astonishment at how people were rallying behind the cry for change in the country’s economic situation.

Another tweet from the President showing fear for the future of the youth
Another tweet from the Vice President showing fear for the country in the hands of the former Government.

It is evident that the emotions that most influenced the NPP campaign were fear, disgust and anger towards the former government run by the NDC. As seen in the tweets above, this included fear for the state and for the future of the ‘jobless youth’.

Summary of Observations

Here is the summary of the observations made from the analysis of the data collected.

  • VoteforJMnumber3, JMToaso, IamVotingForChange, Mahama, VoteForChange and ChooseChange were the top trending terms.
  • The Twitter Ids Citi973 and NAkufoAddo (the President) had the best performing tweets.
  • TransformingGhana and Vote4Change were the top ranking hashtags based on average score per tweet.
  • Sentiment polarity was mostly positive for tweets associated with both the former ruling party (NDC) and the ruling party (NPP).
  • Trust, Surprise and Joy are the top 3 emotions displayed by the users in their tweets.
  • Most of the tweets associated with the former government (NDC) showed Joy and Anticipation emotions.
  • Most of the tweets associated with the current government (NPP) showed Surprise and Fear emotions.

Conclusion

What impact does this have on the election? Indeed we had a successful and peaceful election, but can this be used to predict the outcome of future elections?

In general, the tweets showed positive sentiment, which is a good indicator of the happiness, enthusiasm and kindness of Ghanaians that led to a peaceful election.

However, the Joy and Anticipation found in tweets associated with the NDC did not win them enough votes to take the 2016 presidential election. It may be that the Joy and Anticipation portrayed around the development projects of the former President’s term were not enough, or rather that Ghanaians were focused on the economic hardship in the country and wanted a change of government.

There was indeed a change: the NPP won the election, and according to the analysis they portrayed a state of hardship, joblessness and an incompetent government through Fear and Disgust. This also shows in the Surprise (astonishment) seen in the campaign for a change of government.

At this point, analysis of tweets can be a good research source for identifying the opinions of Ghanaians and for predicting the outcomes of major events in the country.

Congratulations to the President, the winning party NPP, and to all Ghanaians for a peaceful resolution to the change in leadership. We hope this serves to usher in a new path for our motherland Ghana. It is worth noting that our love for peace during this election has brought some notable slang terms and songs that have heralded this change in government. So, in ending: Onaapo and Nana Addo toaso; the sceptre is in your competent and capable hands.

God bless you, and God bless our homeland Ghana.

Links to the source code and dataset for this analysis can be found in the GitHub code repository.
