What Do People Think About Trump’s Immigration Suspension?
A deep dive into what people are saying about the immigration suspension and how they feel about it, using sentiment analysis and topic modeling.
When it comes to the controversial issue of immigration, there are usually two beliefs: either everyone has a right to enter the country or only some do. Whichever side you stand on, we can all agree that this is a very touchy subject, and that is why Trump’s most recent executive order suspending immigration has drawn a lot of attention.
The timing of the executive order invites skepticism: “Why now?” Trump wants to put Americans first, or at least that’s what he claims.
“This will ensure that unemployed Americans of all backgrounds will be first in line for jobs as our economy reopens,” Trump said during his daily briefing on Wednesday. “It will also preserve our health care resources for American patients.” (Jackson & Collins, USA TODAY, 2020)
This would seem like a very logical decision, but it also fits the agenda he has been highlighting since he took office.
So what does anyone do when President Trump does something very controversial?
They take their opinions and rants to social media, of course! I don’t have much of an opinion on what the President does, but I wanted to get a sense of how everyone else feels about the choice he made. So I did a little bit of data mining on Twitter.
My Tools of choice
(Full disclosure: no one’s information is publicly disclosed and the data contains at least a thousand samples.)
For my sentiment analysis, I chose two scoring lexicons, AFINN and VADER. Before I go on about them, a quick definition: sentiment analysis is a type of analytics used in data science to gain insight into how people feel about a certain subject, in this instance Trump’s executive order.
AFINN, or “AFINN-111” to be exact, is a lexicon (lex·i·con: the vocabulary of a language, an individual speaker or group of speakers, or a subject) (“Definition of LEXICON,” 2020) of English words that are each given an integer score from -5 to 5. For example, “happy” = 3 and “sad” = -2, or “abandon” = -2 and “accept” = 1. The lexicon has 2,477 entries.
VADER stands for “Valence Aware Dictionary and sEntiment Reasoner”. It is another English scoring lexicon, with over 7,500 features (words and other tokens) that carry scores from -4 to 4, similar to the AFINN lexicon.
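To make the lexicon idea concrete, here is a minimal sketch of how a word-score dictionary turns a sentence into a number. It uses only two of the AFINN values quoted above, and `score_text` is a hypothetical helper written for illustration, not the code behind the charts in this article.

```python
import re

# Two entries copied from the AFINN examples above (word -> integer score).
# The full AFINN-111 list is far larger; this subset is for illustration only.
MINI_LEXICON = {"abandon": -2, "accept": 1}

def score_text(text, lexicon=MINI_LEXICON):
    """Sum the lexicon scores of every known word; unknown words count as 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)

print(score_text("They accept the order"))
print(score_text("Do not abandon them"))
print(score_text("Accept it or abandon it"))
```

A simple sum like this ignores negation ("do not abandon" still scores as negative), which is one reason it can be worth comparing more than one scoring approach.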
My next trick is topic modeling with gensim.
Topic modeling combines statistical and mathematical techniques to surface the main topics in the text at hand (Sarkar, 2019). I used gensim’s LSI model with Mallet to gather the topics in the tweets.
Before I go into the results of the sentiment analysis, let me define the terms I will be using. A positive sentiment means that the tweet or string of text (sentence) in question reads as a positive review. Likewise, a negative sentiment reads as a negative review, and a neutral sentiment reads as neither. Most of what follows is about the aggregate (averaged totals) across all tweets, but I will also include a visualization of the sentiment broken down for each tweet. Without further ado, I present my findings.
The reaction (sentiment)
The visualization above is a stacked column chart of the one thousand tweets mined from Twitter, sorted in descending order of positive sentiment. I chose this visualization to show the sentiment values of every tweet taken into account. One tweet may have high positive sentiment with no negativity and low neutrality (the very first column), while another may have relatively high negativity, no positive sentiment, and a lot of neutrality (most of the tweets in the right half). The overall averages are provided in the pie chart below to show how much the tweets contribute, in total, to positive, negative, and neutral sentiment.
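The sorting and averaging described above can be sketched as follows. The three rows are made-up numbers in the same shape as the positive, neutral, and negative shares that VADER reports for a piece of text; they are not the mined tweets, and the variable names are my own.

```python
from statistics import mean

# Hypothetical per-tweet sentiment shares (each row sums to 1.0), for illustration only.
tweets = [
    {"pos": 0.0, "neu": 0.75, "neg": 0.25},
    {"pos": 0.5, "neu": 0.5, "neg": 0.0},
    {"pos": 0.25, "neu": 0.5, "neg": 0.25},
]

# Sort tweets in descending order of positive share, as in the stacked column chart.
ranked = sorted(tweets, key=lambda t: t["pos"], reverse=True)

# Average each share across all tweets to get the slices of a pie chart.
averages = {label: mean(t[label] for t in tweets) for label in ("pos", "neu", "neg")}

print([t["pos"] for t in ranked])
print(averages)
```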
For the VADER sentiment, the majority of tweets were a neutral review of Trump’s decision. The question that comes to my mind is “How can this be if people are split left and right on most controversial issues?” To answer that, I will provide the values for the AFINN sentiment.
The positive sentiment from the Afinn lexicon scoring was 35.8% and the negative was 64.2%.
Now I bet some of you are saying “Aha! I knew that no one could be happy with Trump’s decision!” Don’t jump to conclusions, though. Data science is all about taking different approaches and using different angles to tackle the data and gain insight. I set up the AFINN scoring specifically to categorize tweets as only positive or negative. I did this to see whether the AFINN dictionary gives a different viewpoint than the VADER dictionary. But sentiment scores alone might not give the big picture of how people are feeling, so it is best to also include an idea of what they are talking about.
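As a rough sketch of that two-bucket setup, the snippet below turns a list of per-tweet scores into positive and negative percentages. The scores are made up, and the rule for a score of exactly zero is my assumption, since how ties were handled is not stated here.

```python
# Hypothetical AFINN-style totals, one per tweet (made-up values for illustration).
scores = [-3, 2, -1, 0, 4, -2, -5, 1]

# Assumption: a score above zero counts as positive and everything else as negative,
# so there is no neutral bucket.
positive = sum(1 for s in scores if s > 0)
negative = len(scores) - positive

positive_pct = 100 * positive / len(scores)
negative_pct = 100 * negative / len(scores)
print(f"positive: {positive_pct:.1f}%, negative: {negative_pct:.1f}%")
```

Forcing every tweet into one of two buckets is part of why these percentages can look so different from a three-way split with a neutral category.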
The Topics
The pie chart to the left shows the percentage of tweets that fall under each topic. When I ran the model, the coherence score came back above 0.6 (60%), indicating that the best number of topics was either two or three. I chose three because two seemed a bit small to me. To summarize what the first topic probably is, we can rule out the term that shows up in every topic, which is “Trump”. The remaining terms in the first topic are “ban,” “temporary,” “halt,” and “order”. From these, I can pretty much gather that the first topic is the universal one that everyone is commenting on: Trump’s temporary ban. As for how the topic model works, it first vectorizes all the n-grams (word combinations) with the TF-IDF vectorizer, a Python module (pre-written code) that turns the tokenized words into a matrix of weights. That matrix is the input used for modeling later.
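For readers curious about the TF-IDF step, here is a minimal sketch of the idea in plain Python. The `ngrams` and `tfidf` helpers and the three toy documents are hypothetical, and the formula is the textbook tf × idf; scikit-learn’s TfidfVectorizer uses a smoothed and normalized variant by default, so its numbers will differ.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lowercased text."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]

def tfidf(docs, n_max=2):
    """Return one {n-gram: weight} dict per document using plain tf * idf."""
    tokenized = [ngrams(doc, n_max) for doc in docs]
    # Document frequency: the number of documents each n-gram appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Toy documents built from the topic terms mentioned above.
docs = ["trump temporary ban", "trump halt order", "trump ban order"]
for vector in tfidf(docs):
    print(vector)
```

In this toy example “trump” appears in every document, so its weight drops to zero, which mirrors the reasoning for ruling it out when reading the topics.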
So what can be taken from this dive into the data analysis? Is everyone neutral toward Trump’s decision, or are they on two different sides of the fence?
When I dive into data for a project, I usually take a little longer than a day and use more samples, so that I can gain better insight and present something with certainty. But I also leave it up to the audience to decide for themselves what to take from the analysis.
I hope that everyone who reads this article finds it informative, enjoys it, and gains something from it!
References:
Anon. (2014, September 17). scikit-learn TfidfVectorizer meaning? Retrieved April 23, 2020, from Stack Overflow website: https://stackoverflow.com/questions/25902119/scikit-learn-tfidfvectorizer-meaning
Cjhutto. (2020, April 16). cjhutto/vaderSentiment. Retrieved April 23, 2020, from GitHub website: https://github.com/cjhutto/vaderSentiment
Definition of LEXICON. (2020). Retrieved April 23, 2020, from Merriam-webster.com website: https://www.merriam-webster.com/dictionary/lexicon
Sarkar, D. (2019). Text Analytics with Python: A Practitioner’s Guide to Natural Language Processing (2nd ed.). Apress.
Hvitfeldt, E. (2020, March 6). AFINN-111 dataset. Retrieved April 23, 2020, from Rdrr.io website: https://rdrr.io/cran/textdata/man/lexicon_afinn.html
Jackson, D., & Collins, M. (2020, April 22). Trump suspends immigration into U.S. during coronavirus pandemic, but impact potentially limited. Retrieved April 23, 2020, from USA TODAY website: https://www.usatoday.com/story/news/politics/2020/04/22/coronavirus-trump-suspends-immigration-into-u-s/2994894001/