#confirmKavanaughnow

Gloria Serra Coch
3 min read · Oct 14, 2018

R and K live in New York City but they come from America.

"You know what they say… if you leave New York, you will have to live in America."

They both argue that their hometown has too many Trump supporters, and they often compete over who had it worse growing up there.

In this post I have collected 50 tweets from each of their hometowns, Whytesville, Virginia for R and Thurmont, Maryland for K, that show clear support for Kavanaugh, and compared them.

A proper study would need far more than 50 tweets per town; this number is too low to reach any statistical significance. However, my aim was to develop a methodology that gives my friends material for more arguments, rather than to reach significant conclusions. And, of course, to pass my course Urban Data and Informatics at #GSAPP.

To filter supporting tweets, I used the hashtag #confirmKavanaughnow. For the location, I set the coordinates of the center of each town and collected the tweets within a certain area around it. In both cases the area is fairly large because the towns are located in low-density zones. This is a limitation of the study, as the tweets do not necessarily represent the town itself but the area around it.
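
As a rough sketch of this filter, the snippet below builds the query parameters for a hashtag search restricted to a circle around a town center. It assumes Twitter's standard search API, whose `geocode` parameter takes the form `latitude,longitude,radius`. The helper name `build_search_params`, the approximate coordinates for Thurmont and the 50 km radius are illustrative assumptions, not the values used in this post.

```python
def build_search_params(hashtag, lat, lon, radius_km, count=50):
    """Build search parameters for a hashtag limited to a circle.

    Hypothetical helper: it only assembles the parameters and does not
    call the Twitter API itself.
    """
    return {
        "q": hashtag,
        # The standard search API expects "latitude,longitude,radius",
        # with the radius given in km or mi.
        "geocode": f"{lat},{lon},{radius_km}km",
        "count": count,
    }


# Approximate town-center coordinates and an assumed 50 km radius.
thurmont_params = build_search_params("#confirmKavanaughnow", 39.62, -77.41, 50)
print(thurmont_params)
```

A larger radius catches more tweets in a low-density area, at the cost of representing the surrounding region rather than the town.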

The collection was carried out on October 14th, 2018. For each tweet, I collected the date, the text of the tweet, the number of people liking the tweet, the number of times it was retweeted, the length of the text and a sentiment analysis score of the text.
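
The fields listed above could be pulled out of each tweet roughly as follows. This is a minimal sketch that assumes tweets arrive as dictionaries in the Twitter API v1.1 JSON layout (`created_at`, `text`, `favorite_count`, `retweet_count`). The function name `tweet_to_record`, the sample tweet and the placeholder scoring function are all invented for illustration.

```python
from datetime import datetime


def tweet_to_record(tweet, score_fn):
    """Reduce one tweet dictionary to the six fields used in this post."""
    text = tweet["text"]
    return {
        # v1.1 timestamps look like "Sat Oct 06 20:15:00 +0000 2018".
        "date": datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y"),
        "text": text,
        "likes": tweet["favorite_count"],
        "retweets": tweet["retweet_count"],
        "length": len(text),
        "sentiment": score_fn(text),
    }


# Invented sample tweet; the lambda stands in for a real sentiment scorer.
sample = {
    "created_at": "Sat Oct 06 20:15:00 +0000 2018",
    "text": "#confirmKavanaughnow",
    "favorite_count": 3,
    "retweet_count": 1,
}
record = tweet_to_record(sample, score_fn=lambda text: 0.0)
print(record)
```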

To calculate the score, I used the afinn library: https://pypi.org/project/afinn/ (Finn Årup Nielsen, “A new ANEW: evaluation of a word list for sentiment analysis in microblogs”).
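
To give an idea of how a word-list score works, here is a toy scorer in the same spirit: every word found in the list contributes its valence and everything else counts as zero. The word list and its values are made up for this example and are not the real AFINN list. With the actual library, the equivalent call would be `Afinn().score(text)`.

```python
import re

# Tiny hand-made word list with illustrative valences (not the AFINN data).
TOY_VALENCES = {"good": 3, "great": 3, "bad": -3, "disgrace": -2}


def toy_score(text):
    """Sum the valences of the known words in a text."""
    words = re.findall(r"[a-z']+", text.lower())
    return float(sum(TOY_VALENCES.get(word, 0) for word in words))


print(toy_score("A great day for the country"))
print(toy_score("#confirmKavanaughnow"))
```

A tweet made only of hashtags contains no word from the list, so it scores zero.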

To write the code, I collaborated with another student in the class, Lorraine Liao, and used some tutorials, especially this one: https://galeascience.wordpress.com/2016/03/18/collecting-twitter-data-with-python/

For Whytesville, Virginia, the tweets collected range from Thursday, October 4th to Saturday, October 6th. In the case of Thurmont, Maryland, the tweets range from Friday, October 5th to Thursday, October 11th.

For Virginia, the length of the tweets varies widely, with a mean of 90.66 characters and a standard deviation of 42.13. In the case of Maryland, the average length is longer (117.3 characters) but the standard deviation is smaller (31.42). This difference could be explained by the mean being closer to the maximum character length set by Twitter (140 characters), which leaves less room for variation above it.
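
The mean and standard deviation above can be computed with Python's standard `statistics` module, as sketched below. The three lengths are invented for the example. Note that `stdev` is the sample standard deviation and `pstdev` the population one; which of the two the figures in this post correspond to is not recorded, so both are shown.

```python
import statistics

# Invented tweet lengths, only to show the calculation.
lengths = [40, 90, 140]

mean_length = statistics.mean(lengths)
sample_sd = statistics.stdev(lengths)       # divides by n - 1
population_sd = statistics.pstdev(lengths)  # divides by n

print(mean_length, sample_sd, round(population_sd, 2))
```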

For Virginia, the mean sentiment analysis score is -0.4 with a standard deviation of 1.76. However, many of the tweets score 0.0, as they include only hashtags or text that carries little sentiment.

Histogram of sentiment score for Virginia tweets

In the case of Maryland, the mean sentiment analysis score is 0.02, with a bigger standard deviation of 3. Here, too, many of the tweets score 0. However, the histogram shows a distribution closer to normal, with fewer negative scores.

Histogram of sentiment score for Maryland tweets
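
Histograms like the two above, together with the share of tweets scoring exactly zero, could be tabulated along these lines. This is only a sketch: the scores are invented, and the helper names `score_histogram` and `zero_share` are hypothetical.

```python
from collections import Counter


def score_histogram(scores):
    """Count how many tweets fall on each rounded sentiment score."""
    return Counter(round(score) for score in scores)


def zero_share(scores):
    """Fraction of tweets whose score is exactly zero."""
    return sum(1 for score in scores if score == 0) / len(scores)


# Invented scores, not the collected data.
scores = [0.0, 0.0, -2.0, 3.0, 0.0, 1.0]
hist = score_histogram(scores)
for value in sorted(hist):
    print(f"{value:>3} {'#' * hist[value]}")
print("share of zero scores:", zero_share(scores))
```

Reporting the zero share next to the mean makes it easier to see how much of the average is driven by hashtag-only tweets.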

In general, from the limited results of this analysis, it seems that people in Virginia tweet less about this topic, which could also be because they tweet less in general, and that the confirmation of Kavanaugh stayed on Twitter for longer in Maryland.

We could also argue for a higher enthusiasm on the topic in Virginia, derived from the length of the tweets. However, the sentiment analysis score is more negative in Virginia. It is not clear whether that shows more opposition to the confirmation of Kavanaugh or negativity about the Kavanaugh issue as a whole.
