How Negative or Positive Are the Words We Consume in the Daily News?

With recent events in Syria, the news seems to be a bit more negative lately. Exactly how negative or positive is the news surrounding this particular event?

Alyssa Branch
5 min read · Oct 16, 2019

The Trump Administration made some quick decisions recently regarding the U.S. military's position in Syria. This chain of events had many repercussions and spread quickly all over the news. I am a strong believer in the influence of news media on our daily lives, so finding a balance between keeping myself educated on current events and remaining positive and happy presents some challenges. Therefore, I believe the sentiment of a particular piece can greatly impact moods and opinions.

As this story exploded in the U.S. news media, even the titles and quick descriptions were packed with persuasion and sometimes bias, though not necessarily negative bias. For example, the New York Times published an article titled "Turkey Launched Offensive Against U.S.-Backed Syrian Militia." This draws in American readers with words like "Offensive" and "U.S.-Backed." The title and story carry some negativity at first glance but also read as rather neutral. So I wanted to know exactly how negative or positive the news surrounding this particular event is, and how it changed over the few days following the initial decision.

To begin, I want to state that programming being so open ended is great because many people can reach the same conclusion in numerous different ways. My knowledge of Python is still a bit limited, so some methods used in this process may not be the most efficient, and one method is not my own.

When I started looking at this data, I wanted to use an API to gauge the negativity or positivity of each title and the short description given. However, I later decided to use the API to extract the URLs and then pull the HTML off of each site. I began with News API and GNews API; unfortunately, I could never figure out why GNews was returning only 10 articles with no option for pagination. So I stuck with NewsAPI, as its pagination and other documentation was much easier to understand, and its request limit was 500, well over what I needed.

I recalled a process one of my professors always encouraged using: sentiment analysis. I really like it because it can tell you the overall feeling of a word, a sentence, or, in this case, a news article. I used a tutorial blog post to learn how to perform sentiment analysis using a few Python libraries.

For this, however, I only used TextBlob, as the other two are for extracting URLs and HTML, which I planned to do through the API.
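As a quick illustration of what TextBlob's sentiment analysis returns, here is a minimal sketch using the NYT headline from earlier as sample input:

```python
from textblob import TextBlob

# Polarity runs from -1.0 (most negative) to +1.0 (most positive).
headline = TextBlob("Turkey Launched Offensive Against U.S.-Backed Syrian Militia")
print(headline.sentiment.polarity)
```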

So from there I started to write my chunk of code, which was originally designed to be a function, but as it went on the necessity of that seemed less important. To start, I created some empty dictionaries. I then accessed NewsAPI with 'requests.get' and used some parameters in the URL: "domains=nytimes.com&q=Syria&from=2019-10-07&to=2019-10-15&sortBy=publishedAt&pageSize=51". This filters the results to come only from the New York Times, to contain the keyword 'Syria', and to have been published between October 7, 2019 and October 15, 2019; it also sorts them by publication date and sets the number of results to be returned.
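Assuming NewsAPI's 'everything' endpoint, that request might look something like this sketch (YOUR_API_KEY is a placeholder):

```python
import requests

# Build the query string described above; the key is a placeholder.
url = ("https://newsapi.org/v2/everything?"
       "domains=nytimes.com&q=Syria"
       "&from=2019-10-07&to=2019-10-15"
       "&sortBy=publishedAt&pageSize=51"
       "&apiKey=YOUR_API_KEY")
response = requests.get(url)
articles = response.json()["articles"]
```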

I ran into some problems in the early stages of my coding with the API, as it limits the number of requests you can make in a 12-hour period. I kept hitting this limit while testing the pagination, so I stuck with a lower number: 50. This only reached back to October 13th, although my goal was the 7th, when the decision was first announced. However, after experimenting with the dates, I realized that the most recent 50 articles actually captured the sentiment of most of the New York Times articles pertaining to this topic, even in the days prior.

I then grabbed the URL and the publication date/timestamp from the API and stored them in a dictionary for easy access later on. I used requests once more to get the page at each URL and BeautifulSoup to parse the HTML. I searched through the tags and found the one that wrapped the article text in NYT articles. I stripped the remaining tags and whitespace using '.lstrip' and '.replace', though I'm sure there's a better way to do this. Finally, I ran sentiment analysis to build my dictionary of publication dates and the corresponding sentiment for each article extracted.

(Screenshot in the original post: the code described in the paragraphs above.)
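Since the screenshot doesn't carry over, here is a minimal sketch of that pipeline, continuing from the 'articles' list returned above. Grabbing every 'p' tag is my assumption; the post found the specific tag wrapping NYT article text by inspecting the page:

```python
import requests
from bs4 import BeautifulSoup
from textblob import TextBlob

sentiments = {}
for article in articles:
    # Pull the full page behind each URL returned by the API.
    page = requests.get(article["url"])
    soup = BeautifulSoup(page.text, "html.parser")
    # Assumption: join every paragraph tag's text into one string.
    text = " ".join(p.get_text() for p in soup.find_all("p"))
    text = text.lstrip().replace("\n", " ")  # crude cleanup, as in the post
    # Map each publication timestamp to its article's polarity score.
    sentiments[article["publishedAt"]] = TextBlob(text).sentiment.polarity
```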

From there, I wanted to visualize my data to truly find out what the overall sentiment of these articles is and how it changed over those few days. I converted my dictionary into a DataFrame, reindexed the columns so the date became its own column instead of the row names, and then sorted the DataFrame by the date/timestamp.

(Screenshot in the original post: dictionary to a DataFrame, as described above.)
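A rough sketch of that conversion; the column names 'Date' and 'Sentiment' are my guesses, not the post's:

```python
import pandas as pd

# Build a DataFrame from the {timestamp: polarity} dictionary.
df = pd.DataFrame.from_dict(sentiments, orient="index", columns=["Sentiment"])
df = df.reset_index().rename(columns={"index": "Date"})  # date as its own column
df = df.sort_values("Date")
```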

I used factorplots and catplots in Python to explore my data a little further and decide how best to visualize data like this. I then moved to Tableau to finalize the data in a more presentable, easy-to-read visual.
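For the exploratory plots, something like this seaborn call would do it (factorplot was renamed catplot in seaborn 0.9, which is likely why both names appear):

```python
import seaborn as sns
import matplotlib.pyplot as plt

# Scatter each article's sentiment score against its publication date.
sns.catplot(x="Date", y="Sentiment", data=df)
plt.xticks(rotation=45)
plt.show()
```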

To my surprise, the data has remained rather neutral over the past few days. Although the plots seem to jump around a bit, they actually aren't saying much in terms of range. The scores mostly hover just above 0, with a good number equating to exactly 0 and just a small few dipping below into the negatives. Given that 1 is the most positive rating and -1 the most negative, a range of about -0.05 to 0.18 isn't really positive or negative at all. So overall, I'd say the articles from the past few days surrounding the conflict in Syria were sentimentally neutral with a slight lean toward positive.

In order to further this work, I would first clean up the data as much as possible. I know the text contained at least one 'a' tag in the article, but it was such a small portion of the text that I decided to leave it for this purpose. I could also remove relatively neutral, unimportant words like 'the', 'a', 'and', 'this/that', and 'it/it's', leaving only the weighted words for the sentiment analysis. From there I would clean up my code drastically, maybe learning new methods and functions to perform the same tasks more efficiently.
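One hedged way to do that filtering would be NLTK's English stopword list, though any word list would work:

```python
from nltk.corpus import stopwords  # requires a one-time nltk.download("stopwords")

# Drop filler words so only the weighted words remain for analysis.
stop_words = set(stopwords.words("english"))
text = "The offensive against the militia was a quick decision"
filtered = " ".join(w for w in text.split() if w.lower() not in stop_words)
print(filtered)  # offensive militia quick decision
```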
