Data Viz 2017 - What was the news reporting (and did people actually care?)

A lot happened in 2017. Political challenges, natural disasters, power shifts in the Kardashian clan. If you take a moment to reflect on those 12 months, it’s pretty overwhelming to even remember everything that happened. So, I decided to go back and find a way to catalogue it with some data visualization.

I didn’t want to just summarize the year by going back and looking at news headlines. They’re useful, but they only give half the picture. I wanted to see how the news compared with what the public was actually interested in.

I started the project by scraping Google Trends week by week all the way through 2017, logging the top 10 items of interest from published news and from general search. I had to scrub out a good number of items to get rid of generic search terms like “weather”.
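The scrubbing step can be sketched roughly like this in Python. It is only an illustration: the row format, the function name `scrub_weekly_topics`, and the blocklist contents are my own assumptions here ("weather" is the one term named above), and it assumes the rows for each week are already ordered by rank.

```python
from collections import defaultdict

# Hypothetical blocklist: "weather" is the only term the post names.
GENERIC_TERMS = {"weather"}


def scrub_weekly_topics(rows, blocklist=GENERIC_TERMS, top_n=10):
    """Group (week, term) rows by week, drop generic terms, keep top_n per week.

    Assumes rows arrive already ordered by rank within each week.
    """
    by_week = defaultdict(list)
    for week, term in rows:
        cleaned = term.strip()
        # Compare case-insensitively so "Weather" and "weather" both get dropped.
        if cleaned.lower() in blocklist:
            continue
        # Once a week has top_n surviving terms, ignore the rest.
        if len(by_week[week]) < top_n:
            by_week[week].append(cleaned)
    return dict(by_week)
```

Feeding it a list of `(week, term)` pairs gives back a dictionary keyed by week, which maps neatly onto one spreadsheet row per week.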

In the end, I had two spreadsheets with over a thousand points of data.

Portion of the Reported News Topics
Portion of the Public Interest Topics

Before it was even processed, the data revealed some really interesting results. For one, there wasn’t a single week where Trump wasn’t a top 10 item on both of the lists (and the Kardashians on at least one of them). For another, I noticed that Hillary Clinton was a top news item throughout almost the entire year, even though she wasn’t really doing anything newsworthy for most of it. When I looked deeper, it turned out that she showed up so often because her name had essentially become weaponized by far-right media, which published stories about her almost constantly.

But the data wouldn’t be very useful unless it could be processed and presented in a digestible format. So I built a quick site and programmed it to pull the content for a randomly chosen week from both lists and show a side-by-side comparison.
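The logic behind that comparison is simple enough to sketch in a few lines. This is a Python approximation, not the code that runs on the site; the function name `pick_week_comparison` and the dictionary layout (week mapped to a ranked list of topics) are assumptions made for the example.

```python
import random
from itertools import zip_longest


def pick_week_comparison(public, news, rng=random):
    """Pick one week present in both lists and pair up the ranked topics.

    `public` and `news` are assumed to map a week label to a ranked list
    of topics. Returns (week, rows), where each row is a
    (public_topic, news_topic) pair for one rank.
    """
    shared_weeks = sorted(set(public) & set(news))
    if not shared_weeks:
        raise ValueError("no week appears in both lists")
    week = rng.choice(shared_weeks)
    # zip_longest pads the shorter list so a scrubbed week still lines up.
    rows = list(zip_longest(public[week], news[week], fillvalue=""))
    return week, rows
```

Passing a seeded `random.Random` instance as `rng` makes the pick repeatable, which is handy when checking the layout against a known week.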

As a final piece of data visualization, I thought it would be interesting to see how all topics from the full year compared. I went with a method that’s a little cliché but still really effective, and made a word cloud for each of the two spreadsheets. Like the comparison above, the one on the left is the public zeitgeist and the one on the right is mainstream news.
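A word cloud boils down to counting how often each topic appears and sizing it accordingly. Here is a minimal Python sketch of that counting step, assuming the same week-to-topics dictionary as before; the function name `topic_weights` and the linear font-size scaling are illustrative choices, not necessarily what the word cloud tool I used does internally.

```python
from collections import Counter


def topic_weights(weekly_topics, min_size=12, max_size=72):
    """Count how many weeks each topic appears in and map counts to font sizes.

    Topics are lowercased so "Trump" and "trump" count as one entry.
    Sizes scale linearly with count, with the most frequent topic at max_size.
    """
    counts = Counter(
        topic.lower() for topics in weekly_topics.values() for topic in topics
    )
    if not counts:
        return {}
    top = max(counts.values())
    return {
        term: min_size + (max_size - min_size) * n / top
        for term, n in counts.items()
    }
```

A topic that shows up every week lands at the largest size, while one-off topics shrink toward the smallest, which is what makes the dominant names jump out.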

At the end of this project, what I found most interesting was how much I hadn’t experienced during the year. There were whole areas of news and public interest I hadn’t been exposed to or even been aware of. I think part of that comes down to how information is presented to us through technology, funneled by algorithms that have learned our preferences. But seeing that displayed so clearly really drove home how important information (and transparency about that information) is to how we see the world. It makes sense that it’s one of the central areas of focus in civic tech; people need access to full information for democracy to work effectively.