Trump Reddit Comments Word Cloud

The presidential election and Donald Trump are hot topics of discussion in the US right now, so wouldn’t it be interesting to visualize which words people associate with ‘trump’?

What do we need to create this visualization?

  1. A large, publicly available data set of comments about Trump
  2. A database with a SQL interface that can handle a large volume of data
  3. A data visualization tool to create the viz and share it publicly

The two large, publicly available data sets I considered for this analysis are Twitter and Reddit comments. For this visualization, I decided to use Reddit comments because they can be queried with SQL and the data is readily available in BigQuery. BigQuery is Google’s cloud-based, massively parallel database with a SQL interface and response times measured in seconds. A Reddit user who goes by fhoffa has made a word-association SQL query available in the /r/bigquery subreddit. The Reddit comments data is from January 2016. The query returns the list of words that “trump” is associated with, compared to a baseline (in this case, the words “common” and “but”). The next task was to modify the query for ‘trump’-specific analysis, export the results as CSV, and start the Tableau magic.
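To make the idea of "association compared to a baseline" more concrete, here is a minimal Python sketch of the same kind of calculation on a handful of made-up comments. This is not fhoffa's SQL and not the query used for the dashboard: the function names, the sample comments, and the add-one smoothing are all assumptions chosen for illustration. It counts the words that appear in comments mentioning the target word, compares each word's share against its share in comments mentioning a baseline word, and writes a word/score CSV of the kind a word cloud tool could import.

```python
import csv
import io
import re
from collections import Counter


def cooccurring_words(comments, keyword):
    """Count distinct words per comment, over comments that contain `keyword`."""
    counts = Counter()
    for text in comments:
        words = set(re.findall(r"[a-z']+", text.lower()))
        if keyword in words:
            counts.update(words - {keyword})
    return counts


def association_scores(comments, target, baselines):
    """Score each word by its share near `target` relative to its share near the baselines.

    Add-one smoothing (an assumption of this sketch) avoids division by zero
    for words that never appear alongside a baseline word.
    """
    target_counts = cooccurring_words(comments, target)
    baseline_counts = Counter()
    for word in baselines:
        baseline_counts.update(cooccurring_words(comments, word))

    target_total = sum(target_counts.values()) or 1
    baseline_total = sum(baseline_counts.values())

    scores = {}
    for word, n in target_counts.items():
        target_share = n / target_total
        baseline_share = (baseline_counts[word] + 1) / (baseline_total + 1)
        scores[word] = target_share / baseline_share
    return scores


def scores_to_csv(scores):
    """Return a CSV string with a header row, highest score first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["word", "score"])
    for word, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([word, f"{score:.3f}"])
    return buffer.getvalue()


# Made-up sample comments, for illustration only.
sample_comments = [
    "Trump held a rally today",
    "The rally for Trump was huge",
    "But the weather was nice today",
    "But I had a nice lunch",
]

sample_scores = association_scores(sample_comments, "trump", ["but"])
sample_csv = scores_to_csv(sample_scores)
print(sample_csv)
```

Words that show up mostly alongside the target (here, "rally") score higher than everyday words that also show up alongside the baseline (here, "today"), which is what keeps filler words from dominating the cloud.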

The steps required to create a word cloud in Tableau are demonstrated in the attached animated GIF, and the interactive Tableau dashboard and SQL code are available at Vizually Labs.

I will let readers look at the word cloud and decide for themselves what people are saying about ‘trump’ :-)

