Sentiment analysis with NLTK/VADER — Comments on Lee Hsien Loong’s Facebook post

Too lazy to read through 2,000 comments on Facebook? You’ve come to the right place. Here I’ll use NLTK’s VADER (a Python module) to sift through those comments and see what the hive mind thinks. Warning: it’s surprisingly cheery.

Spoiler: the results are positive.


Having grown up here, I have an intrinsic interest in understanding this. I am also extremely lazy: who has the time to read 2,000 comments on Facebook?! And I wanted to see whether running the analysis with NLTK, one month later, would yield different results.

The post in question:



  • simple natural language processing with NLTK and VADER to classify comments as positive/negative/neutral

Part 1: Scraping

Your access token is in another castle!

Code for scraping

import requests

graph_api_version = 'v2.9'
# paste your access token below
access_token = ' '
# LHL's Facebook user id
user_id = '125845680811480'
# the id of LHL's response post
post_id = '1505690826160285'
# the Graph API endpoint for comments on LHL's post
url = 'https://graph.facebook.com/{}/{}_{}/comments'.format(
    graph_api_version, user_id, post_id)

comments = []
r = requests.get(url, params={'access_token': access_token})
while True:
    data = r.json()
    # catch errors returned by the Graph API
    if 'error' in data:
        raise Exception(data['error']['message'])
    # append the text of each comment to the comments list
    for comment in data['data']:
        # remove line breaks in each comment
        text = comment['message'].replace('\n', ' ')
        comments.append(text)
    print('got {} comments'.format(len(data['data'])))
    # check if there are more pages of comments; stop when there are none
    if 'paging' in data and 'next' in data['paging']:
        r = requests.get(data['paging']['next'])
    else:
        break

# save the comments to a file
with open('comments.txt', 'w', encoding='utf-8') as f:
    for comment in comments:
        f.write(comment + '\n')

This gives me a text file with one comment on each row.

First comment … …

Comparing the scraped comments with the Facebook page, the garbled symbols in the text file are usually either comments in Mandarin or emoji.
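Since VADER’s lexicon is English-only, it can be handy to know how many of those comments are mostly non-ASCII before scoring them. Here’s a minimal sketch: the 0.5 threshold is an arbitrary choice of mine, and the sample comments are invented for illustration.

```python
def mostly_non_ascii(text, threshold=0.5):
    """Flag strings where more than `threshold` of the characters fall
    outside ASCII, which usually means Mandarin text or runs of emoji."""
    if not text:
        return False
    non_ascii = sum(1 for ch in text if ord(ch) > 127)
    return non_ascii / len(text) > threshold

# invented sample comments for illustration
samples = [
    'Thank you PM Lee!',   # plain English
    '谢谢李总理！',          # Mandarin ("Thank you, PM Lee!")
    '💪💪💪',               # pure emoji
]
flags = [mostly_non_ascii(s) for s in samples]
print(flags)  # → [False, True, True]
```

Running this over the whole `comments.txt` would give a quick count of how many comments VADER is likely to score as neutral simply because it cannot read them.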

Part 2: Quick & Dirty Sentiment Analysis

Testing took quite a while, thanks to VADER’s missing lexicon

Final code for sentiment analysis

import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# read one comment per line from the scraped file
messages = [line.rstrip() for line in open("filepath goes here")]

# this step will return an error if you have not downloaded the lexicon,
# e.g. via nltk.download('vader_lexicon') or nltk.download_shell()
sid = SentimentIntensityAnalyzer()

summary = {"positive": 0, "neutral": 0, "negative": 0}
for x in messages:
    ss = sid.polarity_scores(x)
    if ss["compound"] == 0.0:
        summary["neutral"] += 1
    elif ss["compound"] > 0.0:
        summary["positive"] += 1
    else:
        summary["negative"] += 1

You should get:

{'positive': 1206, 'neutral': 601, 'negative': 270}
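To turn those raw counts into the percentages quoted below, a few extra lines suffice (the `summary` dict here is just the output above, hard-coded):

```python
# counts from the VADER run above
summary = {'positive': 1206, 'neutral': 601, 'negative': 270}

total = sum(summary.values())  # 2077 comments in all
percentages = {k: round(100 * v / total, 1) for k, v in summary.items()}
print(percentages)  # → {'positive': 58.1, 'neutral': 28.9, 'negative': 13.0}
```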
Pie chart generated with Excel in approx 5 seconds… Matplotlib another day
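For anyone who would rather not reach for Excel, a rough matplotlib equivalent might look like this (assuming matplotlib is installed; the labels, start angle, and filename are my own choices):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

# counts from the VADER run above
summary = {'positive': 1206, 'neutral': 601, 'negative': 270}

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    list(summary.values()),
    labels=list(summary.keys()),
    autopct='%1.0f%%',   # print each slice's share as a whole percentage
    startangle=90)
ax.set_title("Sentiment of comments on LHL's post")
fig.savefig('sentiment_pie.png')
```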

Using this method, with very few lines of code and at absolutely no cost, I was able to analyse a similar volume of comments.

However, the results were quite different. Instead of 68% positive, VADER found only 58% of comments positive; and instead of 18% negative, VADER was surprisingly upbeat, finding only 13% of comments negative.

And we are dun dun done.

Odds & Ends

Could the difference between my results and Jiayu’s be due to timing, with the later comments trending incrementally more positive and neutral, or is it due to VADER being even sunnier than Google? (I am totally loving the pun. Can you tell?)


Written by

Nerd runner
