Sentiment Analysis From Scratch With Logistic Regression
Years ago, it was impossible for machines to make text translation, text summarization, speech recognition, etc. An application of question answering system or chatbot would be like magic and hard to implement before the rise of what we call machine learning and especially natural language processing (NLP) which considered as a subfield of machine learning that deals with language and aims to push machines to understand and interpret languages in a human level of understanding. One of the hottest applications of NLP is sentiment analysis that allows us to classify a text, tweet or comment either positive, neutral or negative. For example, to evaluate people’s satisfaction about a specific product, we apply sentiment analysis on reviews and calculate the percent of positive and negative reviews.
In this tutorial we’d do something like that building a sentiment classifier from scratch based on logistic regression, and we’ll train it on a corpus of tweets, thus we’ll cover :
Text processing
Features extraction
Sentiment classifier
Training & evaluating the sentiment classifier
Text processing
First, we’ll use Natural Language Toolkit (NLTK), it’s an open source python library, it has a…