Sentiment Analysis From Scratch With Logistic Regression

Omar Boufeloussen
6 min read · Jul 25, 2020

Years ago, tasks such as text translation, text summarization, and speech recognition were out of reach for machines. A question answering system or a chatbot would have seemed like magic, and would have been hard to implement, before the rise of machine learning and especially natural language processing (NLP). NLP is considered a subfield of machine learning that deals with language and aims to get machines to understand and interpret it at a human level. One of the hottest applications of NLP is sentiment analysis, which lets us classify a text, tweet, or comment as positive, neutral, or negative. For example, to evaluate people’s satisfaction with a specific product, we apply sentiment analysis to its reviews and calculate the percentage of positive and negative reviews.
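To make the review example concrete, here is a minimal sketch of the percentage calculation. It assumes a classifier has already labeled each review; the `predicted` list and the `sentiment_percentages` helper are hypothetical names used only for illustration.

```python
from collections import Counter

def sentiment_percentages(labels):
    """Return the share of each sentiment label as a percentage."""
    if not labels:
        return {}
    counts = Counter(labels)
    total = len(labels)
    return {label: 100.0 * n / total for label, n in counts.items()}

# Hypothetical classifier output for five product reviews.
predicted = ["positive", "negative", "positive", "neutral", "positive"]
shares = sentiment_percentages(predicted)
print(shares)
```

With real data, `predicted` would come from the classifier rather than being written by hand.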

https://www.kdnuggets.com/2018/03/5-things-sentiment-analysis-classification.html

In this tutorial we’ll do something similar: build a sentiment classifier from scratch based on logistic regression and train it on a corpus of tweets. We’ll cover:

Text processing

Feature extraction

Sentiment classifier

Training & evaluating the sentiment classifier
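Before going through each step, the four steps listed above can be sketched end to end in plain Python. This is a toy sketch under stated assumptions, not necessarily the implementation this tutorial builds: the regex tokenizer stands in for real text processing, the three-value feature vector (bias, summed positive word counts, summed negative word counts) is one common choice assumed here, and the four training sentences, the function names, and the hyperparameters are all made up for illustration.

```python
import math
import re

def process(text):
    # Lowercase and keep alphabetic tokens only (stand-in for real text processing).
    return re.findall(r"[a-z']+", text.lower())

def build_freqs(texts, labels):
    # Count how often each (word, label) pair occurs in the training corpus.
    freqs = {}
    for text, y in zip(texts, labels):
        for word in process(text):
            freqs[(word, y)] = freqs.get((word, y), 0) + 1
    return freqs

def extract_features(text, freqs):
    # Assumed feature vector: [bias, summed positive counts, summed negative counts].
    words = process(text)
    pos = sum(freqs.get((w, 1), 0) for w in words)
    neg = sum(freqs.get((w, 0), 0) for w in words)
    return [1.0, float(pos), float(neg)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, theta):
    # Probability that the example is positive under logistic regression.
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def train(X, y, lr=0.01, epochs=500):
    # Batch gradient descent on the average log loss.
    theta = [0.0] * len(X[0])
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            err = predict_proba(xi, theta) - yi
            for j, xij in enumerate(xi):
                grad[j] += err * xij
        theta = [t - lr * g / m for t, g in zip(theta, grad)]
    return theta

# Hypothetical toy corpus: 1 = positive, 0 = negative.
texts = ["I love this phone", "great product, so happy",
         "I hate this phone", "terrible product, so sad"]
labels = [1, 1, 0, 0]

freqs = build_freqs(texts, labels)
X = [extract_features(t, freqs) for t in texts]
theta = train(X, labels)

# Evaluate on the training set and on two unseen sentences.
accuracy = sum((predict_proba(x, theta) > 0.5) == bool(y)
               for x, y in zip(X, labels)) / len(labels)
p_pos = predict_proba(extract_features("love this great product", freqs), theta)
p_neg = predict_proba(extract_features("hate this terrible phone", freqs), theta)
print(accuracy, p_pos, p_neg)
```

A real run would replace the toy corpus with a labeled tweet dataset and measure accuracy on a held-out test set rather than on the training data.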

Text processing

First, we’ll use the Natural Language Toolkit (NLTK), an open-source Python library that has a…
