Sentiment Analysis in 3 Minutes, Intro to Natural Language Processing.

Harsh Patel
Backyard Programmers
3 min read · Aug 14, 2020

Perl was designed to work more like a natural language. It’s a little more complicated but there are more shortcuts, and once you learned the language, it’s more expressive.

When I first heard the term Natural Language Processing, my reaction was “Wait, what... how can machines possibly understand and interpret what we say?” Then, after a few Google searches, I learned that it comes down to a lot of text data and maths.

Natural Language Processing can be broken down into these kinds of analysis:

  • Morphological and Lexical Analysis.
  • Syntactic Analysis.
  • Semantic Analysis.
  • Sentiment Analysis.
  • Pragmatic Analysis.

Today we are going to talk about the easiest one: “Sentiment Analysis”.

Sentiment Analysis is a technique for inferring a person's emotions from the text they wrote.

E.g.:

“This place is not good” has a negative sentiment.

“This place is good” has a positive sentiment.

Predicting these sentiments is called Sentiment Analysis. The labels can also go beyond positive and negative, for example happy or not happy, or happy, angry, and surprised.

Coming to the practical part:

  1. There are several steps in data preprocessing for Natural Language Processing:

→ First, collect the data.

→ Remove the punctuation from the text. This is not always a good idea, because punctuation can carry emotion too, so it depends on your data and what it represents.

→ Remove stop words such as “and” and “or” from the text, since they don't carry any sentiment.

→ Stem the data. As the definition goes, “Stemmers remove morphological affixes from words, leaving only the word stem.”

E.g.: Working, Worked, Work → (stemming) → Work

Here you can see that all three words carry the same meaning, so keeping every form in our vocabulary would only make the model slower. Instead, we can replace each word with its stem.
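The preprocessing steps above can be sketched in plain Python. This is only a minimal illustration: the stop-word list is a tiny made-up sample, and `toy_stem` is a hypothetical suffix stripper standing in for a real stemmer from an NLP library. Negations like “not” are deliberately left out of the stop-word list, since dropping them would flip the meaning of “This place is not good”.

```python
import string

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "or", "the", "this", "is", "was", "to", "of", "i"}


def toy_stem(word):
    """Strip a few common suffixes. A toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        # Only strip if at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text, keep_punctuation=False):
    """Lowercase, optionally drop punctuation, remove stop words, then stem."""
    text = text.lower()
    if not keep_punctuation:
        text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    return [toy_stem(token) for token in tokens if token not in STOP_WORDS]


print(preprocess("I worked and was working, to work!"))
# ['work', 'work', 'work']
```

The `keep_punctuation` flag is there because, as noted above, stripping punctuation is a choice that depends on your data.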

That completes your data preprocessing.

At this point, roughly 60% of your work is done.

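Now you need to create a frequency table of the words in your text. Don't worry, TfidfVectorizer will do the work for you in about one line of code: just initialize it and fit it on your text data. Strictly speaking, it produces TF-IDF weights rather than raw counts, so words that appear in almost every document count for less.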
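Here is a small sketch of that step, assuming scikit-learn is installed. The three sentences are made-up sample data, and the vectorizer is used with its default settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up sample texts, just for illustration.
texts = [
    "this place is good",
    "this place is not good",
    "the food was bad",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(texts)  # sparse matrix: one row per text

vocab = vectorizer.vocabulary_  # maps each word to its column index
print(sorted(vocab))
print(matrix.shape)  # (number of texts, number of distinct words)

# In the second text, "not" is rarer across the corpus than "good",
# so it receives the larger TF-IDF weight.
row = matrix[1].toarray()[0]
print(row[vocab["not"]], row[vocab["good"]])
```

Each row of `matrix` is the numeric representation of one text, which is what a classifier can actually train on.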

This gives you a table of word weights for each text. The vectorizer itself knows nothing about sentiment; it is the classifier that learns which words go with which label. So create a Pipeline with TfidfVectorizer and your preferred classification technique.

Train the model on your training data and check it against the test data... That’s all. :)
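The whole flow can be sketched as follows, assuming scikit-learn is installed. The eight labelled sentences are made-up toy data, and `LogisticRegression` is just one possible choice of classifier, not necessarily the one used in the notebook. With so little data the accuracy number means very little; the point is the shape of the code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Made-up toy dataset (an assumption for illustration only).
texts = [
    "this place is good",
    "the food was great",
    "i love this place",
    "really nice service",
    "this place is not good",
    "the food was bad",
    "i hate this place",
    "really poor service",
]
labels = ["positive"] * 4 + ["negative"] * 4

# Hold out a quarter of the data for testing, keeping both classes represented.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# The pipeline vectorizes the text and feeds the result to the classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
print(model.predict(["this place is good"]))
```

Putting both steps in one pipeline means the vectorizer is fitted only on the training texts, and new text passed to `predict` goes through exactly the same transformation.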

My Jupyter Notebook for this Sentiment Analysis:

Don’t forget to give us your 👏 !

Mobile Application Developer || Data Science Enthusiast 🎈