Sentiment Analysis With Bag of Words
Sentiment analysis is the process of determining whether a piece of text is positive or negative. It allows businesses to identify their customers' sentiments towards products or services through reviews and online feedback.
In other words, sentiment analysis gives you an opportunity to explore the mindset of your customers and study the state of your product or service from your customers' point of view.
This makes sentiment analysis a great tool for:
- Product reviews
- Market research
- Customer service
- Social media monitoring
- Reputation management
Sentiment analysis is a field of natural language processing (NLP).
INTRODUCTION
The dataset we are using contains restaurant reviews, and we are going to use sentiment analysis to find whether a particular review is positive or negative. If the review is negative, we will display the following message:
Thank you <customer first name> for taking the time to write a review. We apologize for the inconvenience caused. We hope we will get a chance to serve you better in the near future.
If the review is positive, we will display the following message:
<Customer first name>, Thank you so much for that awesome review. We look forward to serving you again in the near future.
Steps to be Followed
- Importing the libraries.
- Importing the dataset.
- Cleaning the data.
- Creating the Bag of words.
- Training and classification.
- Confusion matrix.
- Predicting a customer review.
Step1: Importing the Libraries
Step2: Importing the Dataset
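As a minimal sketch of this step, the snippet below reads (review, label) pairs from tab-separated text using only the standard library. The column names `Review` and `Liked`, the 0/1 label encoding, and the two sample rows are assumptions made up for illustration. They are not taken from the actual dataset.

```python
import csv
import io

# Hypothetical stand-in for the dataset file: two tab-separated columns,
# the review text and a 0/1 label (1 = positive). These rows are invented.
SAMPLE_TSV = "Review\tLiked\nThe food was great\t1\nService was very slow\t0\n"

def load_reviews(handle):
    """Read (review, label) pairs from a tab-separated file-like object."""
    # QUOTE_NONE keeps quotation marks inside reviews as ordinary characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row["Review"], int(row["Liked"])) for row in reader]

reviews = load_reviews(io.StringIO(SAMPLE_TSV))
print(reviews)
```

To read a real file instead of the in-memory sample, pass `open(path, newline="", encoding="utf-8")` as the handle.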
Step3: Cleaning the data
The reviews contain some information that does not help in determining whether a review is positive or negative.
For example, words like a, an, the, was, and on have no impact on the decision. These words are called stopwords. We also have inflected words like loved, stopped, and loving, which we have to convert to their root form (a step called stemming).
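The cleaning step can be sketched roughly as follows. The stopword list here is a tiny hand-picked sample, and `crude_stem` is a deliberately simple stand-in for a proper stemmer (such as the Porter stemmer). Both are assumptions for illustration, not the code used to produce the results in this article.

```python
import re

# A tiny hand-picked stopword list for illustration; real lists are much longer.
STOPWORDS = {"a", "an", "the", "was", "on", "is", "it", "this", "i", "and", "to"}

def crude_stem(word):
    """Strip a few common suffixes. A rough stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "stopp" -> "stop".
            if suffix != "s" and stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    return word

def clean_review(text):
    """Lowercase, keep letters only, drop stopwords, stem what remains."""
    words = re.sub(r"[^a-zA-Z]", " ", text).lower().split()
    return [crude_stem(w) for w in words if w not in STOPWORDS]

print(clean_review("I loved the food, and loving it on the patio!"))
```

With this sketch, loved and loving both reduce to the same token, so the model later treats them as one word.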
Step4: Creating a bag of words
The bag of words is the simplest way to represent text as numbers: each review becomes a vector of word counts over a fixed vocabulary, and the order of the words is ignored.
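To make the idea concrete, here is a minimal sketch that builds a vocabulary from already-cleaned reviews and turns each review into a count vector. The function names and the three sample documents are invented for illustration. Libraries such as scikit-learn offer the same functionality through `CountVectorizer`.

```python
def build_vocabulary(documents):
    """Map each distinct word to a column index, in sorted order."""
    words = sorted({word for doc in documents for word in doc})
    return {word: index for index, word in enumerate(words)}

def vectorize(document, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    vector = [0] * len(vocabulary)
    for word in document:
        if word in vocabulary:  # words outside the vocabulary are ignored
            vector[vocabulary[word]] += 1
    return vector

# Three invented, already-tokenized reviews.
docs = [["food", "good", "good"], ["service", "bad"], ["food", "bad"]]
vocab = build_vocabulary(docs)
matrix = [vectorize(doc, vocab) for doc in docs]

print(vocab)
for row in matrix:
    print(row)
```

Each row of `matrix` is one review and each column is one vocabulary word, which is the numeric input the classifier needs.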
Step5: Training and Classification
We split the dataset into training and testing sets, and then fit a classification model on the training set.
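A train/test split can be sketched with the standard library as shown below. The 20% test fraction and the fixed seed are assumptions chosen for the example. The article does not state which split was used.

```python
import random

def train_test_split(features, labels, test_fraction=0.2, seed=0):
    """Shuffle the row indices once, then cut them into test and train parts."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    n_test = int(round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(rows, idx):
        return [rows[i] for i in idx]

    return (pick(features, train_idx), pick(features, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))

# Invented data: ten one-column rows whose label is the parity of the value.
features = [[i] for i in range(10)]
labels = [i % 2 for i in range(10)]
X_train, X_test, y_train, y_test = train_test_split(features, labels)
print(len(X_train), len(X_test))
```

Shuffling indices rather than the rows themselves keeps every feature vector paired with its own label.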
Classification Algorithms
- Linear classifiers: logistic regression, Naive Bayes classifier.
- Nearest neighbors.
- Support vector machine.
- Decision tree.
- Random forest.
We can use any one of them. I tried them all and, surprisingly, Naive Bayes gave the best accuracy.
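For readers curious about what Naive Bayes does with the count vectors, here is a small from-scratch sketch of the multinomial variant with add-one smoothing. The choice of the multinomial variant is an assumption, since the article does not say which one was used, and the class name and toy data are invented for illustration.

```python
import math

class NaiveBayes:
    """Multinomial Naive Bayes over word-count vectors, with add-one smoothing."""

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            rows = [v for v, y in zip(vectors, labels) if y == c]
            self.log_prior[c] = math.log(len(rows) / len(vectors))
            # Total count of each word in class c, plus one for smoothing.
            counts = [sum(column) + 1 for column in zip(*rows)]
            total = sum(counts)
            self.log_likelihood[c] = [math.log(n / total) for n in counts]
        return self

    def _score(self, vector, c):
        # Log prior plus the log likelihood of every word occurrence.
        return self.log_prior[c] + sum(
            x * w for x, w in zip(vector, self.log_likelihood[c]))

    def predict(self, vectors):
        return [max(self.classes, key=lambda c: self._score(v, c))
                for v in vectors]

# Invented count vectors over the vocabulary [bad, food, good, service].
X = [[0, 1, 2, 0], [1, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
y = [1, 0, 0, 1]
model = NaiveBayes().fit(X, y)
print(model.predict([[0, 0, 1, 0], [1, 0, 0, 0]]))
```

Working in log space avoids multiplying many small probabilities together, and the smoothing keeps a word that never appeared in one class from forcing that class's probability to zero.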
Step6: Confusion matrix
A confusion matrix is a table used to evaluate the performance of a classification model. It counts how many predictions of each class were right and how many were wrong.
The accuracy of our model is 73%, which means 73% of the predictions are correct.
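The relationship between the confusion matrix and accuracy can be sketched as follows. The labels below are invented, so the resulting accuracy is only an illustration and is not the 73% reported for our model.

```python
def confusion_matrix(actual, predicted):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels (rows: actual, columns: predicted)."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    (tn, fp), (fn, tp) = matrix
    return (tn + tp) / (tn + fp + fn + tp)

# Invented labels for eight reviews (1 = positive, 0 = negative).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted)
print(cm)
print(accuracy(cm))
```

The diagonal cells hold the correct predictions, and the off-diagonal cells show whether the model errs more by calling negative reviews positive or the other way round.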
Step7: Final step - Predicting a customer review
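The final step can be sketched like this: vectorize the new review with the same vocabulary that was used for training, classify it, and pick the matching reply. The four-word vocabulary, the `toy_classifier` rule, and the customer names are stand-ins invented so the sketch runs on its own. In practice the trained model and the full cleaning step would take their place.

```python
import re

NEGATIVE_REPLY = ("Thank you {name} for taking the time to write a review. "
                  "We apologize for the inconvenience caused. We hope we will "
                  "get a chance to serve you better in the near future.")
POSITIVE_REPLY = ("{name}, Thank you so much for that awesome review. "
                  "We look forward to serving you again in the near future.")

def vectorize(text, vocabulary):
    """Count vocabulary words in the review, using the training vocabulary."""
    vector = [0] * len(vocabulary)
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in vocabulary:
            vector[vocabulary[word]] += 1
    return vector

def reply_to_review(first_name, review, vocabulary, classify):
    """classify is any callable mapping one count vector to 0 or 1."""
    label = classify(vectorize(review, vocabulary))
    template = POSITIVE_REPLY if label == 1 else NEGATIVE_REPLY
    return template.format(name=first_name)

# Stand-ins: a tiny vocabulary and a keyword-counting rule in place of
# the trained classifier.
vocabulary = {"bad": 0, "slow": 1, "good": 2, "great": 3}

def toy_classifier(vector):
    return 1 if vector[2] + vector[3] > vector[0] + vector[1] else 0

print(reply_to_review("Asha", "The food was great", vocabulary, toy_classifier))
print(reply_to_review("Sam", "Service was slow and bad", vocabulary, toy_classifier))
```

Reusing the training vocabulary matters because the classifier expects every column of the input vector to mean the same word it meant during training.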