Four Reasons Sentiment Analysis is Misinterpreted

Sam Hames
Mar 7, 2018 · 5 min read

Sentiment analysis of text data aims to automatically infer the sentiment or emotion the writer is expressing. In the most common use of sentiment analysis the aim is to determine whether what is written expresses positive, negative, neutral, or perhaps even mixed emotional content by the writer.

For example, reviews of a restaurant could be labelled as follows:

  • Positive: “Fantastic and quick service, it barely took any time to get my food.”
  • Negative: “I hated my burger, there was no seasoning, and I had to wait!”
  • Mixed: “The food was really great but the service was pretty lousy.”
  • Neutral: “The food was alright for the price, a little bland if anything else.”

Sentiment analysis is a powerful tool: used with care, it can give deep insights into the thoughts and feelings of your customers. Knowing which of your customers are happy and which are sad (or indifferent) is a critical insight. Used without care, however, sentiment analysis can throw you off the track of really understanding your customers. In this blog post we will look at some of the pitfalls to be aware of.

1. Expecting perfection from an automated process

No matter what tool or approach is used, the power of sentiment analysis comes from the automation: for very little effort you can obtain structured information across your dataset. But like all automated approaches, sentiment analysis will make mistakes. If you don’t plan for these cases, or aren’t aware of them, you will be in for some surprises.

The complexity of human language means there are lots of areas where sentiment may not be accurately detected:

  • Sarcasm: if you say “That’s a great idea!” sarcastically, that is a very strong negative sentiment.
  • Negation: “I’m certainly not on board with this new store layout”.
  • Connotation: beyond their dictionary definitions, words carry a wealth of emotional meanings. Describing a salesperson as “oily” has strong negative connotations that aren’t inherent in the meaning of the word.
  • Domain-specific language: every domain has its own unique vocabulary, including new words, abbreviations, or different and very specific meanings for more common words.

Google’s Natural Language API includes sentiment analysis, and has a demo page at https://cloud.google.com/natural-language/ where you can try it yourself. You will find it very easy to come up with your own examples where the intent of the sentence and the label are completely opposed. For example, “I did not come across a single unhelpful staff member!” is a positive statement, but will be labelled as negative.
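To see why a sentence like this can trip up an automated tool, here is a deliberately naive sketch of a word-counting sentiment scorer. It is not how Google's API (or any production tool) works; the word lists and the `naive_sentiment` function are invented purely for illustration. The point is only that a method which looks at words in isolation has no way to notice that "not ... unhelpful" is a compliment.

```python
import re

# Hypothetical toy word lists, invented for this illustration only.
POSITIVE = {"great", "fantastic", "helpful", "quick"}
NEGATIVE = {"hated", "lousy", "unhelpful", "bland"}


def naive_sentiment(text):
    """Label text by counting word-list hits, ignoring word order and context."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# A straightforward review is labelled as expected.
print(naive_sentiment("Fantastic and quick service, it barely took any time to get my food."))

# The double negative is read as a complaint because "unhelpful" is the only hit.
print(naive_sentiment("I did not come across a single unhelpful staff member!"))
```

Real tools are far more sophisticated than this, but the same kind of blind spot can survive in subtler forms, which is why checking results against your own data matters.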

The practical impact of these errors is that quantitative results from a sentiment analysis need to be interpreted with care. If one segment of your data is 16% positive and another segment is 18% positive, this may not be a meaningful difference. Conversely, if one segment is 35% positive and another is 3% positive, that could be either a very interesting insight into your data… or an indication that there is a bias or failure in your sentiment analysis tool.
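One rough way to sanity check a difference like this is a standard two-proportion z-test. The sketch below assumes 200 responses per segment, a number chosen only for illustration, and the function name `two_proportion_z` is our own. Note that this only accounts for sampling noise: it says nothing about systematic labelling errors in the sentiment tool itself, so a "significant" result still needs the kind of scrutiny described above.

```python
import math


def two_proportion_z(pos_a, n_a, pos_b, n_b):
    """Return the z statistic for the difference between two proportions."""
    p_a = pos_a / n_a
    p_b = pos_b / n_b
    # Pooled proportion under the assumption that both segments are the same.
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se


# Hypothetical segment sizes: 200 responses in each segment.
z_small = two_proportion_z(32, 200, 36, 200)  # 16% vs 18% positive
z_large = two_proportion_z(70, 200, 6, 200)   # 35% vs 3% positive

# |z| below about 1.96 is consistent with no real difference at the 5% level.
print(f"16% vs 18%: z = {z_small:.2f}")
print(f"35% vs 3%:  z = {z_large:.2f}")
```

With these assumed sizes the first comparison falls well inside the range expected from chance alone, while the second does not. Whether the second reflects your customers or a failure of the tool is something only reading the responses can tell you.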

2. Asking the wrong questions

Many of the uses for sentiment analysis focus on extracting information from sources like social media or news articles where the text we are analysing is not solicited for a specific purpose. If you are soliciting feedback from your customers, you need to keep the framing of the questions you’re asking in mind to really be able to work with your sentiment results.

One example would be asking multiple questions about different aspects of the experience for your customer: it should be no surprise that the responses to “What did you like about the new store layout?” will have a more positive sentiment measurement than a neutral question like “How was your experience with the new store layout?”.

A more subtle problem arises when you ask people to make suggestions or solicit improvements: the measured sentiment from these hypothetical questions needs to be handled with care. For example, if a customer responds to a survey question with “Helpful staff that were there when I needed them”, there is a big difference in interpretation depending on whether the question was “How was your experience in the store?” or “How could your experience in the store be improved?”. For the former question, the response is positive; for the latter, the same words imply that the actual in-store experience was not positive.

3. Skipping quality control

An automated tool gives you the power to measure and quantify, but it doesn’t absolve you of the responsibility to dive deep and understand what is going on in your data. If you’re going to use sentiment analysis you need to confirm that the results of the model are appropriate for your data and critically interpret the results.

There’s no substitute for reading responses in this case. You certainly don’t need to read them all (what would be the point of automation then?) but if you’re going to use sentiment to make decisions you should verify that it works appropriately for your data. At a minimum you should spot check the labels on a subset of results. In particular you want to know: Are the labels accurate? Are they capturing the full complexity of what’s going on with the responses?
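A spot check can be as simple as the sketch below: draw a random sample, have a person label it, and see how often the person and the tool agree. The record layout (`text`, `model_label`, `human_label`) and both function names are assumptions made for this example, and the model labels shown are made up.

```python
import random


def sample_for_review(records, k, seed=0):
    """Pick up to k records at random so a person can check their labels."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(records, min(k, len(records)))


def agreement_rate(reviewed):
    """Fraction of reviewed records where the human and model labels match."""
    if not reviewed:
        raise ValueError("no reviewed records")
    matches = sum(r["model_label"] == r["human_label"] for r in reviewed)
    return matches / len(reviewed)


# The example reviews from the start of this post, with made-up model labels.
records = [
    {"text": "Fantastic and quick service, it barely took any time to get my food.",
     "model_label": "positive", "human_label": "positive"},
    {"text": "I hated my burger, there was no seasoning, and I had to wait!",
     "model_label": "negative", "human_label": "negative"},
    {"text": "The food was really great but the service was pretty lousy.",
     "model_label": "negative", "human_label": "mixed"},
    {"text": "The food was alright for the price, a little bland if anything else.",
     "model_label": "neutral", "human_label": "neutral"},
]

to_review = sample_for_review(records, k=2)
print(f"Reviewing {len(to_review)} of {len(records)} responses")
print(f"Agreement on all labelled records: {agreement_rate(records):.0%}")
```

A single agreement number will not tell you whether the labels capture the full complexity of the responses, so it is worth reading the disagreements themselves and not just counting them.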

If some of the automated labels seem a little off, you will need to be careful in how you interpret any numerical results, and small differences in sentiment should not be considered important. In the worst case, the sentiment model may simply not work on your data. Using sentiment as a metric in this case is going to lead to bad decisions and it would be inappropriate to report the results of any sentiment analysis.

4. Not using all the data you have

If you have a recommendation score, satisfaction score or other rating, you should always be looking at that concrete data first! These are scores directly from your customers, not the output of a model trained on other data that may not completely transfer to your domain.

Sentiment in this case is a useful supplement and can be used to really dive deep into understanding your customers. In particular, looking for discrepancies between sentiment scores and satisfaction scores can give you areas for concrete improvement. But if you’re not going to use the satisfaction scores at all, there is no point wasting your customers’ time by asking them to answer the question in the first place.
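As a minimal sketch of what looking for discrepancies might involve, the code below flags responses where the sentiment label and the customer's own rating point in opposite directions. It assumes a 1 to 5 satisfaction scale and a hypothetical record layout (`text`, `score`, `sentiment`); the thresholds and the sample data are illustrative, not recommendations.

```python
def find_discrepancies(responses, low=2, high=4):
    """Return responses whose sentiment label conflicts with their rating.

    Assumes ratings on a 1-5 scale: `low` and below counts as dissatisfied,
    `high` and above counts as satisfied.
    """
    flagged = []
    for r in responses:
        positive_but_unhappy = r["sentiment"] == "positive" and r["score"] <= low
        negative_but_happy = r["sentiment"] == "negative" and r["score"] >= high
        if positive_but_unhappy or negative_but_happy:
            flagged.append(r)
    return flagged


# Made-up survey responses for illustration.
responses = [
    {"text": "Great staff, shame about the queue.", "score": 1, "sentiment": "positive"},
    {"text": "Loved the new layout.", "score": 5, "sentiment": "positive"},
    {"text": "Nothing to complain about!", "score": 5, "sentiment": "negative"},
    {"text": "It was fine.", "score": 3, "sentiment": "neutral"},
]

for r in find_discrepancies(responses):
    print(f'{r["score"]}/5 but labelled {r["sentiment"]}: {r["text"]}')
```

Each flagged response is either a place where the sentiment tool got it wrong or a customer whose words and rating genuinely disagree. Both are worth reading.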

Takeaway Messages

Identifying sentiment is a hard problem, and the results need to be interpreted with due care and attention to detail. Sentiment is not always an appropriate tool to use for analysing data: just because it’s possible to run sentiment analysis doesn’t mean it’s an appropriate metric. But when used carefully it can give you a powerful additional metric to understand your customers.

In a future post we will look at sentiment in more detail, and in particular how you can make the best use of it from the very beginning of your project.


Kapiche was built for companies who want to empathize with their customers at scale. Our multi-structured analytics software helps companies truly understand what their customers are trying to tell them. Learn more at kapiche.com.

Written by

Sam Hames

Technical Lead at Kapiche. Machine learning (but not AI), text and image analytics, PhD candidate.