Sentiment and text analysis

10 min read · Mar 12, 2016

I will discuss sentiment and text analysis as part of market research. First, I will introduce what it’s all about, then I will talk about software that simplifies the process, and finally I will discuss what the future could bring us.

Sentiment and text analysis

We have a lot of conversations every day, both offline and online. People voice their thoughts on social media and websites and, in doing so, give us useful insights. We can treat those thoughts, opinions and complaints as data that you don’t have to ask for; it’s simply there. The hard part is bringing all of that data together, analysing it and seeing the patterns. You would be a very busy researcher if you had to do this manually, and this is where sentiment and text analysis software comes in handy.

Kristian Bannister from Brandwatch describes sentiment analysis as ‘the process of determining the emotional tone behind a series of words, used to gain an understanding of the attitudes, opinions and emotions expressed within an online mention.’ Wikipedia adds that it identifies and extracts information from source materials. Linked to market research, this means a lot of valuable information: you learn about consumers or potential consumers and can build up profiles. As a company you can monitor what’s happening online, and if somebody expresses a certain sentiment you can act immediately and, if you want, ask the ‘why’ question to get additional information. This creates an environment of constant learning.

Text analysis, also called text mining or text data mining, is the process of deriving high-quality information from text. To give an example: company X brings out a new product and, because of its huge fan base, a lot of fans start writing blog posts with their experiences, reviews and so on. They are your fans and your consumers, and they are giving a lot of feedback. Those comments are valuable, and text analysis can help you find the information you need in them.

Another reason why sentiment and text analysis can be of value is linked to a blog post from Seth Godin. In his post ‘surveys and focus groups’ he points out that people will answer questions when asked, but that this doesn’t always give you good insight into what they are really thinking at that moment. He suggests simply watching people as they perform a task. With sentiment and text analysis you do something similar: you monitor what people are saying about a product without them being asked to do so. This is much stronger, and you’ll probably get the real opinions, not the produced ones. Social media made people comment more freely (although that’s not always a good thing), so it is a good idea to follow what they are saying. We are bombarded with comments (just take a look at the ‘one minute on the internet’ statistics), so software comes in handy. In the next part, I will take a closer look at some problems and then at some tools that give insight into sentiment and text analysis.

Problems facing sentiment and text analysis

Human language is complex, and that’s the reason why sentiment and text analysis isn’t easy. Just think about small nuances or people being sarcastic. Brandwatch gives an example with the following sentence: ‘My flight’s been delayed. Brilliant!’ If you or I analyse this, it’s clear that it isn’t a positive experience, so we immediately see that context is very important for understanding the motives of someone posting this on, for instance, Twitter. Another example is this sentence: ‘This watch is anything but useful.’ Again, ‘useful’ has a positive connotation, but the context changes the meaning completely.
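To make the problem concrete, here is a minimal sketch of a scorer that only counts words and ignores context. The word lists and the function name `naive_score` are invented for this illustration and are not taken from Brandwatch or any other tool.

```python
import re

# Hypothetical mini-lexicons, invented for this illustration.
POSITIVE = {"brilliant", "useful", "great"}
NEGATIVE = {"delayed", "useless", "awful"}

def naive_score(text):
    """Count positive words minus negative words, ignoring all context."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives

# Sarcasm: 'delayed' and 'brilliant' cancel out, so the complaint looks neutral.
print(naive_score("My flight's been delayed. Brilliant!"))

# Negation: 'useful' is counted as praise although the sentence says the opposite.
print(naive_score("This watch is anything but useful."))
```

The first sentence scores 0 and the second scores 1, while a human reader would call both negative. That gap is what the tools discussed further on try to close.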

Talking about Twitter, with only 140 characters you sometimes need to be creative when you want to write about something, using abbreviations and so on. Nowadays it also seems that a lot of people aren’t really concerned about spelling anymore. All of that makes it challenging to let software interpret the meaning (or the hidden meaning) of a text.

Normally a good piece of written text is structured into different parts. It’s possible that you talk about something at the beginning and mention that same topic again later, but now from a completely different angle and with other sentiments attached to it.

Another problem can be the quality and quantity of data. This is explained in a post on ‘datasciencecentral.com’. Sometimes a good sentiment and text analysis isn’t possible because small data sets don’t give meaningful results. You need volume and variety so that the software you’re using can get smarter, a process called machine learning.

Tools giving insight into sentiment and text analysis

Brandwatch Analytics

Brandwatch uses a rules-based process, which makes it possible to take into account how context can affect sentiment. Put simply, they define words that are positive or negative and use rules to decide whether the context affects those words. The Brandwatch software can detect the difference between ‘I want a burrito so bad’ and ‘I just had a burrito. It was so bad.’
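The idea of a lexicon plus context rules can be sketched in a few lines. This is not Brandwatch’s implementation, which isn’t public in the post I’m referring to; the lexicon, the two rules and the function name `rule_based_score` are assumptions made purely for illustration.

```python
import re

# Hypothetical lexicon: word -> default polarity.
LEXICON = {"bad": -1, "useful": 1, "good": 1}

def rule_based_score(text):
    """Sum word polarities after applying two simple context rules."""
    score = 0
    # Score each sentence separately so a rule only sees its own context.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z']+", sentence)
        for i, word in enumerate(words):
            polarity = LEXICON.get(word, 0)
            if polarity == 0:
                continue
            before = words[:i]
            # Rule 1: "want/need ... so bad" expresses desire, not a complaint.
            if word == "bad" and before[-1:] == ["so"] and {"want", "need"} & set(before):
                polarity = 1
            # Rule 2: "anything but X" flips the polarity of X.
            if before[-2:] == ["anything", "but"]:
                polarity = -polarity
            score += polarity
    return score

print(rule_based_score("I want a burrito so bad"))               # positive
print(rule_based_score("I just had a burrito. It was so bad."))  # negative
print(rule_based_score("This watch is anything but useful."))    # negative
```

Every rule has to be written and maintained by hand, which hints at why such rule sets are costly to move from one industry or data type to another.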

Buzzlogix text analysis API

A key component of the text analysis API from Buzzlogix is natural language processing, which helps software understand the language people use when interacting with computers. Specifically, their API can analyse blog posts, websites and newspapers. Sentiment analysis is an important part: it checks whether someone is talking about your product or service in a negative, neutral or positive way. The API also does gender identification (it estimates whether a man or a woman wrote the text) and, as I already mentioned, that makes it useful for consumer profiling.

There are other APIs as well, such as the prediction API from Google. To create a sentiment analysis model, you need to: collect data (and Google says you need a lot!); label your data (for each collected text string you assign a ‘sentiment label’, so whether the expression is categorised as sad, exciting, boring and so on); prepare your data (create a CSV file, to which you can add some underlying patterns like the length of the text or the time of day); upload the data to Google Cloud Storage; train a model with the API; start making predictions; and, as the final step, keep improving your model by adding more examples. I guess it will take quite some time to create a solid, working prediction model.
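The labelling and preparation steps can be sketched without touching any Google service. The example rows, the column order (label first) and the helper name `to_training_csv` are assumptions for illustration; check the API documentation for the exact file layout it expects.

```python
import csv
import io

# Hypothetical labelled examples: (sentiment label, text, hour of day posted).
examples = [
    ("exciting", "Just unboxed the new phone, love it", 9),
    ("boring", "Another update with nothing new", 14),
    ("sad", "My order arrived broken", 21),
]

def to_training_csv(rows):
    """Write one CSV row per example: label, text, text length, hour of day."""
    buffer = io.StringIO()
    # QUOTE_NONNUMERIC quotes the strings, so commas inside a text stay safe.
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
    for label, text, hour in rows:
        writer.writerow([label, text, len(text), hour])
    return buffer.getvalue()

print(to_training_csv(examples))
```

The text length and the hour of day are the kind of ‘underlying patterns’ mentioned above; the resulting file is what you would then upload and train on.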

OdinText

They call themselves the next generation in text analytics. First-generation text analytics tools have a rules-based approach (look at Brandwatch again). The problem is that those rules don’t transfer well between industries, categories and various types of data, and thus require a lot of time and a lot of money. Focusing on text data only isn’t the way forward. Instead, text analytics should be data agnostic, which means that the software is able to work with various types of data, structured (think of databases) or unstructured (text). This is what OdinText is doing.

Their approach is different from others because they use linguistic methods as well as mathematics to analyse a text, and in doing so filter out the noise or irrelevant data. The software doesn’t analyse a text sentence by sentence. It looks at all available data together, numeric data included, which makes the process much faster.

The relation between the variables ‘time’ and ‘analytical payback’ is different for first-generation text analytics tools and for OdinText. For the first generation, it takes a very long time to grow the analytical payback, especially because everyday human language breaks more rules than it follows. First-generation tools focus too much on details in a text and therefore miss connections. As mentioned earlier, OdinText combines its own mathematical approach with a linguistic approach, so the analytical payback grows much faster over time.

And last but not least, OdinText can track not just sentiments, but also emotions such as anger, fear, trust, etc.

Quid

On their website you read: ‘Quid is a platform that searches, analyses and visualizes the world’s collective intelligence to help answer strategic questions.’ It’s quite amazing when you see all of its capabilities: knowledge download, brand perception, competitive intelligence, market landscape and trend analysis.

With the brand perception functionality, you can follow what consumers are saying about your brand, products and competitors. You’re also able to explore conversations surrounding a specific company or market.

Project Oxford, beyond text, analysing sentiments on photos and videos

Microsoft launched Project Oxford. They have a pretty cool website where you can upload a photo and then be told what emotions the person in that photo is showing. The software is based on artificial intelligence and facial recognition.

On ‘hngn.com’ you read that it works by breaking down the face into eight emotional states and attributing a number between 0 and 1 to each, indicating whether you’re showing that specific emotion or not. They also talk about possibilities for market research, like monitoring a group of people watching your commercial or testing a product. We can link this again with the blog post from Seth Godin and the difference between your spontaneous feelings and thoughts while you watch something versus watching a commercial and then being asked what you felt. That last situation makes you start thinking, and the view you give isn’t always the correct one: you ‘betray’ yourself and ultimately the brand as well. Think about a situation where you had to give a score out of 10: first you thought of giving a 4, but after some reconsideration you actually gave a 6. You know this has happened, but when your facial expressions are analysed you can’t hide anymore.
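A short sketch of how such per-emotion scores could be read once you have them. The numbers are made up, the emotion names are listed only as an illustration of eight states, and the helper `dominant_emotion` and its threshold are assumptions, not part of Microsoft’s service.

```python
# Made-up scores for one face: one value between 0 and 1 per emotional state.
scores = {
    "anger": 0.02, "contempt": 0.01, "disgust": 0.01, "fear": 0.00,
    "happiness": 0.81, "neutral": 0.10, "sadness": 0.01, "surprise": 0.04,
}

def dominant_emotion(face_scores, threshold=0.5):
    """Return the strongest emotion, or None when no score reaches the threshold."""
    emotion, value = max(face_scores.items(), key=lambda item: item[1])
    return emotion if value >= threshold else None

print(dominant_emotion(scores))
```

For a group watching a commercial, you could collect one such result per face and per moment, and compare it with the scores people give afterwards.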

People are using a lot of photos and videos (think about Instagram and Vine), and they’re expressing thoughts and feelings by doing so. We’re not only using photos, but also emojis to indicate what we’re really thinking. According to ‘adweek.com’, ‘emojis can account for up to 60 percent of text on Instagram’. And all of that is valuable data.

Future of sentiment analysis

I’ve talked about the problems of sentiment and text analysis, but with machine learning we’re making the software better at understanding people. On Wikipedia you read: ‘Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.’ In other words, you feed the software with data and over time the algorithm gets smarter, so it can make more accurate predictions. I’m sure that the sentiment and text analysis tools available will only become smarter over time, and in the end they will have almost no problems understanding the true meaning of a text online.
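To show what ‘feeding the software with data’ can look like, here is a minimal sketch of a Naive Bayes text classifier written from scratch. It is one simple learning technique chosen for illustration, not the method of any tool named in this post, and the training sentences are invented.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(model, text):
    """Pick the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts = model
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out a label.
            seen = word_counts[label][word]
            score += math.log((seen + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training data; a real model needs far more volume and variety.
training_data = [
    ("love this phone", "positive"),
    ("great battery great screen", "positive"),
    ("terrible battery", "negative"),
    ("hate this screen", "negative"),
]
model = train(training_data)
print(predict(model, "great phone"))
print(predict(model, "terrible screen"))
```

Nobody wrote a rule for ‘great’ or ‘terrible’ here; the model picked them up from the labelled examples, and adding more examples is how it improves.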

Sentiment and text analysis will be introduced as a service. Phil Wolff wrote this on Quora and I believe this will happen. It will be built into apps and other services.

It will become easier to create consumer profiles, and the process will go a lot faster as well. That way, you can make better decisions when you want to create groups to work with and dig a bit deeper into a subject.

Software will gather not only text but also photos, videos, emojis and so on, and analyse the emotions in all of that information in real time.

Sentiment analysis isn’t just about analysing a comment and stating whether it’s a positive or a negative one. I read an interesting article on ‘fastcompany.com’ by Sarah Kessler. In that article she says that social analytics firms have moved to supplement sentiment analysis with other metrics. An example is demographic information about who is posting or influencing conversations. You need to analyse not only ‘what’ somebody said, but also ‘who’ said it. The same word can have a totally different meaning for a 13-year-old child and a 45-year-old adult (Kessler gave ‘killer’ as an example). Companies like Crimson Hexagon are doing this already, and in the future ‘linking’ will be the keyword: linking comments with context, other conversations, profiles, trends, etc.
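Linking ‘what’ with ‘who’ can be sketched as a lexicon that depends on the author. The age boundary, the polarity values and the function names below are assumptions made up for this illustration; they do not describe how Crimson Hexagon works.

```python
# Hypothetical polarity of the same word for two age groups.
POLARITY_BY_AUDIENCE = {
    "teen": {"killer": 1},    # slang for 'excellent'
    "adult": {"killer": -1},  # read literally
}

def audience_for(age):
    """Map an author's age to an audience group (the cut-off is arbitrary)."""
    return "teen" if age < 20 else "adult"

def score_with_author(text, author_age):
    """Score a text with the lexicon that matches the author's age group."""
    lexicon = POLARITY_BY_AUDIENCE[audience_for(author_age)]
    words = (word.strip(".,!?") for word in text.lower().split())
    return sum(lexicon.get(word, 0) for word in words)

print(score_with_author("That concert was killer!", 13))
print(score_with_author("That concert was killer!", 45))
```

The same comment comes out positive for the 13-year-old and negative for the 45-year-old, which is the kind of difference demographic information is meant to catch.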

Conclusion

It’s true that there are challenges and that sentiment and text analysis probably can’t be used by every company or brand (I would add: at this moment). If it is possible though, you can’t ignore this as it adds extra value to your research. You can monitor the topics that people find important, you’re able to strengthen consumer profiling and you just learn.

The web is full of data, but it’s not always easy to visualize it as a whole. I believe software like Quid will become very important and other initiatives will follow. APIs like the ones from Buzzlogix and Google will become more and more scalable, so that even smaller companies will be able to set up a decent sentiment and text analysis tool.

To give an example of why this can be so powerful, even for small things, I will talk about what the City Football Group did. With Crimson Hexagon they tracked the music preferences of the New York City fans (what are your fans saying about music, and what do they like?) and used those insights to choose the music played before the game and at halftime. By tracking those preferences, they learned more about the profile of their fans, and it helped them become more relevant to their consumers. And after the introduction, the process can continue: you monitor people’s sentiments about the choices, learn from their feedback and make changes if necessary (music too loud, not enough of it, poor quality and so on; rest assured, they will comment about that online).

It’s a very interesting market to follow in the future.

Written by Jonas Bogaert