Text, Sentiment and Social Analysis

Our future of marketing

Technology is changing our world. One way in which this has occurred is through the gathering and analysis of large amounts of data. For some, crunching numbers has never been so rewarding. We can now predict large tropical storms before they hit land, estimate the likelihood of a disease outbreak, and even trace how consumer sentiment shifts in response to product releases and market changes. Sentiment analysis and text mining can drastically change your marketing strategy by providing you with detailed information in ways that were once unimaginable.

While sentiment analysis is not particularly new in marketing, the tools and the precision with which we can carry it out have changed considerably. We used to possess only a handful of reviews or consumer opinions about a product, in some cases around 50 to 100. It was possible to read them all and try to draw inferences from what we read. Nowadays we have tens of thousands, and in some cases millions, of consumer opinions. This post describes how we can now trace how consumer opinions and feelings about your product change through space and time.

Text mining applies the principles of data mining to text. It is an automated process that helps us detect and reveal previously undiscovered patterns in textual data. Sentiment analysis, on the other hand, helps us extract the attitudes, moods, and opinions of individuals and groups from text. It is most commonly applied to sentences and to microblogs. Microblogs are short messages such as Facebook posts or tweets (Twitter originally capped tweets at 140 characters). Sentiment analysis can also be applied to an entire document to assess the popularity of a given viewpoint or opinion. When text mining and sentiment analysis are combined to analyze social media behavior, they give us a significant amount of descriptive and predictive power.

Social media is used by billions of people on a daily basis. Since so many people communicate over social networking sites, we now have access to an abundance of potentially relevant information. Facebook and Twitter, among other networks, generate thousands of text-based posts every second of the day. This new ability to analyze consumer opinions in depth should render old marketing strategies outdated. Text mining and sentiment analysis can tell us what consumers think about a specific product, or about a big brand such as Samsung or Microsoft. Crucially, these two tools are not limited to tracing and identifying popular hashtags. We all know that companies use hashtags in their ads to encourage social media engagement. When consumers use hashtags, they contribute to the media conversation around your campaign and brand. However, reading posts that contain a relevant hashtag can only get us so far. Sentiment analysis and text mining can provide information about the feelings, opinions, and emotions that individuals have towards your product, and they can do so without relying on hashtags at all.

The process begins with the text collection and text cleaning stages. Here, social media networks are queried for your desired topic. One common way to do this is through an R package called twitteR, which lets you retrieve Twitter messages using keyword searches. For example, if you are interested in what potential customers are saying about a product that your company sells, you can discover thousands of expressions that users are voicing, such as “I love my new Note 6.” Once you have collected a sufficient amount of text, it goes through the preprocessing stage and then on to sentiment analysis. Sentiment analysis combines machine learning and other data techniques to turn that text into output you can act on, such as a sentiment score or an emotion label for each post.
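To make the collection and cleaning stages a little more concrete, here is a minimal sketch in Python rather than R. It assumes the posts have already been downloaded into a plain list of strings; the function names (filter_by_keyword, clean_post) and the cleaning rules are illustrative assumptions, not part of twitteR or any other library.

```python
import re

# Patterns for the noise we want to strip (illustrative choices, not a standard).
URL_RE = re.compile(r"https?://\S+")      # links
MENTION_RE = re.compile(r"@\w+")          # @username mentions
NOISE_RE = re.compile(r"[^a-z0-9\s']")    # anything except letters, digits, spaces, apostrophes


def filter_by_keyword(posts, keyword):
    """Keep only posts that mention the keyword (case-insensitive).

    Stands in for the keyword search you would normally run against
    a social network's API.
    """
    needle = keyword.lower()
    return [post for post in posts if needle in post.lower()]


def clean_post(text):
    """Lowercase a post and strip links, mentions, and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NOISE_RE.sub(" ", text)   # also drops '#', keeping the hashtag word itself
    return " ".join(text.split())    # collapse repeated whitespace


# Hypothetical sample data for illustration only.
sample_posts = [
    "I love my new Note 6! http://t.co/abc @samsung #Note6",
    "Nice weather today",
]
relevant = filter_by_keyword(sample_posts, "note 6")
cleaned = [clean_post(post) for post in relevant]
```

The cleaned strings are what would be handed on to the preprocessing and sentiment analysis stages.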

Once you have your output, it is useful to plot your data to get a sense of what is going on. For example, a simple bar graph or an X-Y plot can be created according to a classification of emotions. Are consumer responses happy? Sad? Angry? Confused? Typically, a sentiment score is generated from tens of thousands of microblog posts. You can summarize the opinions and feelings consumers have about products by telling the computer to assign a low sentiment score to profane words and phrases, and a high sentiment score to words such as “thrilled,” “lovely,” or “wonderful.”
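The word-scoring idea can be sketched in a few lines of Python. This is a deliberately simple lexicon approach under stated assumptions: the word list and its weights are made up for illustration (with mild negative words standing in for profanity), and real projects typically rely on a much larger, curated lexicon or a trained model.

```python
import re
from collections import Counter

# Hypothetical lexicon: positive words score high, negative words score low.
LEXICON = {
    "thrilled": 2, "lovely": 2, "wonderful": 2, "love": 1,
    "hate": -2, "awful": -2, "broken": -1,
}


def score_post(text, lexicon=LEXICON):
    """Sum the lexicon weights of every known word in the post."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def label(score):
    """Turn a numeric score into a coarse sentiment class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(posts):
    """Count how many posts fall into each sentiment class."""
    return Counter(label(score_post(post)) for post in posts)


# Hypothetical posts for illustration only.
counts = summarize([
    "I am thrilled, what a lovely phone",
    "I hate this broken screen",
    "It arrived today",
])
```

The class counts returned by summarize are exactly the kind of numbers you would feed into a bar graph.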

Another useful aspect of sentiment analysis and text mining is the ability to track trends across specific time frames and geographic locations. This is arguably the most powerful capability that data science has given marketers. After the launch of your product, during an upgrade, or even during a sale or discount period, you can mark a time window of interest. For example, if you want to know what users and consumers say in the two to three hours after your product is launched, you can do so. You can also gauge a longer period, such as 24 or 48 hours, and then compare the short window against the long one. While analyzing sentiment and text in each time window, you can spot spikes in mentions across different social media platforms and then focus on only the negative responses in order to understand their origin.
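Comparing a short window against a long one can be sketched as follows. The example assumes each post has already been reduced to a (timestamp, sentiment score) pair; the function name window_stats, the launch time, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta


def window_stats(posts, launch, hours):
    """Summarize posts made within `hours` after `launch`.

    `posts` is a list of (timestamp, score) pairs.
    """
    end = launch + timedelta(hours=hours)
    scores = [score for ts, score in posts if launch <= ts < end]
    if not scores:
        return {"mentions": 0, "mean_score": None, "negative": 0}
    return {
        "mentions": len(scores),
        "mean_score": sum(scores) / len(scores),
        "negative": sum(1 for s in scores if s < 0),
    }


# Hypothetical launch time and scored posts, for illustration only.
launch = datetime(2024, 1, 1, 12, 0)
posts = [
    (launch + timedelta(hours=1), 2),
    (launch + timedelta(hours=2), -1),
    (launch + timedelta(hours=10), -2),
    (launch + timedelta(hours=30), 1),
]

short_term = window_stats(posts, launch, hours=3)
long_term = window_stats(posts, launch, hours=24)
```

If the long window shows a lower mean score or a jump in negative mentions compared with the short one, that is a cue to read the negative posts from the later hours more closely.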

It might be the case that users with a lot of followers (1,000+) or influential product reviewers left a negative review of your product, and that this set off a diffusion process. Their negative review could have been retweeted or shared by hundreds of people. This could be one origin of negative product sentiment. In some cases, identifying negativity of this sort gives you a qualitatively powerful summary of the negative consumer response. It might also be the case that an influential person's negative review was retweeted many times while many users actually still liked your product, in which case there is less genuine negativity than the raw data suggests. If this turns out to be the case, I recommend that you contact that influential person (possibly by commenting on their blog) and politely ask why they responded negatively to your product.
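One rough way to check whether retweets are inflating the negativity is to compare the share of negative posts with and without retweets, and to find the negative post that was amplified the most. The sketch below assumes each post is a dictionary with hypothetical fields id, score, and retweet_of (the id of the original post, or None for an original); real API data will be shaped differently.

```python
from collections import Counter


def negative_share(posts):
    """Fraction of posts with a negative sentiment score."""
    if not posts:
        return 0.0
    return sum(1 for p in posts if p["score"] < 0) / len(posts)


def originals_only(posts):
    """Drop retweets so that each opinion is counted once."""
    return [p for p in posts if p["retweet_of"] is None]


def most_amplified_negative(posts):
    """Return (post, retweet_count) for the most retweeted negative original, or None."""
    retweet_counts = Counter(
        p["retweet_of"] for p in posts if p["retweet_of"] is not None
    )
    originals = {p["id"]: p for p in originals_only(posts)}
    candidates = [
        (count, post_id)
        for post_id, count in retweet_counts.items()
        if post_id in originals and originals[post_id]["score"] < 0
    ]
    if not candidates:
        return None
    count, post_id = max(candidates)
    return originals[post_id], count


# Hypothetical data: one negative review retweeted three times, two positive originals.
posts = [
    {"id": 1, "score": -2, "retweet_of": None},
    {"id": 2, "score": -2, "retweet_of": 1},
    {"id": 3, "score": -2, "retweet_of": 1},
    {"id": 4, "score": -2, "retweet_of": 1},
    {"id": 5, "score": 1, "retweet_of": None},
    {"id": 6, "score": 2, "retweet_of": None},
]

raw_share = negative_share(posts)                       # retweets included
original_share = negative_share(originals_only(posts))  # retweets excluded
top = most_amplified_negative(posts)
```

In this made-up sample, most of the negative posts are copies of a single review, so the share of negative originals is much lower than the raw share.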


For interesting case studies check out: How Data Science Helps Marketing.