Let’s discuss the basics of sentiment analysis and sentiment automation, because mining opinions offers instant insight into the overall vibe you’re sending to people in unstructured text.
The goal of this post is to keep the explanation of sentiment SIMPLE, to show others the logic being used, and to help avoid confusion about the state-of-the-art natural language processing tools that were released this month.
Finding the right sentiment wiki, knowledge base article, or word scoring data source is like finding the right puzzle piece when they are all the same color.
Wiki likes calling it Opinion Mining.
Wiki is a publicly editable data source that anyone with a computer and internet can edit (keep that in mind when learning on Wiki). Below, check out how Wiki defines it, and yes, it’s going to sound complex.
Which is exactly why you clicked on this article, right?
In this blog, I’m going to cover sentiment analysis, word scoring, and sentiment automation, and offer firsthand experience, use cases, etc…
Here’s what Wiki says about sentiment analysis, opinion mining, natural language processing, text analysis, computational linguistics, and biometrics:
WIKI — Opinion mining (sometimes known as sentiment analysis or emotion AI) refers to the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine.
Sentiment analysis (in my words)
The ability to analyze an infinite amount of content, score it, and discover magic within text.
You can do sentiment analysis the complex way or the simple way.
You can automatically grab text from a website and start scoring each paragraph, or each page on the website. This could be considered complex, but luckily it only takes a handful of lines of code. You can begin web scraping using Python, Requests, and Beautiful Soup.
Learning Web Scraping with Python, Requests, & BeautifulSoup
Did you know learning web scraping w/ Python, Requests, and Beautiful Soup is easy...
This is one method of wrangling because…
The data is all over the place, not clean, and wrangling it takes more than a flick of the wrist.
(Most importantly, start simple.)
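To make the "grab text from a website and score each paragraph" idea concrete, here is a minimal sketch. To keep it runnable without installing anything, it swaps in Python's built-in `html.parser` for Beautiful Soup and works on an HTML string; in practice you would likely fetch the page with `requests.get(url).text` and parse it with Beautiful Soup instead. The tiny `WORD_SCORES` table and the function names are made up for illustration, not a real scoring data source.

```python
from html.parser import HTMLParser

# Hypothetical word scores; a real data source would have thousands of rows.
WORD_SCORES = {"love": 1, "great": 1, "easy": 1, "hate": -1, "broken": -1, "slow": -1}


class ParagraphCollector(HTMLParser):
    """Collects the text found inside each <p> tag."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join the pieces and collapse extra whitespace.
            self.paragraphs.append(" ".join("".join(self._buffer).split()))
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def score_text(text):
    """Average the word scores in a piece of text (unknown words count as 0)."""
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(WORD_SCORES.get(w, 0) for w in words) / len(words)


def score_page(html):
    """Return a list of (paragraph, score) pairs for one HTML page."""
    collector = ParagraphCollector()
    collector.feed(html)
    return [(p, score_text(p)) for p in collector.paragraphs]
```

Calling `score_page(html)` on a page gives you one score per paragraph, which is already enough to sort a site's paragraphs from most negative to most positive.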
Your unstructured emails, unstructured text messages, unstructured YouTube comments, recorded or live voice calls, HTML-riddled web pages, and what I’m typing here…
All of these funky data sources can be given an overall sentiment score.
And that opinion of your brand, from a “comments perspective,” may kick off certain business processes based on the content.
Currently, there are a lot of sentiment data sources online, and they’re not that complex to fit to your needs.
If you find sentiment scoring or sentiment analysis data sources, clean all of it, filter out weird words, add your own, and share with friends!
Being able to edit and change is essential to your solution because no two companies or industries are the same. (more on this below)
If you have your data structured correctly and prepared for a sentiment scoring solution, you have essentially built a method to automate any similar data source. Thus… automation presents itself, once you have built your own word scoring data source… Hard work incoming?
Oh snap, someone said automation… run away.
If you need to build automation, I know the feels.
Automation isn’t scary, and you’re very close to automating everything you do, which gives you more time to learn and solve more complex problems. It’s not hard work.
If you’re like me and want to build lots of sentiment scoring data sources for one single sentiment analysis solution, then you have a lot of flexibility to measure your text in various ways.
Just maybe… sentiment analysis isn’t important.
Maybe you just need a flare gun to go off in your house if someone says the F word on a blog of yours with baby photos. Something like that isn’t sentiment analysis, it’s more of an “oh shit.”
A kind of … Escalation management automation, haha.
It can be handled inside your sentiment analysis, kicking off an “oh snap, someone said X” email to whoever needs it, which is a great way to protect your brand.
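The "oh snap, someone said X" check can be sketched in a few lines. This is only an illustration under my own assumptions: the `FLAGGED_WORDS` list is a harmless placeholder, the function names are hypothetical, and the "alert" is just a dictionary you would hand to whatever email or messaging tool you actually use.

```python
import re

# Placeholder list; swap in the words that matter for your brand.
FLAGGED_WORDS = {"darn", "heck"}


def find_flagged_words(text, flagged=FLAGGED_WORDS):
    """Return the flagged words that appear in the text, sorted alphabetically."""
    words = re.findall(r"[a-z']+", text.lower())
    return sorted({w for w in words if w in flagged})


def build_alert(source, text):
    """Build an alert payload if any flagged word shows up, else return None."""
    hits = find_flagged_words(text)
    if not hits:
        return None
    return {
        "subject": "Oh snap, someone said: " + ", ".join(hits),
        "source": source,
        "text": text,
    }
```

Notice there is no scoring here at all. It is a plain lookup, which is exactly why it is a good first thing to automate.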
Here’s what to listen for when learning word scoring, sentiment analysis, and sentiment scoring.
- “word scoring”: Not a lot of people say word scoring, but more technical people tend to say it, and I’m starting to hear it more often on phone calls.
- “sentiment analysis”: People say sentiment analysis when they want to analyze the sentiment scores of a given set of words.
- “sentiment scoring”: Sentiment scoring can be explained differently based on how you end up scoring the data. Maybe it’s very basic Red, Green, Yellow flags… and that sends emails to people who deal with nasty comments…
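The Red, Green, Yellow idea from the list above could look something like this. The cutoffs (`-0.1` and `0.1`) are arbitrary numbers I picked for the sketch, and the function names are hypothetical; tune them to your own scores.

```python
def flag_for_score(score, red_below=-0.1, green_above=0.1):
    """Turn a numeric sentiment score into a basic traffic-light flag."""
    if score < red_below:
        return "red"
    if score > green_above:
        return "green"
    return "yellow"


def comments_needing_attention(scored_comments):
    """Keep only the comments whose score lands in the red zone."""
    return [comment for comment, score in scored_comments
            if flag_for_score(score) == "red"]
```

The red list is what you would email to the people who deal with nasty comments.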
A quick sentiment example…
We could automatically look at Medium and find sentiment per article, per paragraph, per hashtag, per topic, keyword, category, or conditional groups of articles based on the amount of words being typed vs. images used, etc…
Imagine getting live feedback like the dashboard below, or knowing what words would give the Medium blog better sentiment. Words we haven’t used before. That would be insightful and unique.
The Tableau viz above is a solution I bootstrapped together with free public data sources that offer end users a lot of different ways to score data.
Because … Why not!?
A bunch of smart people made them, so why not use all of these data sources? Harvard, some professors, PhD students, and even handwritten scores.
Why not use everything when mining for sentiment related insights?
Maybe combine the scores later and make a SUPER API score? Awesome, right? Now plug in an automated scraping robot in Python, and now you are hearing what I’m working on next…
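Here is one way the combined "super score" could work, as a rough sketch. The three sources, their score ranges, and every word score below are invented for the example. The only real idea is this: put each source on the same -1 to 1 scale, then average whatever sources know the word.

```python
# Hypothetical scoring sources, each with its own scale.
SOURCES = {
    "binary": {"range": (-1, 1), "scores": {"good": 1, "bad": -1}},
    "one_to_five": {"range": (1, 5), "scores": {"good": 4, "bad": 1, "okay": 3}},
    "intensity": {"range": (-5, 5), "scores": {"good": 3, "awful": -5}},
}


def normalize(value, low, high):
    """Rescale a value from [low, high] onto [-1, 1]."""
    return 2 * (value - low) / (high - low) - 1


def combined_score(word):
    """Average the normalized scores from every source that contains the word."""
    normalized = []
    for source in SOURCES.values():
        if word in source["scores"]:
            low, high = source["range"]
            normalized.append(normalize(source["scores"][word], low, high))
    if not normalized:
        return None  # no source knows this word
    return sum(normalized) / len(normalized)
```

A plain average treats every source as equally trustworthy. If one source fits your industry better, a weighted average is the obvious next tweak.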
Check out the above Tableau viz. Average intensity could mean one thing to one industry and nothing to another. That’s why I believe sentiment analysis needs to be customizable at the data source level and editable by business users, not just a developer team. But hey, I don’t want to make this about my solutions or strategy for adoption, so let’s continue!
Data is never the same.
Sentiment scoring solutions and data sources are never the same and hardly fit together. Above in the data, “During, post, pre” are 3 different sources of unstructured data, coming from Google Sheets.
If data is never the same, make it the same.
I built the solution above to offer a mixture of lots of different sentiment analysis options, because no project is the same.
Not every blackbox sentiment scoring solution will work.
Blackbox sentiment scoring solutions are helpful for quick insights… but they will not offer much value unless you can look at the data, interpret it, and make decisions to optimize your solution. If you can’t edit your solution, you’re sitting with a blackbox.
Sentiment analysis is conducted, usually, by a massive team of experts.
Or one smart mofo. Like you or me.
Note: sentiment analysis is only as good as your ability to update the values associated with the scores.
Most, if not all, sentiment scoring tools or data sources are natively a blackbox solution. Once you change what’s inside the box, it’s no longer a blackbox, and a custom sentiment scoring tool is best for anyone eager to mine opinions out of their data.
Because you need to discover how your sentiment stands up against your use case. Which is why it’s important to start easy!
Eventually, you should consider sentiment analysis for your own personal content.
Grab your magnifying glass.
Finding odd outliers in your text is a step in the right direction.
Again, no blackbox solution will work for every sentiment scenario. You need a lot of options, a lot!
Blackbox sentiment analysis solutions are not complete solutions on their own, because no single set of scores fits every use case. You can, however, use a blackbox sentiment analysis solution to understand massive quantities of data without breaking a sweat or needing to get another cup of coffee.
Imagine I wanted to understand every country’s usage of my musicblip website, to see how customers feel about it.
My free loop download biz doesn’t have sentiment analytics, but what my users say about the website does!
Analyzing what they say about my business would be helpful for future growth but scoring comments or customer surveys manually is complex and a huge waste of money & time.
Sentiment analysis needs more than a blackbox solution.
I do believe explanations of sentiment analysis are written by people who are unfamiliar with the practical application of sentiment scoring at basic or enterprise levels. That’s just the Internet right now… it’s close, but worth explaining with more clarity.
Sometimes people just need to know if someone’s dropping the ‘f*** bomb’ on their kid-friendly YouTube channel, and to dispose of that with automation before they lose fans or get trolled.
Sentiment automation would happen after the journey of testing your sentiment data model and data science model (if that’s the route you decide to go). Automatically finding outlier sentiment scores by scraping HTML, connecting to APIs, and kicking off automated work based on these findings would be considered automating sentiment analysis.
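To show what "finding outlier sentiment scores and kicking off work" might look like, here is a small sketch. It assumes you already have a score per page (or email, or comment). It uses a simple standard-deviation rule with a threshold of `1.5`, which is a number I picked for the example, and the "work" is just a list of messages standing in for tickets or emails. All names are hypothetical.

```python
from statistics import mean, pstdev


def find_outliers(scored_items, threshold=1.5):
    """Return (name, score) pairs sitting more than `threshold` std devs from the mean."""
    scores = [score for _, score in scored_items]
    if len(scores) < 2:
        return []
    average = mean(scores)
    spread = pstdev(scores)
    if spread == 0:
        return []  # every score is identical, so nothing stands out
    return [(name, score) for name, score in scored_items
            if abs(score - average) / spread > threshold]


def kick_off_work(outliers):
    """Stand-in for the automated step: build one message per outlier."""
    return [f"Review needed: {name} scored {score:+.2f}" for name, score in outliers]
```

With only a handful of items, a standard-deviation rule is crude, so treat the threshold as something to test against your own data rather than a fixed rule.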
The type of computational solution to get to your desired sentiment outcome may differ between every use case.
Your sentiment analysis depends on your use case and the value it delivers after development.
Here we go, haha.
About my sentiment analysis experience
I architected a call center voice-to-text sentiment analysis solution, with automated escalation management based on each support representative’s average sentiment score: “hey, maybe Bob is having a bad day.” Well, we can automatically see this happening across thousands of callers. It’s an infinitely scalable solution and requires a highly capable IoT database. (Or a half decent SQL guru, like myself, lol)
So, developing the data consumption and business operation automation… I feel obliged to explain sentiment analysis? Is it safe to say I have some practical experience?
I also built government agency word outlier dashboards using chi-square and ratio, which enabled a snapshot into words being used online… and analyzing that sentiment vs. usage was key to understanding trends… boom. I know.
So after all that work, and building an automated sentiment scoring tool with 5+ word scoring sources, I feel obliged to “write” some wrongs by blogging this and sharing it with others… sentiment scoring or sentiment analysis needs to be easier, so more people can take advantage of the output, which is simply…
Quick as f*** insights into your unstructured data.
Other things I’ve built:
- Automated comment sentiment analysis
- Automated email sentiment analysis
- Automated “help and insights” email responses based on sentiment score and word usage.
- SEO optimization advice based on competitor sentiment analysis, automated across any link(s)
- Sentiment analysis over words used in articles online, from robot scraping and automated sentiment scoring.
And finding anything contradictory to what sentiment analysis is online… Is suddenly frustrating.
Maybe because I think sentiment analysis and word scoring are even easier with Python, so let’s keep it simple. Hey, I’m Tyler Garrett. Welcome. Let’s talk about sentiment scoring in a basic breakdown, and hopefully it helps you dive into your needs.
Below, I am going to quote a website speaking intelligently about sentiment analysis, poorly.
Sentiment analysis uses computational tools to determine the emotional tone behind words. This approach can be important because it allows you to gain an understanding of the attitudes, opinions, and emotions of the people in your data. Source.
IMO, sentiment scoring uses what you need it to use.
It can be super complicated, but the end goal is to represent a value per word, set of words, phrase, paragraph, essay, etc…
Second, it doesn’t need to be some sort of complex thing, unlike what everyone blogs about.
Sentiment analysis is very simple and can be utilized by business analysts without any technology expertise.
You can use whatever you want to analyze sentiment. Making sense of multiple data sources is often challenging and requires endless Excel massaging. I come from having automated thousands of hours of work per week by isolating data problems at big companies and small companies, such as scoring sentiment across text, and building support-free data solutions that scale forever. I’m not a wizard, I just try hard to make sense of data.
Make sense of your data. Sentiment Analytics is a start.
Every data source and sentiment scoring device is unique. Sentiment analytics is a great start to understanding unstructured data. You find a wizard like me to structure your unstructured messy data, and then we can give you insights into word usage.
Keep making sense of your sentiment solution. Every industry has different triggers for every word, and sometimes a positive word in an email is a negative word to say out loud. Simply put, words are too complicated for a black box sentiment solution, especially without being able to filter your scoring per use case.
When I first read this, I thought, “why are you making this sound complex?” But the more I blog about it, the more I realize it sounds more complex than it is, and thank god for libraries that let you automatically access sentiment scoring capabilities.
Pretend… you have 100 emails sent from 100 different sales reps. That’s 100 sets of unstructured data. Maybe 300 words per email? That’s a lot of data to handle, sitting in an email system that most businesses consider a sunk cost and that holds little to no value from a data analysis perspective.
Unstructured data is like this sentence right here.
Structured data looks like this, in Google Sheets below.
If we wanted to analyze the sentiment of the entire sentence, we would need to structure the data. Or find someone who has automated this process. I recently built this solution in an application, but now I’m building it with a free programming language because, why reinvent the wheel?
Should have probably swapped it to “structured” in cell A2 😹…
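Structuring a sentence the way the spreadsheet does (one word per row) is a tiny job in Python. This is a minimal sketch; the `position` and `word` column names are my own choice, not something taken from the screenshot.

```python
import re


def structure_text(text):
    """Split a sentence into one row per word, like one spreadsheet row per word."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    return [{"position": i, "word": w.lower()} for i, w in enumerate(words, start=1)]


rows = structure_text("Unstructured data is like this sentence right here.")
for row in rows:
    print(row["position"], row["word"])
```

Once every word has its own row, matching it against a word scoring data source is just a lookup or a join.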
Sentiment analysis is possible in this fashion (word scoring, averaging scores, etc…) if you’re able to transform the data and have a word scoring datasource available too. And there’s way more than positive or negative: there are scores that go into granular 1 through 5 value breakdowns, and handwritten logic by large classrooms of PhD students, all sitting nested in an unusable state.
Below we offer a visual “score” of positive or negative. Hopefully you understand there are highly complicated “single word” scoring datasources, but also plenty of beautiful algorithms that can handle phrases, which really blow away my simple negative/positive score below…
A score of 0.125 says it’s “mostly positive,” which is logic enough to call this sentence positive. If we could see 30 sentences side by side… Boom.
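Here is the word scoring and averaging logic as a small sketch. I’m assuming the example sentence from earlier and a made-up lexicon where only “like” scores +1. The spreadsheet’s real per-word scores aren’t shown in this post, so treat the lexicon as a guess that happens to average out to the same 1 in 8 words, or 0.125.

```python
import re

# Hypothetical lexicon: the real spreadsheet scores are not shown in the post.
WORD_SCORES = {"like": 1}


def score_sentence(sentence, word_scores=WORD_SCORES):
    """Score each word (unknown words get 0) and average across all words."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    rows = [(word, word_scores.get(word, 0)) for word in words]
    average = sum(score for _, score in rows) / len(rows) if rows else 0.0
    return rows, average


def label(average):
    """Very basic labeling: above zero is positive, below zero is negative."""
    if average > 0:
        return "positive"
    if average < 0:
        return "negative"
    return "neutral"


rows, average = score_sentence("Unstructured data is like this sentence right here.")
print(average, label(average))
```

Averaging over every word, including the ones scored 0, is what keeps the number small. Averaging over only the matched words is another valid choice and will give much bigger scores.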
Basic Automated Sentiment Findings…
Sometimes people just need to automate over text and find words that simply suck. Well, there’s a data source for that too: negative phrases, plus fuzzy matching against those for the rare 1337 speech, or hacker speech as some call it, where they replace letters with numbers. Like my tech consultancy Dev3lop.
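A rough sketch of that idea: swap the numbers back to letters, then fuzzy match against a negative word list using `difflib` from the standard library. The letter map and the `NEGATIVE_WORDS` list are small placeholders I made up, and the `0.8` cutoff is just a starting point.

```python
import difflib

# Common number-for-letter swaps; extend as needed.
LEET_MAP = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t"})

# Placeholder list; a real negative word source would be much longer.
NEGATIVE_WORDS = ["terrible", "awful", "useless", "garbage"]


def normalize_leet(word):
    """Lowercase the word and turn number substitutions back into letters."""
    return word.lower().translate(LEET_MAP)


def match_negative(word, cutoff=0.8):
    """Return the closest negative word, or None if nothing is close enough."""
    candidate = normalize_leet(word)
    matches = difflib.get_close_matches(candidate, NEGATIVE_WORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this map, `normalize_leet("Dev3lop")` comes back as `develop`, and a misspelling like `terible` still lands on `terrible` thanks to the fuzzy match.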
Sentiment analysis can be simple. Without complex explanations of simple processes, we enable a new world or even class of thinkers… escape with me, learn sentiment scoring at a basic level, and then let’s do Python sentiment scoring below.
Why would you need a word scoring data source in sentiment analysis?
So, you’d want to test it against other structured data sets or use complex algorithms generated by wizards. You need a word scoring datasource to match or join these similar words to each other, and understand the overall score or positive/negative sentiment scores of a given set of words, sentences, web pages, voice conversations, emails, comments, etc.
The end goal would be to analyze all 100 emails from all 100 sales reps at one time, and then use comparative analysis to see which were more positive or negative, for example.
Breaking the data apart into a structured format gives a computer an opportunity to consume unstructured data. What people such as myself do is automate this restructuring of unstructured datasources, to offer measurable values around content that would not otherwise be considered measurable or even valuable.
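The sales rep comparison could be sketched like this: score every email against a word scoring data source, then rank the reps. The lexicon, rep names, and email text are all invented for the example.

```python
import re

# Hypothetical word scoring data source.
WORD_SCORES = {"thanks": 1, "great": 1, "happy": 1, "sorry": -1, "problem": -1, "late": -1}


def score_email(body):
    """Average word score for one email (unknown words count as 0)."""
    words = re.findall(r"[a-z']+", body.lower())
    if not words:
        return 0.0
    return sum(WORD_SCORES.get(w, 0) for w in words) / len(words)


def rank_emails(emails):
    """Score every email and sort from most positive to most negative."""
    scored = [(rep, score_email(body)) for rep, body in emails.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


emails = {
    "rep_01": "Thanks, great call today",
    "rep_02": "Sorry, the shipment is late",
    "rep_03": "See you on Monday",
}
for rep, score in rank_emails(emails):
    print(rep, round(score, 2))
```

Three emails or three thousand, the code is the same, which is the whole point of structuring the data first.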
The raw power of sentiment analysis and opinion mining…
Before we start…
Imagine what it used to be like!
Sentiment scoring has a lot of power because analyzing 100 emails at once would take a very long time, especially if we did it like I did in the spreadsheet screenshot above. Mining the opinions would be manual, painful, and slow.
Imagine now manually finding words in a word scoring datasource and manually matching them up against these words… this 100 email task just turned into a long month, and now we are starting to see the sheer power of being able to quickly score sentiment across unstructured datasources. This is written by a data architect, me, who has helped companies implement million dollar+ sentiment scoring capabilities and assisted them in visualizing the data and optimizing the data solution.
Long story short, sentiment scoring helps.
Here’s a paste from a place online.
In their words.
At a higher level, sentiment analysis involves natural language processing and artificial intelligence by taking the text element, transforming it into a format that a machine can read, and using statistics to determine the actual sentiment. Source.
In my words…
Sentiment analysis involves data transformation and relationships that offer measurable values. In a sea of unstructured content, we need sentiment scoring analysis that is not only comprehensible but easy to explain.
Don’t make mining opinions and sentiment analysis difficult, bro.
We should avoid talking about natural language processing and artificial intelligence when discussing a machine reading unstructured data and giving it a type of measurable value, statistics are one thing, scoring words and averaging a paragraph is another…
The quote above, like most things online, should be taken with a grain of salt.
Hopefully you’ve learned enough about mining opinions, sentiment analysis, and a quick brush over automating your future sentiment solution!
Typos by Tyler Garrett
Founder of www.dev3lop.com — consumed in 140 countries.
Founder of www.musicblip.com — in 80 countries.
Austin Photography by Tyler Garrett
(Python sentiment section coming soon!)
(Apologies for mistakes, I wrote this while chasing my son around the mall and hearing him constantly say “ballon, ballon, ballon, ballon.”)