Don’t throw good money after bad sentiment

Earlybirdy public beta is now open. Sign up or request a demo.

Today’s “Social Media Analytics” tools have more in common than their vendors are likely to admit. The real difference between the big, 10,000-feature guys is user interface and experience, and once you’ve taken the time to learn how to do four or five things, you’ll probably be happiest if you never have to learn a new interface. One of the capabilities they all tout is “Sentiment Analysis.” In this context, sentiment is defined as Positive/Negative/Neutral. Super helpful, right? If you’re forking out serious cash for one of the giant enterprise options, they might also add Sad/Happy/Angry/Joyful. Good thing human emotion is as simple as a few categories and that no one is ever sarcastic! (Excluding me, right now.)

I’ve been in this space for a while now and I have yet to find anyone who, in a candid conversation, can tell me how this data is helpful or ties to any kind of ROI. The concept of “Sentiment Analysis” is certainly sexy and one would reasonably assume that tracking this over time would help leaders make better decisions. It’s intuitive, but it’s also probably not that accurate. Let’s explore what I mean by that:

Nobody’s Business

Sentiment Analysis is based on an academic field called Natural Language Processing (NLP). It’s a relatively new field of study, recognized as originating in the early 1950s. For the first 30 or so years of its existence, NLP systems were complex, hand-written rules that scored text based on its content. What’s now considered ‘Modern NLP’ began in the 1980s and is (according to Wikipedia):

“based on machine learning, especially statistical machine learning. The paradigm of machine learning is different from that of most prior attempts at language processing. Prior implementations of language-processing tasks typically involved the direct hand coding of large sets of rules. The machine learning paradigm calls instead for using general learning algorithms — often, although not always, grounded in statistical inference — to automatically learn such rules through the analysis of large corpora of typical real-world examples.”

The problem is that these modern models don’t actually work well and, even more importantly, they don’t have a business goal. The ambition of NLP is to make computers understand content the way a human would — to power Strong AI. Estimates are that NLP is likely another 25+ years away from hitting human parity. Some very bright people argue that 25+ years is more likely 100+ years. At any rate, we’re not close. So why are businesses pouring money into an incomplete and tangential technology? Good question! We’re convinced there is a better way. That’s why Earlybirdy has taken a modern meets back-to-basics approach to content understanding and has created a tool designed to meet business goals.

It takes two to make a thing go right

If you’ve read Peter Thiel’s Zero to One, this might strike a chord. Chapter 12 is titled “Man and Machine,” and in it Thiel discusses a critical insight he learned at PayPal that fueled the launch and wild success of Palantir. The insight? “…men and machines are good at fundamentally different things.” Thiel argues that humans and machines are poor substitutes but wonderful complements, and that embracing the complementary nature of computers and humans is “the path to building a great business.” This is central to what makes Earlybirdy different from and, we think, superior to other tools. Our technology, which we call ‘The Bird Brain,’ is curated and managed exclusively by humans. These humans, however, are supported with machine-crunched data analysis that empowers them to see, process and intuit what would otherwise be impossible.

Google tells me there are 1,025,109.8 words in the English language. The Bird Brain cares about fewer than 0.10% of them, and usually only when they’re used in specific combinations. Why so carefree? It’s not because the other 99.9% haven’t been evaluated, but because they’re not consistently meaningful. Every day, Earlybirdy indexes the content of 1 million random social posts. If each post is about 20 words, that’s 20 million individual words a day, and a make-your-brain-hurt maneuver if you try to figure out the number of phrases. The computers do this heavy lifting, providing the humans with a daily report on what’s identified as potentially interesting. That’s when a human takes over, evaluating the words or phrases that they believe are likely to possess what we refer to as ‘Valence Certainty.’
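To put a rough number on that brain-hurting phrase count, here is a minimal sketch. It assumes a "phrase" means any contiguous run of words in a post, which is our simplification for illustration; the `contiguous_phrases` helper is hypothetical and not part of The Bird Brain.

```python
def contiguous_phrases(words, max_len=None):
    """Yield every contiguous run of words, up to max_len words long."""
    n = len(words)
    max_len = n if max_len is None else max_len
    for size in range(1, max_len + 1):
        for start in range(n - size + 1):
            yield " ".join(words[start:start + size])


# Stand-in for a 20-word post; the actual words do not matter for counting.
post = ["w%d" % i for i in range(20)]

per_post = sum(1 for _ in contiguous_phrases(post))
per_day = per_post * 1_000_000  # 1 million posts a day, as in the text

print(per_post)  # 20 + 19 + ... + 1 = 210
print(per_day)   # 210000000
```

Under that assumption a single 20-word post already contains 210 candidate phrases, so a day of posts yields on the order of 210 million phrase occurrences to sift through.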

Valence Certainty is the nomenclature we use to express that a word or phrase is used with singular meaning and strength of meaning. A word like “love” has very low valence certainty. It is used in many ways: to communicate deep affection, to express fondness, and in sarcastic jest. The phrase “completely in love” has much greater valence certainty. How do we know? No special formula: just some digging around and then making a judgment call. What if we’re wrong? Again, the marriage of computers and humans provides the answer. Every time a human acts on a post, whether responding or marking it irrelevant, the computer takes note and each day generates a report detailing how every item in The Bird Brain performs. Finding and maintaining the best corpus for The Bird Brain is only half the battle; we also need to find the proper weight for each term. Before explaining how the human-machine hybrid applies to that as well, let’s look at how weights are used to score a particular tweet.
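As a rough illustration of the daily performance report mentioned above, the sketch below tallies how often posts matching each phrase were responded to rather than marked irrelevant. The record format, the action labels, the `daily_report` function and the sample data are all invented for this example.

```python
from collections import defaultdict


def daily_report(actions):
    """Return {phrase: (posts_seen, share_responded)} from (phrases, action) records."""
    seen = defaultdict(int)
    responded = defaultdict(int)
    for phrases, action in actions:
        for phrase in phrases:
            seen[phrase] += 1
            if action == "responded":
                responded[phrase] += 1
    return {p: (seen[p], responded[p] / seen[p]) for p in seen}


# Invented sample of one day's human actions.
actions = [
    (["completely in love"], "responded"),
    (["completely in love", "love"], "responded"),
    (["love"], "irrelevant"),
    (["love"], "irrelevant"),
    (["love"], "responded"),
]

report = daily_report(actions)
for phrase, (n, share) in sorted(report.items()):
    print(f"{phrase}: seen {n}, responded {share:.0%}")
```

In this toy data, “love” is acted on only half the time while “completely in love” is acted on every time, which is the kind of signal a human curator could use when deciding what stays in the corpus.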

Here’s an example being scored against our “Issues” query which identifies people dealing with a problem:

“@comcast your customer service is amazing …ly bad. an hour of my time and nothing is fixed. just love you guys! :-/”

In this example* we find 4 phrases which The Bird Brain considers when calculating a score:

  • [customer service] matches the attribute “Customer Facing Services” which has a weight of +5
  • [amazing] matches the attribute “Extremely Good” which has a weight of -1
  • [fixed] matches the attribute “Problem” which has a weight of +5
  • [:-/] matches the attribute “Emoji: Unhappy” which has a weight of +0.5
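The arithmetic behind this list can be sketched in a few lines of Python. The phrases, attribute names, weights and the Required Score of 5.0 come from the example; the `ATTRIBUTES` table, the `score_post` function and the naive substring matching are assumptions made for illustration, not a description of how The Bird Brain is actually built.

```python
# Hypothetical attribute table: phrase -> (attribute name, weight).
# Values are copied from the example; the structure is an assumption.
ATTRIBUTES = {
    "customer service": ("Customer Facing Services", 5.0),
    "amazing": ("Extremely Good", -1.0),
    "fixed": ("Problem", 5.0),
    ":-/": ("Emoji: Unhappy", 0.5),
}

REQUIRED_SCORE = 5.0  # threshold for the "Issues" query in the example


def score_post(text, attributes=ATTRIBUTES):
    """Return (total score, matched phrases) using naive substring matching."""
    lowered = text.lower()
    matched = [phrase for phrase in attributes if phrase in lowered]
    total = sum(attributes[phrase][1] for phrase in matched)
    return total, matched


tweet = ("@comcast your customer service is amazing …ly bad. an hour of my time "
         "and nothing is fixed. just love you guys! :-/")

total, matched = score_post(tweet)
print(total, total >= REQUIRED_SCORE)  # 9.5 True
```

Note that “love” contributes nothing here, since it is not in the table: a word with low valence certainty simply stays out of the score.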

This post has a total score of 9.5 (5 − 1 + 5 + 0.5), which exceeds the Required Score of 5.0, so it will be included in the result set for the Issues query. To understand how we assign weights, we look at another daily report. Each day, we humans categorize X posts and then ask a machine to solve an optimization problem: assign weights to all attributes so that the machine’s scores mirror the human-scored results as closely as possible. We then review how the machine would weight the attributes to match our human scoring, and we learn and adjust.
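One way to picture that optimization step is a simple perceptron-style fit: start every weight at zero and nudge the weights of a post’s attributes whenever the machine’s include/exclude decision disagrees with the human’s. This is a stand-in technique chosen for brevity, not necessarily the optimizer Earlybirdy uses, and the labeled posts, step size and function names below are invented for the sketch.

```python
REQUIRED_SCORE = 5.0  # threshold from the "Issues" example

# Invented training data: (attributes matched in a post, did a human call it an issue?)
LABELED_POSTS = [
    (("Customer Facing Services", "Problem"), True),
    (("Customer Facing Services", "Extremely Good", "Problem"), False),
    (("Problem", "Emoji: Unhappy"), True),
    (("Extremely Good",), False),
    (("Customer Facing Services", "Extremely Good", "Problem", "Emoji: Unhappy"), True),
]


def predict(weights, matched, required_score=REQUIRED_SCORE):
    """Machine decision: include the post if its summed weights reach the threshold."""
    return sum(weights[a] for a in matched) >= required_score


def fit_weights(labeled_posts, step=0.5, max_epochs=100):
    """Nudge attribute weights until machine decisions match the human labels."""
    weights = {a: 0.0 for matched, _ in labeled_posts for a in matched}
    for _ in range(max_epochs):
        mistakes = 0
        for matched, is_issue in labeled_posts:
            if predict(weights, matched) != is_issue:
                mistakes += 1
                nudge = step if is_issue else -step
                for a in matched:
                    weights[a] += nudge
        if mistakes == 0:  # every human label is reproduced
            break
    return weights


weights = fit_weights(LABELED_POSTS)
agreement = sum(predict(weights, m) == y for m, y in LABELED_POSTS) / len(LABELED_POSTS)
print(weights)
print(agreement)
```

On this toy data the fit ends up giving “Extremely Good” a negative weight, the same direction as the hand-assigned -1 in the example. The human review step still matters: a fit like this only mirrors the labels it was shown, and it is not guaranteed to settle when the human labels contradict each other.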

In all phases of our data model, The Bird Brain, it is ultimately a human who makes a judgment call about what should be included and how things should be weighted. The human judgment, however, is aided in every phase by a computer that excels at all the things that are most challenging to a person. The net result is a superior solution and continuous improvement that’s never been so simple or effective.

The early bird gets the word

Everyone knows it’s good to save the best for last. Here goes:
Context is everything in content, and every industry has specific phrases that are meaningful in that context. “Ate my card” is a phrase which is mostly meaningless, unless you’re a bank. “Signal nowhere” is important for telecom, but probably not for others. The Bird Brain starts smart, but gets smarter as users add these phrases to their specific use case.

This is Earlybirdy. We’re excited for you to get to know her.

*As an aside, submit the Comcast tweet to the Stanford open-source NLP stack and you’ll be told that this is “positive.” Not only is that wrong, but what is a business supposed to do with that?

Read more by me at blog.earlybirdy.com

Like what you read? Give Brent Eyler a round of applause.
