The Unreasonable Effectiveness of Meaning

John Ball
Pat Inc
Aug 6, 2019 · 15 min read


Meaning hits the target missed by data: generalization and understanding

A decade ago, Google scientists published “The Unreasonable Effectiveness of Data[i].” The paper focused on “natural-language-related machine learning” successes, such as statistical speech recognition and statistical machine translation, that rely on large amounts of data. It preceded the biggest successes of deep learning, which arrived a few years later, around 2012, and improved on those statistical systems.

Today, I will look at a likely consequence of that paper’s recommendations: the shift away from brain and language theory toward data science, which has produced today’s gap between user expectations and system performance in conversational AI. I’ll trace the application of data to natural language, from its best results down to its known problems, and conclude that data has had its day for natural languages. The alternative, meaning, is the key to natural language processing (NLP), not data. Natural Language Understanding (NLU) is about meaning, not keywords or intents.

The researchers’ call to action was ominous:

“For natural language applications, trust that human language has already evolved words for important concepts. See how far you can go by tying together the words that are already there, rather than by inventing new concepts with clusters of words. Now go out and gather some data, and see what it can do.”
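
To make the contrast between “tying together the words that are already there” and actual meaning concrete, here is a minimal, hypothetical sketch (my own illustration, not Pat’s system or the paper’s method): a crude keyword-overlap score cannot tell apart two sentences that share every word but reverse who did what to whom.

```python
# Hypothetical illustration: keyword overlap (bag of words) scores these two
# sentences as identical, even though their meanings are opposite.
def keyword_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word tokens, a crude 'keyword' match."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

s1 = "the dog bit the man"
s2 = "the man bit the dog"

print(keyword_overlap(s1, s2))  # prints 1.0: identical keywords, opposite meaning
```

Tying words together is not enough here; understanding requires knowing who bit whom, which is exactly the kind of role information a meaning-based approach targets.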

I'm a cognitive scientist working on NLU (Natural Language Understanding) systems based on RRG (Role and Reference Grammar). A mouthful, I know!