Is Understanding Language Easy as Apple Pie?

“If you wish to make apple pie from scratch, you must first create the universe.” — Carl Sagan

Thus far, language-based AI systems either attempt to build a sci-fi-like humanness from the ground up, or abandon the understanding of language altogether and fall back on a series of Stone Age grunts. There are three distinct development paths:

1. Traditional Chatbots

Traditional chatbots do simple stuff very well, but there’s no machine learning to speak of. This is AI that is artificial without the intelligence. It can closely approximate a human from the Neolithic era: it responds to your requests in binary form, one grunt for yes, two for no. Or maybe it’s the other way around. This bot can’t remember which grunt is which because it can’t remember anything. It is blessed with a goldfish memory and is blissfully ignorant of all that it doesn’t know (which is a lot).

The biggest issue is that most chatbots require businesses to hard-code knowledge into the software, such as SKUs, store locations, hours, and more. A complex business can have millions of SKUs and data silos across departments, so hard-coding for chatbots is simply untenable and certainly not scalable. On the plus side, it is a market-ready product, cheap to deploy once you’ve integrated all the necessary data, and works well for ordering pizza with emojis.
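To make the hard-coding problem concrete, here is a minimal sketch of a rule-based bot. Everything in it is hypothetical: the store names, the SKUs, the prices, and the reply function are invented for illustration and do not come from any real product. It only shows the general shape of the approach, assuming simple keyword matching.

```python
# Hypothetical hard-coded knowledge: every fact has to be typed in by hand.
STORE_HOURS = {"downtown": "9am-9pm", "airport": "6am-11pm"}
SKU_PRICES = {"PIZZA-MARG-12": 11.50, "PIZZA-PEPP-12": 12.50}


def reply(message: str) -> str:
    """Answer a message using keyword matching only (no learning, no memory)."""
    text = message.lower()
    if "hours" in text:
        for store, hours in STORE_HOURS.items():
            if store in text:
                return f"The {store} store is open {hours}."
        return "Which store do you mean?"
    for sku, price in SKU_PRICES.items():
        if sku.lower() in text:
            return f"{sku} costs ${price:.2f}."
    # Anything outside the hand-written rules falls through to a grunt.
    return "Sorry, I don't understand."


print(reply("What are the hours downtown?"))
print(reply("Do you deliver?"))
```

Every new store, SKU, or question type means another hand-written entry or rule, which is why this approach stops scaling once the catalog and the data silos grow.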

2. Research AI

Adapting Carl Sagan’s famous apple pie quote (above): this is language built from atoms (words, sentence structure), designed to pass the Turing Test. Usually associated with university research, it’s fascinating stuff indeed, but an actual product is still years away. This AI is based on the machine learning technique called Deep Learning. While neural networks generally combine many simple neurons to create complex behavior and memory, Deep Learning stacks many layers of artificial neurons to allow for more complex behaviors. Deep Learning has proven far more successful in image recognition than it has in natural language processing.

So far, Deep Learning has been extremely successful in signal processing, where input signals such as audio and video can be mapped easily to an image. The signal at one point is likely to inform the signal at the next, creating patterns which machines can learn. However, unlike audio and video processing, whose outputs are either frequency and amplitude or RGB values, natural language has a much larger set of possible outputs (all possible words, across all possible meanings). That’s why, without the contextual knowledge necessary to “make sense” of any given sentence, computers struggle to tell apart the roles that “flies” and “like” play in the following two sentences:

Time flies like an arrow. 
Fruit flies like bananas.

When looking at words and sentences, the devil is not in the details, but in the context awareness that usually comes only from experience. Deep Learning approaches, while making great strides, haven’t been able to process all of the nuances of language.
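The ambiguity in the two example sentences can be sketched in a few lines. The snippet below assumes a tiny, hand-made lexicon (hypothetical, and far smaller than any real one) that lists the parts of speech each word can take, then enumerates every tagging the lexicon allows. It is not a parser, just an illustration of how quickly the possibilities multiply when there is no context to rule them out.

```python
from itertools import product

# Hypothetical toy lexicon: each word lists the parts of speech it can take.
LEXICON = {
    "time": ["NOUN", "VERB"],
    "fruit": ["NOUN"],
    "flies": ["NOUN", "VERB"],
    "like": ["VERB", "PREP"],
    "an": ["DET"],
    "arrow": ["NOUN"],
    "bananas": ["NOUN"],
}


def candidate_taggings(sentence):
    """Return every word/tag assignment the lexicon permits for the sentence."""
    words = sentence.lower().rstrip(".").split()
    options = [LEXICON[word] for word in words]
    return [list(zip(words, tags)) for tags in product(*options)]


for sentence in ("Time flies like an arrow.", "Fruit flies like bananas."):
    print(sentence, "has", len(candidate_taggings(sentence)), "candidate taggings")
```

Picking the right tagging ("flies" as a verb in the first sentence, as a noun in the second) is exactly the step that needs context the lexicon alone does not carry.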

3. Practical AI

AI that’s often described as Practical AI has most of the benefits of Deep Learning (sophistication, adaptation, scale) but is also market-ready. By combining supervised learning with clever software design, businesses can train a machine to automate processes in minutes that previously would have taken weeks, months, or even years. Practical AI makes this possible by abstracting away the complex elements of conversational AI, such as directed acyclic graphs, enabling you to write straightforward, step-by-step process guides that look very much like what you’d write to guide a human today, allowing the AI to fill in the gaps. Put simply, Practical AI focuses on how to make a great apple pie from apples (not atoms) and how to serve it to the customer in the most tasty and efficient way. In the end, the customer doesn’t care that the apple was originally stardust from the Big Bang; they care about the experience of eating it.
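For the curious, here is a rough sketch of what "abstracting away the directed acyclic graph" could look like. It is an assumption about the general idea, not a description of any particular product: the guide text and the guide_to_graph helper are hypothetical, and the only real library used is Python's standard graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical process guide, written the way you might brief a new employee.
GUIDE = [
    "Greet the customer",
    "Ask for the order number",
    "Look up the order",
    "Offer a refund or a replacement",
]


def guide_to_graph(steps):
    """Map each step to the set of steps that must be completed before it."""
    graph = {step: set() for step in steps}
    for previous, current in zip(steps, steps[1:]):
        graph[current].add(previous)
    return graph


graph = guide_to_graph(GUIDE)
# The author of the guide never sees the graph; the system derives the order.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

The person writing the guide only ever touches the plain list of steps; the graph structure underneath stays an implementation detail.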