Introduction to Natural Language Understanding

How we imagine talking to computers (Luke Skywalker and C-3PO) versus how we actually talk to computers (a man speaking to a smartphone: Siri, Google Voice, Alexa and Cortana)

There is a big gap between our expectations of Artificial Intelligence and the reality. We can put the blame on natural language, which is more difficult to master than we tend to believe.

However, the computer remains a docile and powerful assistant, and we want it to be smarter. Many people are pushing the research forward through worldwide challenges and competitions, for example the Allen AI Science Challenge (which aims to show that an AI can be smarter than an 8th grader).

One of the most famous examples of a realistic intelligence came from the movie director Stanley Kubrick in his 1968 movie 2001: A Space Odyssey.

How far are we from a fictional assistant like HAL-9000, capable of helping David Bowman in his mission?

Let’s compare HAL-9000 with Siri (from its dialogue with Stephen Colbert in 2011 on The Colbert Report, an American TV show).

HAL-9000 versus Siri (The Colbert Report)

HAL-9000 gives the feeling that it fully grasps the situation in its complexity, whereas Siri doesn’t even understand a simple request… :-)

In Computer Science, the process of “understanding” the meaning behind a sentence is called Natural Language Understanding (NLU).

Ex: Should I take an umbrella today? => Will it rain today?

It is different from Natural Language Processing (NLP), which here means the process of determining the grammatical role of every word in a sentence and the relations between them.

Ex: The/DT man/NN who/WP gave/VBD Bill/NNP the/DT money/NN
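To make the notation above concrete, here is a minimal sketch that reads the “word/TAG” format and spells out each tag. The tags are standard Penn Treebank labels; the function name parse_tagged and the TAG_NAMES table are our own illustrative choices, not part of any library.

```python
# Penn Treebank tags that appear in the example sentence.
TAG_NAMES = {
    "DT": "determiner",
    "NN": "noun, singular",
    "WP": "wh-pronoun",
    "VBD": "verb, past tense",
    "NNP": "proper noun, singular",
}


def parse_tagged(sentence):
    """Split whitespace-separated 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in sentence.split():
        # rpartition splits on the LAST slash, so a word containing '/' survives.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


tagged = parse_tagged("The/DT man/NN who/WP gave/VBD Bill/NNP the/DT money/NN")
for word, tag in tagged:
    print(f"{word:6} {tag:4} {TAG_NAMES.get(tag, 'unknown')}")
```

Note that this only reads tags that were already assigned: deciding which tag each word deserves is the actual NLP problem.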

Let’s start with a brief history of NLU, then look at the main problems in this field of Artificial Intelligence.

A brief history of NLU

1950s: Beginning of NLU.

Turing addressed the problem of artificial intelligence, and proposed an experiment which became known as the Turing test, an attempt to define a standard for a machine to be called “intelligent”.

At the beginning, developers evaluated a user’s input with a few pattern-matching rules.

Example: if “Hello <-VARIABLE->” then greetings.
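A rule like the one above can be sketched with a regular expression. This is only an illustration of the idea, not a reconstruction of any historical system: the RULES table, the match_rule function and the “greetings” label are hypothetical names.

```python
import re

# Hypothetical rule table: each compiled pattern maps to a label.
# (?P<name>...) plays the role of <-VARIABLE-> in the rule above.
RULES = [
    (re.compile(r"^hello\s+(?P<name>\w+)", re.IGNORECASE), "greetings"),
]


def match_rule(text):
    """Return (label, captured variables) for the first matching rule, else None."""
    for pattern, label in RULES:
        match = pattern.search(text.strip())
        if match:
            return label, match.groupdict()
    return None


print(match_rule("Hello HAL"))
print(match_rule("Hi HAL"))  # no rule covers "Hi", so nothing matches
```

The second call shows the weakness of this approach: every new way of saying hello needs a new rule.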

1970–80s: Linguists started to “code”.

Linguistics experts started to contribute to NLU by “coding” grammar and semantic rules by hand. That produced realistic software such as:

SHRDLU (Winograd 1972)

Chat-80 (Pereira 1980)

Both were linguistically rich and logic-driven.

We can be more critical and say that these systems only handled a sandbox of easy questions, but that was 35 years ago.

One of the biggest problems at the time was the grammatical interpretation of a sentence (NLP). The error rate was high.

1990–2015: Statistical revolution in Natural Language Processing.

The statistical revolution in Natural Language Processing led to a decrease in NLU research.

Most NLP models now rely on what is called today “Machine Learning”: probabilistic models learned from data. The more data you give the model, the better it performs. Today, results are pretty amazing: we can process a sentence with more than 98% accuracy.
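To give a feel for what “learning from data” means, here is a toy sketch of a statistical tagger: it counts which tag each word received in a hand-labelled corpus and picks the most frequent one. The three-sentence corpus, the function names and the “NN” fallback are assumptions made for the illustration; real systems use far larger corpora and richer models.

```python
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Count how often each word was seen with each tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return counts


def tag(words, counts, default="NN"):
    """Give each word its most frequent tag, or `default` if it was never seen."""
    result = []
    for word in words:
        seen = counts.get(word)
        best = seen.most_common(1)[0][0] if seen else default
        result.append((word, best))
    return result


# Tiny hand-labelled corpus (made up for this example).
corpus = [
    [("the", "DT"), ("man", "NN"), ("gave", "VBD"), ("Bill", "NNP"),
     ("the", "DT"), ("money", "NN")],
    [("the", "DT"), ("bill", "NN"), ("arrived", "VBD")],
    [("Bill", "NNP"), ("paid", "VBD"), ("the", "DT"), ("bill", "NN")],
]

small_model = train(corpus[:1])  # trained on one sentence
full_model = train(corpus)       # trained on all three

print(tag(["the", "bill", "arrived"], small_model))
print(tag(["the", "bill", "arrived"], full_model))
```

With a single training sentence, “arrived” has never been seen and falls back to the default tag; with all three sentences, the model tags it as a verb. That is the “more data, better model” effect in miniature.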

What are the main problems of NLU?

First of all, we have to admit that NLU is a thankless field: we are very demanding when it comes to computers’ understanding and knowledge.

Do we really need a personal robot today that we can have a philosophical discussion with? Or do we just need to automate daily tasks, like creating a shopping list?

Technically, there are two main problems:

We have multiple ways to express the same idea.

Example: When you want to make an appointment with your doctor, you may say:

● I need to make an appointment.

● I need to see the doctor.

● When is the doctor free?

● I need to renew my prescription.

● Do you think the doctor could squeeze me in today?

● I need to make an appointment for my husband.

● My child needs to come in for a check-up.

● The doctor wants to see me again in two weeks’ time.

● …

There are many ways to ask for an appointment.

To understand the whole sentence, we have to link a lot of concepts together by creating associations between words (prescription <=> doctor <=> cold <=> check-up).
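The idea of linking words to a shared concept can be sketched very naively as a keyword table. Everything here is hypothetical (the intent names, the keyword sets, the guess_intent function); it only shows how different sentences can land on the same intent, and it borrows the umbrella example from earlier for a second intent.

```python
import re

# Hypothetical association table: words that point to the same concept.
INTENT_KEYWORDS = {
    "book_appointment": {"appointment", "doctor", "prescription", "check-up"},
    "weather": {"rain", "umbrella", "weather"},
}


def tokenize(text):
    """Lowercase the text and keep words, including hyphenated ones like 'check-up'."""
    return set(re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower()))


def guess_intent(text):
    """Return the intent sharing the most keywords with the text, or None."""
    words = tokenize(text)
    scores = {intent: len(words & keywords)
              for intent, keywords in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


for sentence in [
    "I need to make an appointment.",
    "When is the doctor free?",
    "My child needs to come in for a check-up.",
    "Should I take an umbrella today?",
]:
    print(sentence, "=>", guess_intent(sentence))
```

A table like this breaks quickly (“I saw a documentary about a doctor” would also match), which is exactly why the associations have to be richer than a list of keywords.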

All these words lead us to the second main problem.

Words and sentences are context-dependent.

But first of all, we need to define what context is: we can say that it is something that helps us understand something else, be it a text, a joke, an event…

In other words: context is the set of circumstances in which something happens.

It can be a story shared by two people in a group of ten (a private joke), which gives something a specific meaning for both of them, different from the one understood by the rest of the group.

It can also depend on a situation.

Let’s take an example: if you read somewhere “… and bacon”, what is the meaning of these two words?

Let’s begin with the first word: “and” marks the end of a list. The second word, “bacon”, is a meat product.

Does it imply ordering something? Does it imply listing all pork recipes? Does it imply completing a shopping list?

We cannot guess the point of such a sentence without context. Yet this is exactly what we expect a computer to do.
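Here is a small sketch of the “… and bacon” example, assuming the context is handed to the program explicitly. The context names and the interpret function are invented for the illustration; the point is only that the same two words produce a different result for each context, and no usable result without one.

```python
# Hypothetical contexts and the action each one implies for a trailing list item.
ACTIONS_BY_CONTEXT = {
    "restaurant_order": "add '{item}' to the order",
    "recipe_search": "include '{item}' in the recipe search",
    "shopping_list": "append '{item}' to the shopping list",
}


def interpret(fragment, context=None):
    """Interpret a fragment of the form '… and <item>' given an optional context."""
    words = fragment.replace("…", " ").replace("...", " ").split()
    if len(words) < 2 or words[0].lower() != "and":
        raise ValueError("expected a fragment of the form '… and <item>'")
    item = " ".join(words[1:])
    template = ACTIONS_BY_CONTEXT.get(context)
    if template is None:
        # Without a known context we can only say what the words are, not what they are for.
        return f"ambiguous: '{item}' closes a list, but the list's purpose is unknown"
    return template.format(item=item)


print(interpret("… and bacon"))
print(interpret("… and bacon", context="restaurant_order"))
print(interpret("… and bacon", context="shopping_list"))
```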

Actually, I think we are not approaching the problem the right way.

Even a human cannot understand the meaning behind random words without context. The only one who can provide enough data is the user, when talking to the computer: understanding cannot be based only on a “probabilistic model”.

We have to find a way to help developers add more intelligence to their software, and to do so, everybody has to contribute to Artificial Intelligence.

Together, we will crack Natural Language Understanding and build a better Artificial Intelligence!

Gaëtan JUVIN — Recast.AI / @RecastAI

(This post was originally published at
