Ambiguous sentences in NLP

Vaibhav Tiwari
Published in Analytics Vidhya
3 min read · Sep 28, 2020

“Language is a city to the building of which every human being brought a stone.” ~Emerson

Image: Anomaly detection (source: Unsplash)

One of the basic reasons behind the substantial progress of mankind is the evolution of language and of our means of communication. We usually take language for granted and overlook the depth of understanding that comes with a good knowledge of it. In some ways, language shapes our thinking and can reshape it as well.

A few days back, I had a brilliant interview for an NLP engineer internship that also covered some linguistic exercises. One of them asked me to point out the ambiguities in given sentences and the problems they can create in NLP. Not until then had I realized the importance of precisely understanding the language we use.

We generally use NER, POS tagging and dependency parsing during text preprocessing, but sometimes they fail to assign the correct labels to a few sentences. Such sentences may be ambiguous in nature, and here I will explain the research on ambiguous sentences that I did after the interview.

What does it mean to have an ambiguous sentence?

A sentence, word or phrase is said to be ambiguous when its meaning cannot be pinned down exactly and it admits more than one interpretation. Ambiguous sentences create problems for human understanding, let alone for computers that require spoon-feeding.

Types of ambiguities:

Continuing my research on the ambiguities, I found there are many exciting forms that are worth mentioning:

  • General ambiguities (multi-meaning sentences)

In the example “Stolen painting found by the tree”, a reader unfamiliar with human conventions can understand it in two ways: either the painting was found lying near the tree, or the tree found the painting (read the example again). The same could happen with an NLP model if these kinds of ambiguities are not addressed separately.

  • Attachment ambiguities

If it is uncertain which entity a phrase attaches to, attachment ambiguity occurs. “She saw the man with the telescope” presents attachment ambiguity, since it is not clear to whom the telescope belongs. The man could be carrying the telescope, or the woman could be using it to see the man. Either is possible, and thus the sentence is ambiguous.
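
To make the two readings concrete, here is a minimal sketch of a CKY-style chart parser. The grammar and lexicon are a hand-written toy that I am assuming purely for illustration (they are not taken from any library or from the interview), and the names `BINARY_RULES`, `LEXICON` and `parse_trees` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # PP attaches to the verb: she used the telescope
    ("NP", "NP", "PP"),  # PP attaches to the noun: the man has the telescope
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> categories it can take.
LEXICON = {
    "she": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def parse_trees(words):
    """Return every parse of `words` as a bracketed string (CKY-style chart)."""
    n = len(words)
    chart = defaultdict(list)  # (start, end, label) -> list of bracketed trees
    # Fill in the single-word spans from the lexicon.
    for i, word in enumerate(words):
        for label in LEXICON[word]:
            chart[(i, i + 1, label)].append(f"({label} {word})")
    # Combine smaller spans into larger ones, shortest spans first.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for split in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    for left_tree in chart.get((start, split, left), []):
                        for right_tree in chart.get((split, end, right), []):
                            chart[(start, end, parent)].append(
                                f"({parent} {left_tree} {right_tree})"
                            )
    return chart.get((0, n, "S"), [])


sentence = "she saw the man with the telescope"
for tree in parse_trees(sentence.split()):
    print(tree)
```

With this toy grammar the sentence gets two trees: one where the “with the telescope” phrase sits inside the noun phrase “the man”, and one where it attaches to the verb phrase. A real parser has to pick between them, usually with probabilities learned from data.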

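The picture below shows the famous joke that opens with “One morning I shot an elephant in my pajamas.” Given only this first line, one can get confused about whether the elephant or the man was in the pajamas (though we all know the reality). Since it is a joke, the author clears it up in the second line, wondering how the elephant got into his pajamas :)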

Figure: Attachment ambiguity

  • Verb-based ambiguities

“Each of us saw her duck” shows verb-based ambiguity, since one cannot tell whether the woman was performing the action of ducking or whether we saw a real duck that belongs to her. Both readings are possible, and the sentence alone cannot settle which one is meant.
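
The root of this ambiguity is that both “her” and “duck” can belong to more than one word class. A small sketch of that idea, assuming a hypothetical hand-made lexicon (`CANDIDATE_TAGS`) and a hand-picked list of tag sequences I treat as grammatical after “saw” (`GRAMMATICAL`):

```python
from itertools import product

# Hypothetical mini-lexicon: each word lists the categories it can take.
CANDIDATE_TAGS = {
    "her": ["possessive determiner", "object pronoun"],
    "duck": ["noun", "verb"],
}

# Tag sequences assumed grammatical after "saw", with a paraphrase of each reading.
GRAMMATICAL = {
    ("possessive determiner", "noun"): "we saw the duck that belongs to her",
    ("object pronoun", "verb"): "we saw her lower her head",
}


def readings(phrase):
    """Enumerate tag sequences for `phrase` and keep the grammatical ones."""
    words = phrase.lower().split()
    options = [CANDIDATE_TAGS[word] for word in words]
    return [
        (tags, GRAMMATICAL[tags])
        for tags in product(*options)
        if tags in GRAMMATICAL
    ]


for tags, meaning in readings("her duck"):
    print(tags, "->", meaning)
```

Two tag sequences survive the filter, one per reading. A POS tagger has to commit to one of them, which is why such sentences can end up with the wrong labels.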

  • Coreference ambiguities

As the name suggests, it is generally the pronouns in a sentence that cause the problem in understanding. For example: “My sister and I met my lawyer for a coffee but she became ill and had to leave.” It is unclear whether the sister or the lawyer became ill, and that leaves us with a coreference ambiguity.
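
One way to see why “she” is a problem is to list the earlier mentions it could refer to. The sketch below is only a toy filter on person and gender. The mention list is annotated by hand (an assumption, not the output of any model), and `Mention`, `PRONOUN_FEATURES` and `candidate_antecedents` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    text: str
    person: int            # 1 = first person, 3 = third person
    gender: Optional[str]  # None when the text does not reveal it


# Hand-annotated mentions that precede the pronoun (an assumption, not model output).
MENTIONS = [
    Mention("my sister", person=3, gender="female"),
    Mention("I", person=1, gender=None),
    Mention("my lawyer", person=3, gender=None),
]

# Person and gender required by each pronoun.
PRONOUN_FEATURES = {"she": (3, "female"), "he": (3, "male")}


def candidate_antecedents(pronoun: str, mentions: List[Mention]) -> List[str]:
    """Keep mentions whose person matches and whose gender is compatible or unknown."""
    person, gender = PRONOUN_FEATURES[pronoun]
    return [
        m.text
        for m in mentions
        if m.person == person and m.gender in (gender, None)
    ]


print(candidate_antecedents("she", MENTIONS))
```

Both “my sister” and “my lawyer” pass the filter, so simple agreement features cannot resolve the pronoun. A coreference system would need extra context or world knowledge to choose.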

Figure: Coreference ambiguity

  • Framing and expression of sentences

One last discussion is about how sentences are framed and expressed. Sometimes there are many ways to say exactly the same thing, which is again a problem for a dumb machine. For example, “She gave the book to Adam” and “She handed Adam the book” are readily understood by humans as the same statement, but to a machine they are two different sentences.
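
A rough sketch of this gap between surface form and meaning follows. The two regular-expression patterns and the verb table are hypothetical and only cover these two example sentences. A real system would rely on a parser or semantic role labelling rather than hand-written patterns.

```python
import re

# Hypothetical verb normalisation table.
VERB_LEMMAS = {"gave": "give", "handed": "give"}

# Hypothetical surface patterns and the role each captured group plays.
PATTERNS = [
    (re.compile(r"(\w+) (\w+) the (\w+) to (\w+)"),
     ("giver", "verb", "thing", "recipient")),
    (re.compile(r"(\w+) (\w+) (\w+) the (\w+)"),
     ("giver", "verb", "recipient", "thing")),
]


def tokens(sentence):
    """Lowercase, drop a trailing period and split on whitespace."""
    return sentence.lower().rstrip(".").split()


def jaccard(a, b):
    """Word-overlap similarity: shared words divided by all distinct words."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    return len(set_a & set_b) / len(set_a | set_b)


def to_frame(sentence):
    """Map a sentence onto a who-gave-what-to-whom frame, or None if no pattern fits."""
    text = " ".join(tokens(sentence))
    for pattern, roles in PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            frame = dict(zip(roles, match.groups()))
            frame["verb"] = VERB_LEMMAS.get(frame["verb"], frame["verb"])
            return frame
    return None


first = "She gave the book to Adam"
second = "She handed Adam the book"
print(first == second)           # the raw strings differ
print(jaccard(first, second))    # the word overlap is only partial
print(to_frame(first) == to_frame(second))
```

The strings differ and their word overlap is only partial, yet both sentences reduce to the same frame once the verb and the roles are normalised. That normalisation is exactly the part a machine does not get for free.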

The discussion is endless and that’s the beauty!

References: https://literarydevices.net/ambiguity/, CS295 course, https://literaryterms.net/ambiguity/

Vaibhav Tiwari: Data Science Intern at Trell | IIIT Jabalpur | Enjoys writing on ML and abstract topics