Word Sense Disambiguation
A story of how ambiguous word meanings caused an AI Winter.
The history of Artificial Intelligence has seen numerous peaks and troughs. Hype around what machines can accomplish leads to boosts in AI funding, while unmet expectations cripple the industry until the next breakthrough. The term AI Winter refers to a period of reduced funding and interest in artificial intelligence development.
Machine Translation and the Cold War
During the Cold War, there was increased interest in Machine Translation as a way to automate the translation of Russian documents into English. This period also coincided with major advances in linguistics and the early career of the famed linguist Noam Chomsky.
In addition to his research on grammar, Noam Chomsky was a consultant to several military-backed NLP projects. The looming fear of nuclear war, combined with a newfound understanding of grammar, set the stage for a wave of AI hype.
Common Sense Knowledge Problem
Humans often take common sense for granted. Our brains are so well wired that we seldom consider the complexity of even our simplest inferences. This was a big problem in the 1960s, when the same was expected of machine translation.
Today, this type of prior knowledge is typically instilled in AI systems by training them on massive amounts of data. If we train our word embeddings on a corpus drawn from the entire internet, our AI is much better positioned to pick up common knowledge, such as that lemons are sour, that friendship is good and that mom is another word for mother.
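As a rough illustration of how embeddings can capture that mom and mother mean nearly the same thing, here is a minimal Python sketch. The three-dimensional vectors are invented for this example and are not taken from any trained model; real embeddings have hundreds of dimensions and are learned from text.

```python
import math

# Toy vectors, made up for illustration only (an assumption, not real model output).
EMBEDDINGS = {
    "mom": [0.9, 0.8, 0.1],
    "mother": [0.85, 0.75, 0.2],
    "lemon": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Words with similar meanings should point in similar directions.
sim_synonyms = cosine_similarity(EMBEDDINGS["mom"], EMBEDDINGS["mother"])
sim_unrelated = cosine_similarity(EMBEDDINGS["mom"], EMBEDDINGS["lemon"])
print(sim_synonyms > sim_unrelated)
```

With these toy numbers, mom sits much closer to mother than to lemon, which is the kind of relationship a translation system can exploit.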
Word Sense Disambiguation
Researchers underestimated the complexity of word sense disambiguation. Most words carry multiple meanings, and the intended meaning often has to be inferred from the surrounding words or from the context of the document. The (relatively) simple translation systems of that time were not able to do that.
The word pen, for instance, could refer to:
- a writing utensil
- an enclosure for livestock
- a female swan
… amongst other things.
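To make the idea of inferring a sense from surrounding words concrete, here is a minimal Python sketch in the spirit of the simplified Lesk algorithm: it picks the sense whose definition shares the most words with the sentence. The glosses, the stopword list and the function names are all hypothetical and written by hand for this example; this is not how the 1960s systems worked, nor how modern translators do it.

```python
import re

# Hand-written glosses for the three senses of "pen" (assumed for illustration).
SENSES = {
    "writing utensil": "an instrument for writing or drawing with ink on paper",
    "livestock enclosure": "a small fenced enclosure for keeping farm animals such as sheep or pigs",
    "female swan": "an adult female swan, a large water bird",
}

# A tiny, hypothetical stopword list so that filler words do not count as evidence.
STOPWORDS = {"a", "an", "the", "for", "or", "with", "on", "of", "in",
             "as", "such", "to", "and", "is", "was", "his", "her"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(sentence, senses=SENSES):
    """Return the sense whose gloss overlaps most with the sentence.

    Ties (including zero overlap everywhere) fall back to the first sense listed.
    """
    context = tokenize(sentence)
    scores = {sense: len(context & tokenize(gloss)) for sense, gloss in senses.items()}
    return max(scores, key=scores.get)


print(disambiguate("The farmer kept the sheep in a fenced pen"))
print(disambiguate("She signed the paper with a pen full of ink"))
```

Even this toy version shows why the problem is hard: it only works when the sentence happens to reuse words from the definition, and it silently guesses the first sense when there is no overlap at all.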
This New York Times article from the 1980s mentions one such gaffe from the early days of Machine Translation.
the spirit is willing, but the flesh is weak
translated to Russian and back into English becomes:
the vodka is good, but the meat is rotten
… which completely distorts its meaning.
(Temporary) End of Machine Translation
Interest in Machine Translation began to fade after researchers failed to solve word sense disambiguation and other NLP problems.
In 1966, the Automatic Language Processing Advisory Committee (ALPAC) published a report that concluded that Machine Translation was slower and less accurate than human translation. The report recommended that future research focus on speeding up human translation instead.
Research on Machine Translation largely came to a halt (for the time being) after the National Research Council withdrew all funding.
Machine Translation Today
In the past decade, we have made massive strides in deep learning, data collection and feature representation. As a result, Machine Translation has once again become an area of active development: Google, Microsoft and several other large tech companies are all working on their own translation systems.
Google Neural Machine Translation, a deep learning approach to translation, has been rolled out to over a hundred languages supported by Google Translate.
With the current speed of AI development, the future holds nothing but excitement for Machine Translation.