Is Word Sense Disambiguation outdated?

How Target Sense Verification improves domain-specific and enterprise disambiguation settings

Photo by Hu Chen on Unsplash
  • the shortcomings of current Word Sense Disambiguation (WSD) and Entity Disambiguation (ED) task formulations
  • how Target Sense Verification (TSV) can improve this situation
  • a real-world setting of the TSV task
  • what a disambiguation model solving this task could look like

Ambiguous Words

Graphic by Will Heltsley on Wikipedia
Gif by sttiiaann on Imgur; Bonus: https://i.redd.it/o5eaiwv7jd711.jpg
Graphic borrowed from Camacho-Collados, José & Pilevar, Mohammad Taher. (2018). From Word To Sense Embeddings: A Survey on Vector Representations of Meaning. Journal of Artificial Intelligence Research.

Word Sense Disambiguation and Entity Disambiguation

An example of Word Sense Disambiguation against WordNet
An example of Entity Disambiguation against DBpedia senses of “Apple”: Apple, Apple, Apple, and Apple :)

Therefore, the current disambiguation task formulations are not suitable for many domain-specific and enterprise settings.

All the letters in ALPHABET ….

We are still missing the correct sense
How many Alphabets do you know?

To summarise:

  • Current disambiguation task formulations aim at finding the most suitable sense of a word in a given sentence.
  • This formulation requires systems to model the entire sense inventory, which makes them inflexible. It also assumes that all senses are available, which is not always the case.
  • Therefore, these task formulations are not suitable for many modern domain-specific and enterprise settings.
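To make the first two points concrete, here is a minimal sketch of the classic formulation, assuming a simple word-overlap score between the context and each sense gloss (a simplified Lesk-style heuristic, not the method of any particular WSD system). The inventory, glosses, and example sentences are made up for illustration. Note that the function always returns some sense from the inventory, even when the correct sense is not in it.

```python
import re

# Hypothetical toy sense inventory for "apple"; glosses are invented for illustration.
INVENTORY = {
    "apple_fruit": "round edible fruit that grows on a tree",
    "apple_company": "technology company that sells phones and computers",
}


def tokens(text):
    """Lowercase a text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(context, inventory):
    """Classic formulation: pick the best-scoring sense from the whole inventory."""
    ctx = tokens(context)
    # Score each sense by how many words its gloss shares with the context.
    return max(inventory, key=lambda sense: len(ctx & tokens(inventory[sense])))


# The fruit sense is in the inventory and shares the word "tree" with the context.
print(disambiguate("She picked an apple from the tree", INVENTORY))

# The musician sense is missing from the inventory, yet a sense is still returned.
print(disambiguate("Fiona Apple released a new album", INVENTORY))
```

The second call shows the weakness: the system has no way to say "none of these", so a missing sense silently turns into a wrong answer.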

But how can the disambiguation task formulation be improved?

Target Sense Verification to the rescue

Illustration of Target Sense Verification (TSV) task formulation
  • Existing enterprise and domain-specific senses can easily be used as target senses for this task, regardless of how they are currently represented in the use case environment. Because sense indicators are quite generic, they can usually be generated easily from all kinds of knowledge representations.
  • When creating a resource for your domain-specific senses, there is no need to take care of out-of-domain senses. Just focus on the things you are interested in!
  • Pre-training and domain adaptation can be more easily exploited, as the requirements within your domain are minimal.
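The task formulation can be sketched as a binary decision over a single target sense. The snippet below is only an illustration: it assumes that the sense indicators are a short definition plus a list of hypernyms, and it replaces a trained model with a naive word-overlap check. The names `TSVInstance`, `verify`, `STOPWORDS`, and the threshold are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Small, hand-picked stopword list; an assumption made for this sketch only.
STOPWORDS = {"a", "an", "the", "of", "in", "and", "that", "is", "on", "from"}


@dataclass
class TSVInstance:
    context: str        # sentence containing the target word
    target_word: str    # the ambiguous word to verify
    definition: str     # sense indicator: short definition of the target sense
    hypernyms: list = field(default_factory=list)  # sense indicator: broader terms


def tokens(text):
    """Lowercase a text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def verify(instance, threshold=1):
    """Return True if the context plausibly uses the target sense, else False."""
    indicators = tokens(instance.definition + " " + " ".join(instance.hypernyms))
    # Ignore the target word itself so that it cannot match its own sense.
    ctx = tokens(instance.context) - tokens(instance.target_word)
    overlap = (ctx & indicators) - STOPWORDS
    return len(overlap) >= threshold


company = dict(
    target_word="Apple",
    definition="technology company that sells phones and computers",
    hypernyms=["company", "corporation"],
)
print(verify(TSVInstance(context="Apple unveiled new phones and computers today", **company)))
print(verify(TSVInstance(context="She picked an apple from the tree", **company)))
```

Only the one sense of interest is described here. The fruit sentence is rejected without the fruit sense ever being modelled, which is the flexibility the bullet points above describe.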

A real-world Use Case!

The Data

The training set of WiC-TSV consists only of general-domain instances, while the test set contains domain-specific instances from the computer science (CPS), cocktail (CTL), and medical (MSH) domains.
Some examples from the WiC-TSV benchmark dataset.
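Because the test domains differ from the training domain, it can help to score each domain separately rather than report a single number. The following is a minimal sketch of that bookkeeping, assuming each test record is a `(domain, gold_label, predicted_label)` triple. The record layout and the function name are hypothetical, and the labels are invented for illustration; they are not results from the benchmark.

```python
from collections import defaultdict


def accuracy_by_domain(records):
    """Compute accuracy per domain from (domain, gold, predicted) triples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for domain, gold, predicted in records:
        total[domain] += 1
        correct[domain] += int(gold == predicted)
    return {domain: correct[domain] / total[domain] for domain in total}


# Made-up labels purely to show the bookkeeping; not benchmark results.
records = [
    ("CPS", True, True),
    ("CPS", False, True),
    ("CTL", True, True),
    ("MSH", False, False),
    ("MSH", True, False),
]
print(accuracy_by_domain(records))
```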

The Model

Architecture of our disambiguation model

The Performance

Want to know how good you are at Target Sense Verification? Try it out: https://www.surveymonkey.com/r/LHYWXPV

Closing and Further reading

Anna Breit

NLP and Graph ML Researcher, Computer Scientist, and Puzzle Enthusiast