Dependency parsers in Natural Language Processing (NLP)
Dependency parsers are tools that allow us to analyze sentences, with particular focus on their grammatical structure.
Dependency parsers pass each sentence through a set of rules or a trained statistical model, which lets us determine the dependencies that make up that sentence.
Dependency parsers show how words depend on “head” words and how those dependent words change or modify their heads.
To illustrate this, let us use the open source library spaCy. spaCy is great for many NLP tasks and has an excellent dependency parser (another great tool for this task comes from Stanford NLP).
We have used spaCy for many different purposes, including producing lists of product categories and building a website categorization API (see https://hub.docker.com/r/categorizations/websitecategorizationapi).
Let us take the sentence “They are driving fast” and apply the dependency parser to it. The result is the following image:
How to interpret this diagram?
The arrow connecting “driving” and “fast” means that “fast” modifies the word “driving”. Their relation is identified by a special tag, in this case advmod (adverbial modifier).
Some other dependency relations for modifiers are:
- acl: clausal modifier of noun (adjectival clause)
- advcl: adverbial clause modifier
- amod: adjectival modifier
- appos: appositional modifier
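To make the head and modifier relationship concrete, here is a minimal sketch that stores a parse of “They are driving fast” as plain (dependent, relation, head) triples and lists what modifies a given head. The arcs are written by hand for illustration: only the advmod arc comes from the text above, the other labels (nsubj, aux, ROOT) are assumed, and `modifiers_of` is a hypothetical helper, not part of any library.

```python
# Hand-written arcs for "They are driving fast" as (dependent, relation, head).
# Only the advmod arc is taken from the text; the other labels are assumptions.
# The root is written as its own head, which is also how spaCy marks the root.
ARCS = [
    ("They", "nsubj", "driving"),
    ("are", "aux", "driving"),
    ("driving", "ROOT", "driving"),
    ("fast", "advmod", "driving"),
]


def modifiers_of(head, arcs):
    """Return (dependent, relation) pairs attached to `head`, skipping the root's self-arc."""
    return [(dep, rel) for dep, rel, h in arcs if h == head and dep != h]


# Print each arc in relation(head, dependent) notation.
for dependent, relation in modifiers_of("driving", ARCS):
    print(f"{relation}(driving, {dependent})")
```

With spaCy, the same triples can be read from a parsed sentence through each token's `text`, `dep_`, and `head.text` attributes instead of being typed by hand.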
Note that the same sentence can have several possible parse trees. Syntactic disambiguation is the selection of the correct parse tree from among these possibilities.
You can generate the dependency diagram for “They are driving fast” with the following code:
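As a rough illustration of disambiguation, the sketch below takes a classic ambiguous sentence, “I saw the man with the telescope”, and picks between two candidate trees by summing per-arc scores. Everything here is an assumption made for the example: the sentence, the simplified arcs (determiners are left out), the names `CANDIDATES`, `ARC_SCORES`, `tree_score`, and `disambiguate`, and above all the scores, which are invented rather than produced by a real parser.

```python
# Arcs shared by both readings, as (dependent, relation, head) triples.
SHARED = [
    ("I", "nsubj", "saw"),
    ("man", "dobj", "saw"),
    ("telescope", "pobj", "with"),
]

# The two readings differ only in where "with" attaches.
CANDIDATES = {
    "instrument": SHARED + [("with", "prep", "saw")],  # saw using the telescope
    "possession": SHARED + [("with", "prep", "man")],  # the man has the telescope
}

# Hypothetical arc scores; a real parser would learn these from data.
ARC_SCORES = {
    ("with", "prep", "saw"): 0.6,
    ("with", "prep", "man"): 0.4,
}


def tree_score(arcs):
    """Sum the scores of all arcs in a tree; unscored arcs contribute 0."""
    return sum(ARC_SCORES.get(arc, 0.0) for arc in arcs)


def disambiguate(candidates):
    """Return the name of the highest-scoring candidate tree."""
    return max(candidates, key=lambda name: tree_score(candidates[name]))


print(disambiguate(CANDIDATES))
```

Summing independent arc scores is only one simple way to rank trees; it is used here because it keeps the selection step easy to follow.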
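import spacy
from spacy import displacy

# Load the small English pipeline (install it first with: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")
doc = nlp("They are driving fast.")
# Start a local web server that renders the dependency diagram
displacy.serve(doc, style="dep")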
You can also perform this kind of analysis with online tools for dependency parsing.
You can pass the above text to this tool:
to obtain a somewhat nicer diagram, ready for publication:
Where do we need dependency parsing?
Dependency parsing use cases are numerous:
- grammar checking
- information extraction
- URL categorization
- aspect-based sentiment analysis
- web content filtering
- named entity recognition
- question answering
- website categorization
- document summarization
- OCR image-to-text extraction
- AI lead generation tools
- analysis of website technologies
For example, we have built a special dependency parser for an aspect-based sentiment analysis project.
Aspect-Based Sentiment Analysis (ABSA) is a special type of sentiment analysis that segments opinions by aspect or feature and then proceeds to determine the sentiment with respect to that aspect.
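The following toy sketch shows one way dependency arcs can feed ABSA: every adjectival modifier (amod) arc links an opinion word to the aspect it describes. It is not the parser mentioned above. The example sentence, the hand-written arcs, the two-word `LEXICON`, and the `aspect_sentiments` helper are all assumptions made for illustration.

```python
# Hand-written arcs for "The restaurant has delicious pizza but slow service",
# as (dependent, relation, head) triples. Labels are illustrative assumptions.
ARCS = [
    ("restaurant", "nsubj", "has"),
    ("delicious", "amod", "pizza"),
    ("pizza", "dobj", "has"),
    ("slow", "amod", "service"),
    ("service", "conj", "pizza"),
]

# Hypothetical opinion lexicon; a real system would use a much larger resource.
LEXICON = {"delicious": "positive", "slow": "negative"}


def aspect_sentiments(arcs, lexicon):
    """Map each aspect noun to the polarity of an adjective that modifies it."""
    result = {}
    for dependent, relation, head in arcs:
        # In an amod arc the head is the aspect and the dependent is the opinion word.
        if relation == "amod" and dependent in lexicon:
            result[head] = lexicon[dependent]
    return result


print(aspect_sentiments(ARCS, LEXICON))
```

A single sentence yields two different sentiments here, one per aspect, which is the distinction ABSA draws compared with sentence-level sentiment analysis.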
A great source of data sets for ABSA training is the SemEval project; for example, we have used https://alt.qcri.org/semeval2014/task4/ in the past.
The latest tasks (for 2022) are available at https://semeval.github.io/SemEval2022/.