Syntactic Parsing practices in NLP: Constituency and Dependency Parsing

Published in Plain Simple Software · 4 min read · Oct 10, 2022

Let’s begin with the term “parsing”. Given a sentence or other text data, parsing is the task of performing syntactic analysis and building a parse tree from that data. The tree highlights the syntactic structure of a sentence according to the formal rules of a grammar. Put simply, it breaks a sentence into its grammatical parts, such as nouns, verbs, and the phrases built around them. In Natural Language Processing, we need to know how the words in a sentence relate to one another, so that we understand how the linguistic structure of the data is built. The two most commonly cited types of parsing are Dependency Parsing and Constituency Parsing. This article explores a bit of both.

Python is a groovy programming language with some great libraries built just for Natural Language Processing. One of my favorites is the Stanza NLP library, which I’ll use for the examples in this article.

Dependency Parsing

Which words depend on other words in our text data? Which word is the “head” word in charge, so to speak? These are the questions dependency parsing answers. It is the task of extracting a grammatical structure that defines the relationships between “head” words and the words that depend on them. Thinking of it as a graph, the words are nodes and the dependencies between them are edges. Some call this a Dependency Structure. The structure has a defined root word, and every other word can be reached from that root (is ultimately dependent on it). In the following example, we review the sentence “I’m gonna make him an offer he can’t refuse”, a popular line from The Godfather. Here we see that the Stanza NLP library has used its DepparseProcessor to accurately label the dependency structure and parts of speech.
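A minimal sketch of how a parse like this could be produced with Stanza, assuming the `stanza` package is installed and the English models have already been fetched with `stanza.download("en")`. The helper names `parse_dependencies` and `format_dependencies` are hypothetical. The processor list mirrors the examples in Stanza's documentation, and some Stanza versions may also expect `mwt` in that list for English.

```python
def parse_dependencies(text):
    """Run Stanza's dependency parser on `text` and return the parsed document.

    Assumes `pip install stanza` and a prior `stanza.download("en")`.
    """
    import stanza  # imported lazily so the formatter below works without Stanza

    nlp = stanza.Pipeline(lang="en", processors="tokenize,pos,lemma,depparse")
    return nlp(text)


def format_dependencies(words):
    """Return one line per word: its id, text, head id, head text, and relation."""
    by_id = {word.id: word for word in words}
    lines = []
    for word in words:
        # A head id of 0 marks the root word of the sentence.
        head_text = by_id[word.head].text if word.head > 0 else "root"
        lines.append(
            f"id: {word.id}\tword: {word.text}\thead id: {word.head}\t"
            f"head: {head_text}\tdeprel: {word.deprel}"
        )
    return lines
```

To use it, call `parse_dependencies` on the sentence, then pass each `sentence.words` from the returned document's `sentences` to `format_dependencies` and print the resulting lines. Keeping the formatting separate from the Stanza call means it can be tried on any objects that carry `id`, `text`, `head`, and `deprel` attributes.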

The root word in this sentence is “make”, the main idea behind the entire sentence. Its ID is 4 because it is the 4th token in the text data. Its HEAD ID is 0 and its DEPREL (dependency relation) is root, which marks it as the root of the sentence. All other words are connected to this word, directly or indirectly, by dependency arcs. For instance, the word “can’t” is dependent on the word “refuse”, as shown by its HEAD ID: 10 and HEAD: refuse. Next, the word “refuse” is dependent on the word “offer”, which in turn is dependent on the root word “make”. The relation label on each word is an important feature: it describes how the word relates to its head word. For instance, the words “I” and “he” are both labeled as nominal subjects (nsubj): “I” is the subject of the root word “make”, and “he” is the subject of “refuse”. In addition, the word “gonna” is labeled as an adjectival modifier, as it acts as an adjectival clause used as a modifier.
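The chain from “can’t” up to the root can be traced by following head ids. The sketch below hard-codes only the four words in that chain. The ids for “offer” and “can’t” (7 and 9) are assumptions based on a ten-token split of the sentence, and `path_to_root` is a hypothetical helper, not part of Stanza.

```python
# word id -> (word, head id); a head id of 0 marks the root.
# Ids 4 and 10 come from the text above; ids 7 and 9 are assumed for illustration.
HEADS = {
    4: ("make", 0),
    7: ("offer", 4),
    9: ("can't", 10),
    10: ("refuse", 7),
}


def path_to_root(word_id, heads):
    """Follow head ids from a word up to the root and return the words visited."""
    path = []
    while word_id != 0:
        word, head_id = heads[word_id]
        path.append(word)
        word_id = head_id
    return path


print(" -> ".join(path_to_root(9, HEADS)))  # can't -> refuse -> offer -> make
```

Every word in a dependency structure has exactly one head, so this walk always ends at the root.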

Constituency Parsing

Where Dependency Parsing is based on dependency grammar, Constituency Parsing is based on context-free grammar. This type of parsing deals with the types of phrases in the text data. Constituency Parsing breaks text into sub-phrases, or constituents, each belonging to a grammatical category. Each constituent is its own unit of grammar. Some common categories are noun phrase (NP), verb phrase (VP), and prepositional phrase (PP). Using the Stanza NLP library’s ConstituencyParser, we can see the following output for a famous line from the movie A Few Good Men.
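A minimal sketch of how constituents could be pulled out of a Stanza parse, assuming `stanza` is installed with the English models downloaded. The helper names `parse_constituency`, `leaves`, and `phrases` are hypothetical. The code relies only on the `label` and `children` attributes of Stanza's tree nodes, and the processor list mirrors the documentation example (some versions may also expect `mwt`).

```python
def parse_constituency(text):
    """Return one constituency tree per sentence in `text` using Stanza."""
    import stanza  # lazy import: the helpers below only need .label and .children

    nlp = stanza.Pipeline(lang="en", processors="tokenize,pos,constituency")
    return [sentence.constituency for sentence in nlp(text).sentences]


def leaves(node):
    """Collect the words under a tree node, left to right."""
    if not node.children:
        return [node.label]  # a leaf stores the word itself as its label
    words = []
    for child in node.children:
        words.extend(leaves(child))
    return words


def phrases(node, wanted=("NP", "VP", "PP")):
    """List (label, text) for every NP, VP, or PP constituent in the tree."""
    found = []
    if node.children and node.label in wanted:
        found.append((node.label, " ".join(leaves(node))))
    for child in node.children:
        found.extend(phrases(child, wanted))
    return found
```

Printing a tree returned by `parse_constituency` shows its bracketed form, while `phrases(tree)` lists each noun, verb, and prepositional phrase next to the words it covers.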

Here the ConstituencyParser has separated this sentence into two phrases. “You” has been labeled a noun phrase. The rest of the sentence, “can’t handle the truth”, has been labeled a verb phrase. The parser then goes on to tag the individual words. The word “you” is tagged as a personal pronoun (PRP). The word “handle” is tagged as a verb. Even the word “the” is properly tagged as a determiner.
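The word-level tags can be read straight off a bracketed tree. The sketch below uses a hand-written bracketing in Penn Treebank style for the same sentence; it is an illustration written for this example, not captured Stanza output. The `tagged_words` helper and the small tag glossary are hypothetical conveniences.

```python
import re

# Hand-written Penn Treebank style bracketing (illustrative, not real parser output).
TREE = (
    "(ROOT (S (NP (PRP You)) "
    "(VP (MD ca) (RB n't) (VP (VB handle) (NP (DT the) (NN truth))))))"
)

# Short glossary for the Penn Treebank tags that appear in TREE.
TAG_NAMES = {
    "PRP": "personal pronoun",
    "MD": "modal",
    "RB": "adverb",
    "VB": "verb, base form",
    "DT": "determiner",
    "NN": "noun, singular",
}


def tagged_words(bracketed):
    """Return (tag, word) pairs for every innermost '(TAG word)' group."""
    return re.findall(r"\(([^\s()]+) ([^\s()]+)\)", bracketed)


for tag, word in tagged_words(TREE):
    print(f"{word}\t{tag}\t{TAG_NAMES.get(tag, 'unknown tag')}")
```

Only the innermost groups hold a tag and a word, so the regular expression skips phrase-level labels such as NP and VP.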

Brief Summary

In this article, we explored two main parsing processes, Dependency Parsing and Constituency Parsing, and their differences. If you enjoyed this brief tutorial, please follow me, Z. Myricks, for more guides on all things Python and Natural Language Processing. Make sure to follow Plain Simple Software for more software articles!

Written by Skull and Zen™️