Advanced NLP (Natural Language Processing) with spaCy

Gani Çalışkan
Turk Telekom Bulut Teknolojileri
May 31, 2023

What is spaCy?

spaCy is an open-source NLP library written in Python and Cython. It is published under the MIT license, and its main developers are Matthew Honnibal and Ines Montani, the founders of the software company Explosion.

1) Getting Started

Let’s start with a few exercises in Python, each using a different language.

a) English

b) Spanish

c) German
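The three language exercises above can be sketched together as follows. This is only an illustration: the sample sentences are made up, and it assumes blank pipelines created with `spacy.blank`, which contain just the language-specific tokenizer and need no model download.

```python
import spacy

# Hypothetical sample sentences, one per language code.
samples = {
    "en": "This is a sentence.",
    "es": "¿Cómo estás?",
    "de": "Liebe Grüße!",
}

tokens_by_lang = {}
for code, text in samples.items():
    # A blank pipeline only tokenizes; it makes no predictions.
    nlp = spacy.blank(code)
    doc = nlp(text)
    tokens_by_lang[code] = [token.text for token in doc]
    print(code, "->", doc.text)
```

Each call to `nlp(text)` returns a `Doc` object, which the following sections build on.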

2) Documents, spans and tokens

In this exercise, we tokenize a text to create a document object and then display the 4th token of the sentence.

a) Step 1
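A minimal sketch of this step, assuming a blank English pipeline and a made-up sentence (the text used in the original exercise is not shown). Tokens are indexed from zero, so the 4th token is `doc[3]`.

```python
import spacy

nlp = spacy.blank("en")

# Hypothetical example sentence.
doc = nlp("spaCy makes text processing in Python fast.")

# Indexing a Doc returns a Token; index 3 is the 4th token.
fourth_token = doc[3]
print(fourth_token.text)
```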

b) Step 2

In this step, I display several consecutive words from the sentence by slicing the doc object, which returns a span.
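As a rough sketch of this step, slicing a `Doc` with Python's slice notation returns a `Span`. The sentence and the slice boundaries below are assumptions chosen for illustration.

```python
import spacy

nlp = spacy.blank("en")

# Hypothetical example sentence.
doc = nlp("spaCy makes text processing in Python fast.")

# A slice of the Doc is a Span; the end index is exclusive.
text_processing = doc[2:4]
print(text_processing.text)

in_python = doc[4:6]
print(in_python.text)
```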

3) Lexical Attributes

In this example, we use spaCy’s lexical attributes to find percentages in a text.
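One way this could look, assuming a blank English pipeline and a made-up sentence: the `like_num` attribute flags tokens that resemble numbers, and a number counts as a percentage when the next token is a percent sign.

```python
import spacy

nlp = spacy.blank("en")

# Hypothetical example text.
doc = nlp("Sales grew by 25% in 2022, while costs fell by 7%.")

percentages = []
for token in doc:
    # like_num is a lexical attribute: it needs no trained model.
    if token.like_num and token.i + 1 < len(doc):
        next_token = doc[token.i + 1]
        if next_token.text == "%":
            percentages.append(token.text)

print("Percentages found:", percentages)
```

Note that "2022" also resembles a number, but it is skipped because it is not followed by "%".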

4) Loading Pipelines
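A hedged sketch of loading a trained pipeline. It assumes the small English package `en_core_web_sm`, which has to be installed separately with `python -m spacy download en_core_web_sm`. The helper name `load_english_pipeline` and the fallback to a blank pipeline are illustrative choices, not part of the original exercise.

```python
import spacy


def load_english_pipeline(name="en_core_web_sm"):
    """Load a trained English pipeline, or fall back to a blank one."""
    try:
        return spacy.load(name)
    except OSError:
        # Raised when the package is not installed. A blank pipeline
        # still tokenizes, but it makes no predictions.
        return spacy.blank("en")


nlp = load_english_pipeline()

# Names of the components in the loaded pipeline (empty for a blank one).
print(nlp.pipe_names)

# Hypothetical example sentence.
doc = nlp("Apple is looking at new ways to grow.")
print(doc.text)
```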

5) Predicting Linguistic Annotations

Part 1

In this exercise, I tag each word of the text from the previous exercise with its part of speech, for example adjective, noun, verb, pronoun, proper noun, number, or punctuation.
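A minimal sketch of part-of-speech tagging, assuming the `en_core_web_sm` pipeline is installed and using a made-up sentence in place of the original text. If the package is missing, the sketch falls back to a blank pipeline, in which case the `pos_` values are empty strings.

```python
import spacy

try:
    nlp = spacy.load("en_core_web_sm")
except OSError:
    # Assumption: without the trained package, no tags are predicted.
    nlp = spacy.blank("en")

# Hypothetical example sentence.
doc = nlp("She bought two very cheap tickets to Berlin.")

# token.pos_ holds the coarse-grained part-of-speech tag as a string.
tagged = [(token.text, token.pos_) for token in doc]
for text, pos in tagged:
    print(f"{text:<12}{pos}")
```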

The output shows the tag of each word in the sentence.

Part 2

6) Predicting named entities in context

Here we use a trained pipeline to predict the named entities in a sentence about the iPhone 15 release date.
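A sketch of what reading named entities can look like, assuming the `en_core_web_sm` pipeline and a made-up sentence. Which entities the model actually finds depends on the model version, so no output is shown here. With the blank fallback, `doc.ents` is simply empty.

```python
import spacy

try:
    nlp = spacy.load("en_core_web_sm")
except OSError:
    # Assumption: without the trained package, no entities are predicted.
    nlp = spacy.blank("en")

# Hypothetical example sentence.
doc = nlp("Rumors say Apple will announce the iPhone 15 release date in September.")

# doc.ents holds the predicted entity spans; label_ is the entity type.
entities = [(ent.text, ent.label_) for ent in doc.ents]
for text, label in entities:
    print(f"{text:<16}{label}")
```

If the model misses a product name such as "iPhone 15", a span can still be selected by hand with a slice of the doc.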

7) Using Matcher
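A minimal sketch of the rule-based `Matcher`, assuming spaCy v3 (where `matcher.add` takes a list of patterns) and a blank English pipeline. The pattern name, pattern, and sentence are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")

# The matcher shares the pipeline's vocabulary.
matcher = Matcher(nlp.vocab)

# Each dict describes one token: the exact text "iPhone",
# followed by a token made up of digits.
pattern = [{"TEXT": "iPhone"}, {"IS_DIGIT": True}]
matcher.add("IPHONE_PATTERN", [pattern])

# Hypothetical example sentence.
doc = nlp("The iPhone 15 release date was leaked.")

# Each match is a (match_id, start, end) tuple of token indices.
matched = [doc[start:end].text for _, start, end in matcher(doc)]
print(matched)
```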

8) Writing Match Patterns

Part 1

Part 2

Now let’s try to match words in a sentence that mentions my favourite games :). Detecting the word “download” is the key point: spaCy will catch which of these programs have been downloaded, are still downloading, or haven’t been downloaded yet.
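One possible pattern for this, as a sketch only: the sentence and game names are made up, and the forms of "download" are listed explicitly so that a blank pipeline is enough. Matching on the `LEMMA` attribute instead would require a trained pipeline.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# A form of "download" followed by a title-cased token (the program name).
pattern = [
    {"LOWER": {"IN": ["download", "downloaded", "downloading"]}},
    {"IS_TITLE": True},
]
matcher.add("DOWNLOAD_PATTERN", [pattern])

# Hypothetical example sentence.
doc = nlp(
    "I downloaded Fortnite yesterday, I am still downloading Minecraft, "
    "and I will download Tetris later."
)

downloads = [doc[start:end].text for _, start, end in matcher(doc)]
for match_text in downloads:
    print("Match found:", match_text)
```

The verb form in each match tells the three cases apart: "downloaded" is finished, "downloading" is in progress, and "download" is still to come.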

That is the end of the article. I hope you enjoyed it :)
