spaCy Library Cheatsheet
May 30, 2024
- Library for Natural Language Processing
- Pre-trained statistical models and word vectors
- Convolutional neural network models for tagging, parsing and named entity recognition
- Interacts well with deep learning libraries
1- Sentence Segmentation
import spacy
# Load the small English model
# (the 'en' shortcut was removed in spaCy v3; install the model with:
#  python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')
text = "Twenty-two years after the original Jurassic Park failed, the new park, also known as Jurassic World, is open for business. After years of studying genetics, the scientists on the park genetically engineer a new breed of dinosaur, the Indominus Rex."
# Run the spaCy pipeline
sp_text = nlp(text)
# Segment into sentences
for sentence in sp_text.sents:
    print(sentence)
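If no trained model is installed, sentence boundaries can still be approximated with spaCy's rule-based `sentencizer` component. The sketch below is a swapped-in alternative to the parser-based segmentation above, not the same method: it assumes spaCy v3, uses a blank English pipeline, and splits only on sentence-final punctuation. The shortened sample text is made up for illustration.

```python
import spacy

# Blank English pipeline: tokenizer only, no trained components (assumes spaCy v3)
nlp = spacy.blank("en")
# Rule-based sentence splitter that breaks on ., ! and ?
nlp.add_pipe("sentencizer")

# Shortened sample text (illustrative, not from the original snippet)
text = "Jurassic World is open for business. The scientists engineer a new breed of dinosaur."
doc = nlp(text)

# Collect the text of each detected sentence
sentences = [sent.text for sent in doc.sents]
for sentence in sentences:
    print(sentence)
```

The rule-based splitter is faster than the parser but can be fooled by abbreviations and unusual punctuation, so the trained pipeline is usually the better choice when it is available.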
2- Tokenizing
import spacy
# Load the small English model
# (the 'en' shortcut was removed in spaCy v3; install the model with:
#  python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')
text = "Twenty-two years after the original Jurassic Park failed, the new park, also known as Jurassic World, is open for business. After years of studying genetics, the scientists on the park genetically engineer a new breed of dinosaur, the Indominus Rex."
# Run the spaCy pipeline
sp_text = nlp(text)
# Get tokens
for word in sp_text:
    print(word.text)
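Each token is more than a string: it also carries its position and a few lexical flags. A minimal sketch, assuming spaCy v3 and a blank English pipeline (tokenizer only, so no model download is needed); the short sample sentence is made up for illustration.

```python
import spacy

# Blank English pipeline: only the tokenizer runs (assumes spaCy v3)
nlp = spacy.blank("en")
doc = nlp("The park is open for business.")

# Plain token strings
tokens = [token.text for token in doc]

# Index, text, and two lexical flags for each token
for token in doc:
    print(token.i, token.text, token.is_alpha, token.is_punct)
```

Flags such as `is_alpha` and `is_punct` come from the tokenizer and vocabulary alone; attributes like part-of-speech tags or lemmas need a trained pipeline.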