Stop Words in NLP

All about stop words in natural language processing (NLP), along with hands-on examples.

Photo by Jose Aragones on Unsplash

“Stop words” usually refers to the most common words in a language. There is no universal list of stop words shared by all NLP tools; each library ships its own list.

What are stop words?

When to remove stop words?

Pros and Cons:

How to remove stop words in Python using:

using NLTK to remove stop words
[Figure: tokenized vector with and without stop words, using NLTK]
[Figure: list of the 179 NLTK stop words]
$ pip install -U spacy
$ python -m spacy download en_core_web_sm
using spaCy to remove stop words
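A minimal sketch of stop word removal with spaCy. It assumes a blank English pipeline is enough, since tokenization and the `is_stop` flag come from spaCy's English language defaults; loading `en_core_web_sm` with `spacy.load` should behave the same way for this purpose. The function name `remove_stop_words_spacy` and the sample sentence are illustrative.

```python
import spacy
from spacy.lang.en.stop_words import STOP_WORDS

# Blank English pipeline: tokenizer plus lexical attributes such as is_stop.
nlp = spacy.blank("en")


def remove_stop_words_spacy(text):
    """Tokenize `text` with spaCy and drop tokens flagged as stop words."""
    doc = nlp(text)
    return [token.text for token in doc if not token.is_stop]


sample = "This is a sample sentence showing off the stop words filtration."
print([token.text for token in nlp(sample)])  # tokens with stop words
print(remove_stop_words_spacy(sample))        # tokens without stop words
print(len(STOP_WORDS))                        # size of spaCy's English list
```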
[Figure: tokenized vector with and without stop words, using spaCy]
[Figure: list of the 326 spaCy stop words]
using gensim to remove stop words
[Figure: tokenized vector with and without stop words, using gensim]
[Figure: list of the 337 gensim stop words]
custom stop words list
Example:
my_stopword_list = ['the', 'is', 'as', 'a', 'are', 'in', 'this', 'that']
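A custom list like this one can be applied without any NLP library. The sketch below assumes simple lowercase, whitespace-based tokenization; the function name `remove_custom_stop_words` and the sample sentence are illustrative.

```python
my_stopword_list = ['the', 'is', 'as', 'a', 'are', 'in', 'this', 'that']

# A set gives fast membership checks.
my_stopword_set = set(my_stopword_list)


def remove_custom_stop_words(text, stop_words=my_stopword_set):
    """Lowercase `text`, split on whitespace, and drop tokens in `stop_words`."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in stop_words]


print(remove_custom_stop_words("This is a cat in the hat"))
```

Because the tokenizer only splits on whitespace, punctuation stays attached to words (for example `hat.`), so a real pipeline would likely tokenize more carefully before filtering.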

Conclusion:

