Let’s make a Spam Classifier using Naive Bayes algorithm | NLP Part-1

Rahul Sood
Published in Batteries Included
2 min read · Apr 13, 2020

In this article we will use Natural Language Processing (NLP) to build a spam classifier. Let’s start creating it…

By the time you reach the midpoint of this short article, you should have a working spam classifier. And the best part is, you won’t need to install anything. Here is my plan: first I will give you some very basic information, then we will run the code and you will see the results for yourself. After that, we will look at what is working behind the scenes, with the details of the Naive Bayes algorithm to follow in the next article. The idea is to first give you the satisfaction of completing something and then deepen your knowledge. Sounds good? Let’s start…

I assume you have some knowledge of Python programming; if not, you can still pick it up along the way. I have created a notebook on Kaggle; please click here and ‘edit’ the notebook to run the code. One more thing: you might need to log in to Kaggle to run this code. Once you have the results, please come back here to learn about it in more detail.

Two main algorithms are working behind the scenes:

1. Bag of words: This algorithm converts the text into a frequency matrix, that is, a matrix showing how many times each word appears in the text. The best way to learn it is to implement it yourself. For that, I have created another notebook on Kaggle, which can be found here. Please try it yourself.
Bag of words algorithm

2. Naive Bayes algorithm: I will explain this algorithm in my next article. For now, I want to congratulate you on completing this tutorial.
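
To make the first of these two algorithms a little more concrete, here is a minimal sketch of a bag-of-words frequency matrix in plain Python. It is not the code from the Kaggle notebook; the function names `tokenize` and `bag_of_words` and the three sample messages are made up for illustration, and the tokenizer simply lowercases the text and splits on spaces.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(documents):
    """Return (vocabulary, matrix), where matrix[i][j] counts vocabulary[j] in documents[i]."""
    tokenized = [tokenize(doc) for doc in documents]
    # The vocabulary is every distinct word across all documents, sorted for a stable column order.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # word -> number of occurrences in this document
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical sample messages, only for illustration.
messages = [
    "win a free prize now",
    "free free entry",
    "see you at lunch",
]

vocabulary, matrix = bag_of_words(messages)
print(vocabulary)
for row in matrix:
    print(row)
```

Each row of the matrix stands for one message and each column for one word of the vocabulary. In the second message the word "free" appears twice, so its column holds a 2 in that row.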

I hope you have run the code yourself. If not, you can still click here to run it. It only takes a few Ctrl+Enter presses to run the notebook, but you will get the satisfaction of accomplishment (a reward), which will motivate you to learn further.
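
If you are curious how word counts can turn into a spam/ham decision, here is a rough sketch of a Naive Bayes classifier in plain Python. Please treat it as an illustration and not as the notebook's code: the helper names `train` and `predict`, the four training messages and their labels are all hypothetical, and I am assuming the common multinomial variant with add-one (Laplace) smoothing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(messages, labels):
    """Count messages per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocabulary


def predict(text, class_counts, word_counts, vocabulary):
    """Return the class with the highest log-probability score for the text."""
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from the log prior: how common this class is in the training data.
        score = math.log(class_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen during training are skipped
            # Add-one (Laplace) smoothing keeps an unseen word/class pair from zeroing the score.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data, only for illustration.
training_messages = [
    "win a free prize now",
    "free entry win cash",
    "see you at lunch",
    "are you free for lunch today",
]
training_labels = ["spam", "spam", "ham", "ham"]

model = train(training_messages, training_labels)
print(predict("free cash prize", *model))
print(predict("see you at lunch", *model))
```

With this toy data the first message is labelled spam and the second ham. The scores are added as logarithms instead of multiplying raw probabilities, which avoids very small numbers when a message has many words.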

I try not only to teach but also to keep my readers motivated. I hope you enjoyed the tutorial. I will continue this NLP series, so see you in my next article…
