Naive Bayes — A Simple but Powerful Classifier

Abdul Hafeez Fahad
Published in Red Buffer
4 min read · Jul 9, 2021

Naive Bayes is a probabilistic machine learning algorithm based on Bayes' Theorem. This article provides an introduction to the Naive Bayes classifier, its types, and how it can help solve certain classification problems.

Naive Bayes is one of the simplest machine learning algorithms and can be used effectively in classification problems. Despite the advances in machine learning over the past few years, Naive Bayes has proven to be not just simple but also fast and reliable.

Let’s dig deeper and understand how Naive Bayes can help us solve classification problems.

Principle of Naive Bayes Classifier

As stated above, Naive Bayes is a probabilistic machine learning algorithm, so it relies entirely on Bayes' Theorem, which is as follows:

P(A|B) = (P(B|A) * P(A)) / P(B)

Let’s understand the terms that compose this equation:

P(B|A) — Likelihood: the probability of observing evidence B given that A is true

P(B) — Evidence: the probability of observing B on its own

P(A) — Prior probability: the probability of A before seeing any evidence

P(A|B) — Posterior probability: the probability of A occurring given the evidence B

Now that we have some background on Naive Bayes, let’s take a step further and look at an example to understand this algorithm in more depth.

Example

Let’s say you and your friend have planned to go out for a picnic. As a precaution, you check the weather report, and the forecast is for a sunny day, but when you look out the window, dark clouds are gathering. It might rain and ruin your picnic, or it might stay sunny and your picnic will go ahead. Let’s ask Naive Bayes to make the decision for us…

To make the right decision, we should gather some statistics about cloudy days. Here is what we know about cloudy days…

50% of rainy days start off cloudy

40% of days start cloudy

10% of days in July are rainy

According to Bayes' Theorem:

P(Rain|Cloud) = (P(Rain) * P(Cloud|Rain)) / P(Cloud)

P(Rain|Cloud) = (0.1 * 0.5) / 0.4

P(Rain|Cloud) = 0.125
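As a quick sanity check, the same calculation can be done in a few lines of Python (the variable names are my own):

```python
# Known quantities from the weather statistics above
p_rain = 0.10              # P(Rain): 10% of days in July are rainy
p_cloud = 0.40             # P(Cloud): 40% of days start cloudy
p_cloud_given_rain = 0.50  # P(Cloud|Rain): 50% of rainy days start off cloudy

# Bayes' Theorem: P(Rain|Cloud) = P(Cloud|Rain) * P(Rain) / P(Cloud)
p_rain_given_cloud = p_cloud_given_rain * p_rain / p_cloud
print(p_rain_given_cloud)  # 0.125
```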

From the above calculation, there is only a 12.5% chance of rain, so your decision should be to call your friend and go for the picnic!

Why Is Naive Bayes Called ‘Naive’?

While studying Naive Bayes, a question arises in everyone’s mind: why is it called naive? Naive Bayes assumes that each input variable is independent of the others. This is a strong assumption that is unrealistic for real data, and it is the reason the algorithm is called naive.

Now that you understand Naive Bayes more clearly, it’s time to talk about its types.

Types of Naive Bayes Classifier

There are three common types of Naive Bayes classifier:

  • Multinomial Naive Bayes

Multinomial Naive Bayes is mostly used for document classification. The features used by the classifier are the frequencies of the words present in the document.

  • Bernoulli Naive Bayes

Bernoulli Naive Bayes is similar to Multinomial Naive Bayes, but it is useful when your feature vectors are binary (i.e., 0s and 1s), recording whether a word occurs rather than how often.

  • Gaussian Naive Bayes

Gaussian Naive Bayes assumes that continuous features follow a normal (Gaussian) distribution.
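To make the Gaussian variant concrete, here is a minimal from-scratch sketch (in practice you would likely use scikit-learn’s `GaussianNB`; the toy data and function names here are my own). It estimates a prior plus a per-feature mean and variance for each class, then scores classes by their log-posterior:

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model, n = {}, len(X)
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # Floor the variance to avoid division by zero on constant features
        variances = [max(sum((v - m) ** 2 for v in c) / len(c), 1e-9)
                     for c, m in zip(cols, means)]
        model[label] = (len(rows) / n, means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Pick the class with the highest log-posterior under the fitted model."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            # Log of the Gaussian density N(v; m, var)
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data: two well-separated 2-D clusters
X = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [1.1, 1.0]))  # a
print(predict_gaussian_nb(model, [5.0, 5.0]))  # b
```

The same independence assumption shows up as the simple sum of per-feature log-densities: each feature contributes to the score on its own, with no interaction terms.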

Conclusion

The Naive Bayes algorithm is a supervised machine learning algorithm that is mainly used in spam filtering, sentiment analysis, and similar tasks. It is fast and easy to implement, but its main limitation is the assumption of independent predictor features.
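To see how this applies to spam filtering, here is a tiny word-count sketch in the multinomial style, with Laplace smoothing so unseen words don’t zero out a class (the four-document corpus is made up purely for illustration):

```python
import math
from collections import Counter

# Toy corpus (invented for illustration): token lists labeled spam/ham
docs = [
    (["win", "money", "now"], "spam"),
    (["free", "money", "offer"], "spam"),
    (["meeting", "schedule", "today"], "ham"),
    (["project", "meeting", "notes"], "ham"),
]

# Count words per class and how many documents each class has
word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for tokens, label in docs:
    word_counts[label].update(tokens)
    class_counts[label] += 1

vocab = {w for counts in word_counts.values() for w in counts}

def classify(tokens):
    """Score each class by log P(class) + sum of log P(word|class), Laplace-smoothed."""
    best, best_score = None, -math.inf
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for w in tokens:
            # Add-one smoothing: every word gets a pseudo-count of 1
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

print(classify(["free", "money"]))     # spam
print(classify(["meeting", "today"]))  # ham
```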

That’s all for now! Feel free to put a comment below if you have any suggestions or questions.
