Naïve Bayes Theorem

Understanding Naïve Bayes Theorem

Gajendra
6 min read · Dec 22, 2022

Naïve Bayes Theorem

Naïve Bayes is a classification technique based on Bayes’ Theorem with an assumption of independence among predictors. In simple terms, a Naïve Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.

For example, a fruit may be considered an apple if it is red, round, and about 3 inches in diameter. Even if these features depend on each other or on the existence of other features, all of these properties independently contribute to the probability that this fruit is an apple, and that is why it is known as ‘Naïve’.

Let’s understand some of the key concepts related to Naïve Bayes Theorem.

Conditional Probability

Conditional Probability is defined as the likelihood of an event or outcome occurring, given that another event or outcome has already occurred. It is calculated by dividing the probability of both events occurring together by the probability of the event being conditioned on.

P(A | B) = P(A ∩ B) / P(B)

Similarly,

P(B | A) = P(B ∩ A) / P(A)
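
To make this concrete, here is a minimal sketch in Python with made-up counts; none of these numbers come from the article, they only illustrate the formula above.

```python
# A minimal sketch with made-up counts: estimating P(A | B) from observed data.
total = 100          # total observations
count_B = 40         # observations where B occurred
count_A_and_B = 10   # observations where both A and B occurred

p_B = count_B / total              # P(B)
p_A_and_B = count_A_and_B / total  # P(A ∩ B)

p_A_given_B = p_A_and_B / p_B      # P(A | B) = P(A ∩ B) / P(B)
print(p_A_given_B)                 # 0.25
```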

Bayes’ Theorem

Bayes’ Theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. Bayes’ Theorem provides a way to revise existing predictions or theories (update probabilities) given new or additional evidence.

From the conditional probability we know,

P(A | B) = P(A ∩ B) / P(B) and P(B | A) = P(B ∩ A) / P(A)

Also, per the law of probability, the probability of events A and B both occurring is the same as the probability of B and A both occurring.

P(A ∩ B) = P(B ∩ A)

So,

P(A | B) · P(B) = P(B | A) · P(A)

Therefore,

P(A | B) = P(B | A) · P(A) / P(B)
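
As a quick illustration, here is a minimal sketch with made-up numbers showing how a prior belief is updated into a posterior using Bayes’ Theorem; the values are for illustration only.

```python
# A minimal sketch with made-up numbers: applying Bayes' Theorem.
p_A = 0.3            # prior P(A)
p_B_given_A = 0.8    # likelihood P(B | A)
p_B = 0.5            # evidence P(B)

p_A_given_B = p_B_given_A * p_A / p_B  # posterior P(A | B)
print(p_A_given_B)                     # 0.48
```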

Mathematics

Let’s assume we have a dataset as described here,

A feature vector X = (x₁, x₂, …, xₙ) and a target variable y (the class we want to predict).

Per Bayes’ Theorem we get,

P(y | x₁, …, xₙ) = P(x₁, …, xₙ | y) · P(y) / P(x₁, …, xₙ)

Assuming the features are independent of each other, the likelihood factorizes, so

P(y | x₁, …, xₙ) = P(y) · P(x₁ | y) · P(x₂ | y) · … · P(xₙ | y) / P(x₁, …, xₙ)

Now, the denominator below is constant for any outcome, i.e., it will always give the same value regardless of the class y.

P(x₁, x₂, …, xₙ)

So we can say,

P(y | x₁, …, xₙ) ∝ P(y) · P(x₁ | y) · P(x₂ | y) · … · P(xₙ | y)

Finally,

ŷ = argmax over y of P(y) · P(x₁ | y) · P(x₂ | y) · … · P(xₙ | y)

What this means is that the predicted output ŷ is the class y for which this expression has the highest value.
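
The decision rule can be written down directly. Below is a minimal sketch assuming the priors P(y) and the per-feature likelihoods P(xᵢ | y) have already been estimated; every number here is made up for illustration.

```python
# A minimal sketch of the Naïve Bayes decision rule. The priors and
# likelihoods below are made-up numbers, not estimates from a real dataset.
priors = {"Yes": 0.6, "No": 0.4}                   # P(y)
likelihoods = {                                    # P(x_i | y)
    "Yes": {"Sunny": 0.3, "Hot": 0.4},
    "No":  {"Sunny": 0.6, "Hot": 0.5},
}

def predict(features):
    scores = {}
    for y, prior in priors.items():
        score = prior
        for value in features:
            score *= likelihoods[y][value]         # P(y) * P(x1|y) * ... * P(xn|y)
        scores[y] = score
    return max(scores, key=scores.get), scores     # argmax over the classes

print(predict(["Sunny", "Hot"]))   # -> 'No' wins, since 0.4*0.6*0.5 > 0.6*0.3*0.4
```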

Example

Let’s understand the concept with an example.

Binary Classification

Here we have a training dataset with the features Outlook and Temperature and the corresponding target variable ‘Play’ (indicating the possibility of playing). Now, we need to classify whether players will play or not based on the weather conditions.

Dataset: a table of weather records with the columns Outlook, Temperature and Play.

Problem: Will the players play Today if the Outlook is Sunny and the Temperature is Hot?

We can solve this problem using the method of posterior probability.

Today

We can read Today as a record with the feature Outlook as Sunny and Temperature as Hot.

Today = (Outlook = Sunny, Temperature = Hot)

Playing

As per Bayes’ Theorem,

P(Yes | Today) = P(Today | Yes) · P(Yes) / P(Today)

Per the Naïve Bayes independence assumption,

P(Yes | Today) ∝ P(Sunny | Yes) · P(Hot | Yes) · P(Yes)

After plugging values we get,

P(Sunny | Yes) · P(Hot | Yes) · P(Yes) ≈ 0.031

Not Playing

As per Bayes’ Theorem,

P(No | Today) = P(Today | No) · P(No) / P(Today)

Per the Naïve Bayes independence assumption,

P(No | Today) ∝ P(Sunny | No) · P(Hot | No) · P(No)

After plugging values we get,

P(Sunny | No) · P(Hot | No) · P(No) ≈ 0.085

Now, if we notice, the sum of the probabilities above is not equal to 1.

P(Yes | Today) + P(No | Today) = 0.031 + 0.085 = 0.116 ≠ 1

We want the sum of our probabilities to equal 1. To do so, we can simply normalize them.

P(Yes) = 0.031 / (0.031 + 0.085) ≈ 0.27

And,

P(No) = 0.085 / (0.031 + 0.085) ≈ 0.73

Since the probability of No is higher than the probability of Yes, we can conclude that the players will not play today.

We don’t necessarily need to normalize the probabilities, as we can infer directly from the original scores whether the output is Yes (0.031) or No (0.085).
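
For completeness, here is a minimal sketch of this worked example in Python. Since the full weather table is not reproduced above, the records below are hypothetical; only the mechanics (class scores followed by normalization) match the steps above.

```python
# A minimal sketch, assuming a hypothetical weather dataset (the original
# table is not reproduced here): score Yes vs. No for Outlook=Sunny, Temperature=Hot.
from collections import Counter

data = [  # (Outlook, Temperature, Play) -- made-up records for illustration
    ("Sunny", "Hot", "No"), ("Sunny", "Mild", "No"), ("Overcast", "Hot", "Yes"),
    ("Rain", "Mild", "Yes"), ("Rain", "Cool", "Yes"), ("Sunny", "Cool", "No"),
    ("Overcast", "Mild", "Yes"), ("Rain", "Hot", "No"), ("Sunny", "Hot", "Yes"),
    ("Overcast", "Cool", "Yes"),
]
play_counts = Counter(play for _, _, play in data)

def score(outlook, temp, play):
    """P(outlook | play) * P(temp | play) * P(play)."""
    n = play_counts[play]
    p_outlook = sum(1 for o, _, p in data if o == outlook and p == play) / n
    p_temp = sum(1 for _, t, p in data if t == temp and p == play) / n
    return p_outlook * p_temp * (n / len(data))

yes_score, no_score = score("Sunny", "Hot", "Yes"), score("Sunny", "Hot", "No")
print(yes_score, no_score)                  # unnormalized scores
print(no_score / (yes_score + no_score))    # normalized P(No | Today)
```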

Text Classification

Naïve Bayes is also very useful in NLP, especially for text classification and sentiment analysis. We slightly modify the equation for conditional probability for the text classification problem.

P(wᵢ | class) = (count(wᵢ, class) + 1) / (n + |Vocabulary|)

where n is the total number of words in the class and |Vocabulary| is the number of distinct words in the training set (the +1 avoids zero probabilities for unseen words).

Let’s look at an example to understand how Naïve Bayes does text classification.

Here is our sample dataset of reviews and their sentiment, Positive or Negative.

Reviews

The first step is creating the vocabulary — the collection of all the different words that occur in the training set.

Count Vectorizer
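
If you are using scikit-learn, CountVectorizer builds this vocabulary and the word counts for you. Below is a minimal sketch on made-up reviews, since the actual review texts from the table above are not reproduced here.

```python
# A minimal sketch: building the vocabulary and word counts with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer

reviews = ["great movie", "good acting", "poor movie", "boring plot"]  # made-up
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(reviews)      # review-by-word count matrix

print(vectorizer.get_feature_names_out())       # the vocabulary
print(counts.toarray())                         # word counts per review
```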

Now, we will calculate the conditional probabilities for each sentiment or class.

Positive

n = 8

|Vocabulary| = 8

Conditional Probabilities (the smoothed probability of each word in the vocabulary given the Positive class)

Negative

n = 8

|Vocabulary| = 6

Conditional Probabilities (the smoothed probability of each word in the vocabulary given the Negative class)
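
The same per-class word probabilities can be computed in a few lines of Python. The reviews below are made up, since the original table is not reproduced here; the point is the smoothed estimate (count + 1) / (n + |Vocabulary|).

```python
# A minimal sketch of P(w | class) = (count(w, class) + 1) / (n + |V|),
# using made-up reviews in place of the original training data.
from collections import Counter

positive_words = "great movie good movie".split()   # made-up positive reviews
negative_words = "poor movie boring plot".split()   # made-up negative reviews
vocabulary = set(positive_words) | set(negative_words)

def word_prob(word, class_words):
    counts = Counter(class_words)
    return (counts[word] + 1) / (len(class_words) + len(vocabulary))

print(word_prob("movie", positive_words))  # (2 + 1) / (4 + 6) = 0.3
print(word_prob("poor", positive_words))   # (0 + 1) / (4 + 6) = 0.1
```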

Now that we have trained our classifier, let’s classify a new sentence according to,

class = argmax over classes of P(class) · P(w₁ | class) · P(w₂ | class) · … · P(wₙ | class)

Our new sentence is “Poor movie”

Positive score: P(Positive) · P(poor | Positive) · P(movie | Positive)

Negative score: P(Negative) · P(poor | Negative) · P(movie | Negative)

Clearly, this review is negative (-), since the Negative class has the bigger score.
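
In practice, scikit-learn’s MultinomialNB does exactly this (word counts plus add-one smoothing). Here is a minimal end-to-end sketch trained on made-up reviews, since the original training data is not reproduced here.

```python
# A minimal end-to-end sketch with scikit-learn, trained on made-up reviews.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reviews = ["great movie", "good acting", "poor movie", "boring plot"]  # made-up
labels = ["positive", "positive", "negative", "negative"]

# CountVectorizer builds the vocabulary; MultinomialNB applies add-one smoothing by default.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(reviews, labels)

print(model.predict(["Poor movie"]))        # expected: ['negative']
print(model.predict_proba(["Poor movie"]))  # normalized class probabilities
```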

Credit: Andrew Ng, StatQuest and Krish Naik

I hope this article provides you with a good understanding of Naïve Bayes Theorem.

If you have any questions, or if you find anything misrepresented, please let me know.

Thanks!

