Naive Bayes Algorithm

Arpit Pathak
ML_with_Arpit_Pathak
4 min read · Jun 20, 2020

Hello readers, this blog explains a machine learning algorithm called Naive Bayes. We will dive into the basic concepts of Naive Bayes and understand how it works. This algorithm uses some of the basic principles of probability to make predictions.

Basic Terminologies

Before going into the Naive Bayes algorithm, let us first understand some terminology related to probability. For this, let us consider the example of a standard pack of 52 cards.

1) Independent Events

Let us consider two events —

  1. Drawing a Queen card from the pack of 52 cards.
  2. Drawing a King card from the pack of 52 cards.

Now, let us find the probability of each event. Since a full deck contains 4 Queens and 4 Kings, P(Queen) = 4/52 = 1/13 and P(King) = 4/52 = 1/13.

These two events do not rely on each other (each draw is made from a full deck), so they are not correlated; such events are called independent events.
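To make the numbers concrete, here is a small sketch (not from the original article) that computes these probabilities with exact fractions, assuming each card is drawn from a full 52-card deck:

```python
from fractions import Fraction

# A standard deck has 52 cards, with 4 cards of each rank.
p_queen = Fraction(4, 52)  # P(drawing a Queen) from a full deck
p_king = Fraction(4, 52)   # P(drawing a King) from a full deck

# For independent events, the joint probability is the product.
p_both = p_queen * p_king

print(p_queen)  # 1/13
print(p_both)   # 1/169
```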

2) Dependent Events

Now let us consider 2 more events as follows —

  1. Drawing a Queen card from the pack of 52 cards.
  2. Drawing another Queen card from the remaining pack of 51 cards.

Now, let us find the probability of these two events. The first event has probability 4/52, but once a Queen has been removed, only 3 Queens remain among 51 cards, so the second event has probability 3/51.

We can see that in the above two cases, the second event depends on the outcome of the first event. So, the second event is known as a dependent event.
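The same arithmetic in code, as a sketch using exact fractions:

```python
from fractions import Fraction

p_first_queen = Fraction(4, 52)               # 4 Queens among 52 cards
p_second_queen_given_first = Fraction(3, 51)  # 3 Queens left among 51 cards

# For dependent events: P(E1 and E2) = P(E1) * P(E2 | E1)
p_both_queens = p_first_queen * p_second_queen_given_first

print(p_both_queens)  # 1/221
```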

3) Conditional Probability

Now, if we consider Dependent Event 2 above, we can express it in the form —

P(event2 | event1) = 3/51

This says that the probability of Event 2, given that Event 1 has already occurred, is 3/51. This is known as conditional probability. The general formula for conditional probability is —

P(B | A) = P(A ∩ B) / P(A)

4) Bayes’ Theorem

This is the theorem used in the Naive Bayes algorithm for training and prediction. Bayes’ Theorem states that —

P(A | B) = P(B | A) · P(A) / P(B)

Here, P(A | B) is the posterior, P(B | A) is the likelihood, P(A) is the prior, and P(B) is the evidence.
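As a quick numerical check, Bayes’ Theorem, P(A | B) = P(B | A) · P(A) / P(B), can be applied to the card example above (an illustrative sketch, not from the original article):

```python
from fractions import Fraction

# A: first card drawn is a Queen, B: second card drawn is a Queen.
p_a = Fraction(4, 52)          # P(A): 4 Queens among 52 cards
p_b = Fraction(4, 52)          # P(B): by symmetry, also 4/52
p_b_given_a = Fraction(3, 51)  # P(B | A): 3 Queens left among 51 cards

# Bayes' Theorem: P(A | B) = P(B | A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(p_a_given_b)  # 1/17, i.e. the same as 3/51
```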

Naive Bayes Algorithm

Naive Bayes is a supervised machine learning algorithm that works on the principle of Bayes’ Theorem and is used for classification tasks. Its most important assumption is that every feature in the dataset is independent of every other feature given the class label — this is the “naive” assumption.

Let us try to understand the working of the Naive Bayes algorithm mathematically with an example.

Let us consider a dataset with 2 independent features, Xa and Xb, and one dependent feature, Y. This Y is the output of our model, and it takes a binary value of Yes or No only. The dataset has ‘p’ records in all.

The probability of “Yes” or “No” in Y is called the “prior” —

P (Yes) = (no. of Yes) / p

P (No) = (no. of No) / p
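To illustrate, here is a sketch that computes the priors from a hypothetical toy dataset; the records and feature values are invented for demonstration:

```python
# Hypothetical toy dataset: each record is (Xa, Xb, Y).
records = [
    ("sunny", "hot", "No"),
    ("sunny", "mild", "No"),
    ("rainy", "mild", "Yes"),
    ("rainy", "cool", "Yes"),
    ("sunny", "cool", "Yes"),
]

p = len(records)  # 'p' records in all
prior_yes = sum(1 for *_, y in records if y == "Yes") / p  # (no. of Yes) / p
prior_no = sum(1 for *_, y in records if y == "No") / p    # (no. of No) / p

print(prior_yes, prior_no)  # 0.6 0.4
```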

Bayes’ Theorem for this dataset can be written as —

P(Y | Xa, Xb) = P(Xa, Xb | Y) · P(Y) / P(Xa, Xb)

With the naive independence assumption, the likelihood factorizes as P(Xa, Xb | Y) = P(Xa | Y) · P(Xb | Y). This is the form of Bayes’ Theorem used for training on this dataset and making predictions.

Now, since there are two outputs for Y, i.e. “Yes” and “No”, the Naive Bayes algorithm computes a score for each output as follows —

P(Yes | Xa, Xb) ∝ P(Xa | Yes) · P(Xb | Yes) · P(Yes)

P(No | Xa, Xb) ∝ P(Xa | No) · P(Xb | No) · P(No)

(The denominator P(Xa, Xb) is the same for both classes, so it can be dropped when comparing them.)

Now, to find the output, the absolute probabilities of “Yes” and “No” are calculated by normalizing these scores —

Pa(Yes) = P(Yes | Xa, Xb) / (P(Yes | Xa, Xb) + P(No | Xa, Xb))

Pa(No) = P(No | Xa, Xb) / (P(Yes | Xa, Xb) + P(No | Xa, Xb))

Now, if the value of Pa(Yes) is higher, the output is “Yes”; otherwise, it is “No”. This is how Naive Bayes makes a prediction.
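Putting the whole procedure together, here is a minimal from-scratch sketch of Naive Bayes prediction. The toy records and the helper names (`likelihood`, `predict`) are illustrative assumptions, not from the original article:

```python
# Hypothetical toy dataset: each record is (Xa, Xb, Y).
records = [
    ("sunny", "hot", "No"),
    ("sunny", "mild", "No"),
    ("rainy", "mild", "Yes"),
    ("rainy", "cool", "Yes"),
    ("sunny", "cool", "Yes"),
]

def likelihood(feature_index, value, label):
    """Estimate P(X = value | Y = label) by counting records."""
    rows = [r for r in records if r[2] == label]
    return sum(1 for r in rows if r[feature_index] == value) / len(rows)

def predict(xa, xb):
    """Return the normalized probabilities Pa(Yes) and Pa(No)."""
    p = len(records)
    scores = {}
    for label in ("Yes", "No"):
        prior = sum(1 for r in records if r[2] == label) / p
        # Naive assumption: features are independent given the class,
        # so the class score is the product of per-feature likelihoods.
        scores[label] = likelihood(0, xa, label) * likelihood(1, xb, label) * prior
    # Normalize the scores to get absolute probabilities.
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

probs = predict("rainy", "cool")
print(max(probs, key=probs.get))  # the class with the higher probability
```

In practice, add-one (Laplace) smoothing is usually applied to the likelihood counts so that a feature value never seen with a class does not zero out that class’s entire score.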

That is all for this blog. Thank you for reading!
