Data Science (Python) :: Naive Bayes

2 min read · Jul 23, 2017

The intention of this post is to give a quick refresher on the concept of “Naive Bayes” using Python (so it’s assumed that you are already familiar with the topic). You can treat this as an FAQ as well.

What’s Bayes’ Theorem?
It’s a theorem which gives the probability of an outcome given some observed evidence, computed from what is already known about how likely each outcome is and how likely the evidence is under each outcome: P(A | B) = [ P(B | A) * P(A) ] / P(B)
For example, let’s say there are 2 machines (M1 & M2). M1 produces 20 bulbs per hour and M2 produces 30 bulbs per hour. 3% of M1’s production is defective and 4% of M2’s production is defective. With this information in hand, we can apply Bayes’ theorem to calculate: “If we pick a defective bulb, what’s the probability that it came from M2?”
Probability of picking a bulb which is from machine 1 = P(M1) = 20 / (20 + 30) = 40% = 0.4
Probability of picking a bulb which is from machine 2 = P(M2) = 30 / (20 + 30) = 60% = 0.6
Let B denote the event that a randomly chosen item is defective.
If a bulb is from M1, then the probability of the bulb being defective = P(B | M1) = 3% = 0.03
If a bulb is from M2, then the probability of the bulb being defective = P(B | M2) = 4% = 0.04
Probability of picking up a bulb which is defective = P(B) = P(B | M1) * P(M1) + P(B | M2) * P(M2) = 0.03 * 0.4 + 0.04 * 0.6 = 0.012 + 0.024 = 0.036 (Thus, 3.6% of total production is defective. The two defect rates can’t simply be added; each one has to be weighted by its machine’s share of production.)
Now, we want to calculate: given that a randomly picked bulb is defective, what’s the probability that it came from M2?
P(M2 | B) = [ P(B | M2) * P(M2) ] / P(B) = [0.04 * 0.6 ] / 0.036 = 0.667 = 66.7%
So, there is about a 67% chance that a defective bulb was produced by M2! That’s higher than M2’s 60% share of production, because M2 also has the higher defect rate.
That’s Bayes’ Theorem for you. Interesting, right?
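
If you want to check the arithmetic yourself, here is a minimal sketch in plain Python. The variable names are my own (nothing here comes from a library); the numbers are the production and defect rates from the bulb example.

```python
# Production rates (bulbs per hour) and defect rates from the bulb example.
production = {"M1": 20, "M2": 30}
defect_rate = {"M1": 0.03, "M2": 0.04}  # P(B | M1), P(B | M2)

# Priors: each machine's share of total production, i.e. P(M1) and P(M2).
total = sum(production.values())
prior = {m: n / total for m, n in production.items()}

# Law of total probability: P(B) = sum over machines of P(B | M) * P(M).
p_defective = sum(defect_rate[m] * prior[m] for m in production)

# Bayes' theorem: P(M | B) = P(B | M) * P(M) / P(B).
posterior = {m: defect_rate[m] * prior[m] / p_defective for m in production}

print(f"P(B) = {p_defective:.3f}")             # P(B) = 0.036
print(f"P(M2 | B) = {posterior['M2']:.3f}")    # P(M2 | B) = 0.667
```

The two posteriors add up to 1, which is a handy sanity check: a defective bulb has to come from one of the two machines.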

****************************************

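Why is it called naive?
Because it’s based on an independence assumption: the classifier assumes that every feature is independent of every other feature once the class is known. Real-world features are rarely independent like that, so it’s a simplifying (thus, naive) assumption, although the classifier often works well in practice anyway.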
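
To see what the independence assumption buys you, here is a rough sketch in plain Python. Everything in it is hypothetical and only for illustration: the spam/ham classes, the two feature names and all the probabilities are made up. The point is that the likelihood of several features together is just the product of the per-feature likelihoods.

```python
# Hypothetical class priors and per-feature likelihoods (made-up numbers).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"contains_offer": 0.7, "contains_link": 0.8},
    "ham": {"contains_offer": 0.1, "contains_link": 0.3},
}


def naive_bayes_posterior(observed_features):
    """Return P(class | features) under the naive independence assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        # Naive step: multiply the individual P(feature | class) values,
        # as if the features were independent given the class.
        for feature in observed_features:
            score *= likelihoods[label][feature]
        scores[label] = score
    # Normalise so the posteriors sum to 1 (this plays the role of P(B)).
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


posterior = naive_bayes_posterior(["contains_offer", "contains_link"])
print(posterior)
```

Without the assumption, you would need the joint probability of every combination of features for each class, which quickly becomes impossible to estimate from a limited amount of data.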

****************************************

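Sample code for implementing a Naive Bayes classifier?
# GaussianNB assumes each feature is normally distributed within a class
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
# X_train = feature matrix, y_train = class labels (prepared beforehand)
classifier.fit(X_train, y_train)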
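
The snippet above expects X_train and y_train to exist already. As a self-contained sketch, the version below fills that gap with assumptions of my own: it uses scikit-learn’s built-in iris dataset and a 75/25 split, neither of which is part of the original example, so swap in your own data as needed.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: the iris dataset stands in for whatever data you actually have.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the classifier on the training rows only.
classifier = GaussianNB()
classifier.fit(X_train, y_train)

# Predict classes for the unseen rows and measure how often they are right.
y_pred = classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy on the test set: {accuracy:.2f}")
```

If you need class probabilities instead of hard labels, classifier.predict_proba(X_test) returns one column per class.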

****************************************

Next :- Data Science (Python) :: Decision Tree Classification & Random Forest Classification

Prev :- Data Science (Python) :: Kernel SVM (Kernel Support Vector Machine)

If you liked this article, please hit the ❤ icon below

Written by Sunil Kumar SV