Chapter 1: Supervised Learning and Naive Bayes Classification — Part 1 (Theory)

Savan Patel
Apr 30, 2017 · 4 min read

Welcome to the stepping stone of Supervised Learning. We first discuss a small scenario that will form the basis of our discussion. Next, we cover some math on posterior probability, also known as Bayes Theorem, which is the core of the Naive Bayes classifier. Finally, we explore Python's sklearn library and write a small piece of code applying a Naive Bayes classifier to the problem we discuss at the beginning.

This chapter is divided into two parts. Part one describes how the Naive Bayes classifier works. Part two consists of a programming exercise in Python using the sklearn library, which provides Naive Bayes classifiers. We then discuss the accuracy of the model that we train.

Imagine two people, Alice and Bob, whose word usage patterns you know. To keep the example simple, let's assume that Alice uses the words [love, great, wonderful] more often, while Bob uses the words [dog, ball, wonderful] more often.

Let's assume you received an anonymous email whose sender can be either Alice or Bob. Let's say the content of the email is: “I love beach sand. Additionally, the sunset at the beach offers a wonderful view.”

Can you guess who the sender might be?

Well, if you guessed Alice, you are correct. Perhaps your reasoning was that the content has the words love and wonderful, which are words Alice uses often.

Now let's add probabilities to the data we have. Suppose Alice and Bob use the following words with the probabilities shown below. Now, can you guess who the sender is for the content: “Wonderful Love”?

Probability of word usage of Alice and Bob

Now what do you think?

If you guessed Bob, you are correct. If you know the mathematics behind it, good for you. If not, don't worry, we shall work through it in the next section. This is where we apply Bayes Theorem.

Bayes Theorem

P(A|B) = P(B|A) × P(A) / P(B)

It tells us how often A happens given that B happens, written P(A|B), when we know how often B happens given that A happens, written P(B|A), and how likely A and B are on their own.

  • P(A|B) is “Probability of A given B”, the probability of A given that B happens
  • P(A) is Probability of A
  • P(B|A) is “Probability of B given A”, the probability of B given that A happens
  • P(B) is Probability of B

When P(Fire) means how often there is fire, and P(Smoke) means how often we see smoke, then:

P(Fire|Smoke) means how often there is fire when we see smoke.
P(Smoke|Fire) means how often we see smoke when there is fire.

So the formula kind of tells us “forwards” when we know “backwards” (or vice versa).

Example: If dangerous fires are rare (1%) but smoke is fairly common (10%) due to factories, and 90% of dangerous fires make smoke, then:

P(Fire|Smoke) = P(Fire) × P(Smoke|Fire) / P(Smoke) = (1% × 90%) / 10% = 9%

So in this case, 9% of the time that we see smoke, it means there is a dangerous fire.
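
The same arithmetic, as a minimal sketch in Python, using only the numbers from the example above:

# Bayes Theorem applied to the fire/smoke example
p_fire = 0.01               # P(Fire): dangerous fires are rare (1%)
p_smoke = 0.10              # P(Smoke): smoke is fairly common (10%)
p_smoke_given_fire = 0.90   # P(Smoke|Fire): 90% of dangerous fires make smoke

# P(Fire|Smoke) = P(Fire) * P(Smoke|Fire) / P(Smoke)
p_fire_given_smoke = p_fire * p_smoke_given_fire / p_smoke
print(p_fire_given_smoke)   # 0.09, i.e. 9%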

Now, can you apply this to our Alice and Bob example?
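
Here is one way it could look in Python. The word probabilities below are hypothetical stand-ins (the exact values from the table above are not reproduced here), chosen only to show the mechanics of picking the sender with the highest posterior probability:

# Hypothetical word-usage probabilities (stand-ins, not the values from the table above)
alice = {"love": 0.1, "great": 0.8, "wonderful": 0.1}
bob = {"love": 0.5, "great": 0.2, "wonderful": 0.3}

# Assume either sender is equally likely before reading the email: P(Alice) = P(Bob) = 0.5
prior = 0.5

def score(word_probs, words):
    # Prior times the product of the per-word probabilities
    p = prior
    for w in words:
        p = p * word_probs.get(w, 0.0)
    return p

message = ["wonderful", "love"]
print("Alice:", score(alice, message))  # 0.5 * 0.1 * 0.1 = 0.005
print("Bob:  ", score(bob, message))    # 0.5 * 0.3 * 0.5 = 0.075 -> Bob is more likely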

Naive Bayes Classifier

The Naive Bayes classifier calculates the probability of every candidate outcome (in the email example, the candidates Alice and Bob) for the given input features. Then it selects the outcome with the highest probability.

This classifier assumes the features (in our case, the words of the input) are independent of each other. Hence the word naive. Even with this assumption, it is a powerful algorithm used for:

  • Real-time prediction
  • Text classification / spam filtering
  • Recommendation systems

So mathematically, we can write it as follows.

If we have a certain event E and candidate outcomes x1, x2, x3, and so on, we first calculate P(x1|E), P(x2|E), … [read as the probability of x1 given that event E happened] and then select the candidate x with the maximum probability value.


I hope this explains well what the Naive Bayes classifier is. In the next part we shall use sklearn in Python and implement a Naive Bayes classifier for labelling emails as either Spam or Ham. Comment in the section below if you need any help or have any suggestions.
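
As a small preview of Part 2, here is a minimal sketch of what such a classifier can look like with sklearn's MultinomialNB. The toy emails and labels below are made up for illustration and are not the dataset used in Part 2:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data (made up for illustration)
emails = [
    "win money now",              # spam
    "limited offer win prize",    # spam
    "meeting at noon tomorrow",   # ham
    "project report attached",    # ham
]
labels = ["spam", "spam", "ham", "ham"]

# Turn each email into word-count features, then fit the classifier
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)
clf = MultinomialNB()
clf.fit(X, labels)

# Classify a new email
print(clf.predict(vectorizer.transform(["win a free prize"])))  # -> ['spam']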

Code and implement the email classification into spam and non-spam here (Part 2 of chapter 1).

Read about Support Vector Machine in chapter 2 here.

