
Implementing a Naive Bayes classifier for text categorization in five steps

The Naive Bayes classifier guide I wish I had before

gustavo
TDS Archive
8 min read · Feb 28, 2019

Thomas Bayes (1701–1761), author of Bayes’ theorem

Naive Bayes is a supervised learning algorithm commonly applied to text classification.

Some of the applications of the Naive Bayes classifier are:

  • (Automatic) Classification of emails into folders, so that incoming messages go into folders such as “Family”, “Friends”, “Updates”, “Promotions”, etc.
  • (Automatic) Tagging of job listings. Given a job listing in raw text format, we can assign it tags such as “software development”, “design”, “marketing”, etc.
  • (Automatic) Categorization of products. Given a product description, we can assign it to categories such as “Books”, “Electronics”, “Clothing”, etc.

The remainder of this article will provide the necessary background and intuition to build a Naive Bayes classifier from scratch, in five steps.

Step 1. Identify the prerequisites to train a Naive Bayes classifier

As seen above, the Naive Bayes classifier has many applications in text classification. The only prerequisite is an existing set of examples for each category (class) that we wish to…
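To make the prerequisite concrete, here is a minimal sketch of what such a set of labeled examples could look like in Python. The product descriptions and the helper names `group_by_category` and `missing_categories` are invented for illustration; only the category names come from the list of applications above.

```python
from collections import defaultdict

# Hypothetical labeled examples as (text, category) pairs.
# The texts are made up for illustration; a real training set would be much larger.
training_examples = [
    ("A paperback novel about a detective in Lisbon", "Books"),
    ("Hardcover cookbook with 200 vegetarian recipes", "Books"),
    ("Wireless headphones with noise cancellation", "Electronics"),
    ("USB-C charger compatible with most laptops", "Electronics"),
    ("Cotton t-shirt available in three colors", "Clothing"),
    ("Waterproof winter jacket with a hood", "Clothing"),
]


def group_by_category(examples):
    """Return a dict mapping each category to the list of its example texts."""
    grouped = defaultdict(list)
    for text, category in examples:
        grouped[category].append(text)
    return dict(grouped)


def missing_categories(examples, categories):
    """Return the categories that have no example at all in the training set."""
    grouped = group_by_category(examples)
    return [category for category in categories if not grouped.get(category)]


examples_by_category = group_by_category(training_examples)
for category, texts in examples_by_category.items():
    print(f"{category}: {len(texts)} example(s)")
```

Calling `missing_categories(training_examples, ["Books", "Toys"])` would flag “Toys”, a category we might want to predict but have no examples for yet.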
