Email classification into categories — Different Approaches using NLP.

Wired Wisdom
Published in DataSeries
4 min read · Aug 5, 2020

This story walks through different approaches to classifying emails into categories using natural language processing techniques.

Email Categorization

Time is money! Right? Yes, it is. Over a career, 10% or more of our office time goes to reading and responding to emails. Sometimes we have to dig up old emails, read them again, and reply. When we receive a bunch of emails (ham or spam), it is very time-consuming to work out which ones matter and which do not.

There are many open source projects for classifying emails as spam or ham. But there is not much work on classifying emails into different categories, which is badly needed. In my view, the main reason is the lack of proper data.

Let’s discuss how we can tackle this issue, looking at the different approaches taken by different open source contributors.

Latent Dirichlet Allocation (LDA)

LDA is one of the most widely used NLP techniques for determining topics from documents. It is a bag-of-words algorithm, so it ignores word order and looks only at which words occur and how often. From those counts it automatically discovers the topics contained within a set of documents: each document is treated as a mixture of topics, and each topic as a distribution over words.
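To make the idea concrete, here is a minimal sketch of LDA on a handful of made-up emails. It uses collapsed Gibbs sampling, one common way to fit LDA, written with only the Python standard library so every step is visible. The names `tokenize` and `fit_lda`, the sample emails, and the values chosen for `alpha`, `beta`, and `iterations` are all assumptions for illustration, not taken from any of the projects discussed here. In practice you would more likely reach for a library such as gensim or scikit-learn.

```python
import random
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would also drop stop words."""
    return text.lower().split()

def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents (illustrative only)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]         # tokens in doc d assigned to topic t
    topic_word = [Counter() for _ in range(num_topics)]  # times word w is assigned to topic t
    topic_total = [0] * num_topics                       # all tokens assigned to topic t

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Resample its topic with probability proportional to
                # (how much doc d uses the topic) * (how much the topic uses word w).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimates: per-document topic mixture and per-topic word distribution.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        {w: (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta) for w in vocab}
        for t in range(num_topics)
    ]
    return theta, phi

# Made-up sample emails, purely for illustration.
emails = [
    "meeting schedule project deadline meeting",
    "project deadline budget meeting report",
    "invoice payment bank account invoice",
    "payment bank transfer invoice account",
]
docs = [tokenize(e) for e in emails]
theta, phi = fit_lda(docs)

for t, dist in enumerate(phi):
    top_words = sorted(dist, key=dist.get, reverse=True)[:4]
    print(f"topic {t}: {top_words}")
for d, mixture in enumerate(theta):
    print(f"email {d}: {[round(p, 2) for p in mixture]}")
```

Each row of `theta` is one email's mixture over topics, and each entry of `phi` is one topic's distribution over words. The topics come out unlabeled, so turning them into named categories still means reading each topic's top words and deciding what it stands for.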
