Can machines be taught to correctly distinguish between loans and credit cards?

Antony Paulson Chazhoor
Aug 22 · 1 min read

Saving time is what machines do best. Imagine having the superpower to find, within seconds, the material you are interested in among a large collection of random posts.

This is exactly what my project worked towards: using natural language processing (NLP) tools to correctly identify the topic to which a Reddit post belongs.

Two highly similar topics were chosen for this project (“Loans” and “Credit Cards”), so that the resulting model could reliably differentiate even between closely related topics. The NLP techniques first identified the most frequent words across posts and counted their occurrences in each individual post. Following this, a logistic regression model, a Naïve Bayes model and a neural network model were trained on a random subset of the scraped data.
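To make the word-counting and classification steps concrete, here is a minimal from-scratch sketch of a bag-of-words multinomial Naïve Bayes classifier. It is not the project's actual code: the toy posts, the label names `loans` and `creditcards`, and the helper functions `tokenize`, `train_naive_bayes` and `predict` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train_naive_bayes(posts, labels):
    """Collect bag-of-words counts per topic and the number of posts per topic."""
    word_counts = {}         # topic -> Counter of word frequencies
    post_counts = Counter()  # topic -> number of training posts
    for text, label in zip(posts, labels):
        word_counts.setdefault(label, Counter()).update(tokenize(text))
        post_counts[label] += 1
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, post_counts, vocab

def predict(model, text):
    """Return the topic with the highest log-probability for the given text."""
    word_counts, post_counts, vocab = model
    total_posts = sum(post_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(post_counts[label] / total_posts)  # log prior
        for word in tokenize(text):
            if word in vocab:  # skip words never seen during training
                # Add-one (Laplace) smoothing avoids zero probabilities
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy posts, not the scraped Reddit data
posts = [
    "What interest rate can I expect on a personal loan?",
    "Refinancing my student loan with a lower monthly payment",
    "Which credit card has the best cashback rewards?",
    "My credit card annual fee went up and the limit went down",
]
labels = ["loans", "loans", "creditcards", "creditcards"]

model = train_naive_bayes(posts, labels)
print(predict(model, "Should I refinance my loan?"))
print(predict(model, "Best rewards card with no annual fee"))
```

On real data, a library implementation (for example a count vectorizer feeding a Naïve Bayes or logistic regression estimator) would likely be the more practical choice; the sketch only shows the idea behind the counts.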

The logistic regression model correctly differentiated between posts, achieving an accuracy of 95%. The Naïve Bayes and neural network models were not far behind, with a classification accuracy close to 93%. Overall, the project was highly successful and served as a great starting point for classifying text.
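For readers unfamiliar with the metric, accuracy is simply the fraction of posts whose predicted topic matches the true one, usually measured on posts the model did not see during training. The sketch below assumes a random hold-out split; the post does not state the actual split or test-set size, so `train_test_split`, its `test_fraction` and the 20 toy labels are hypothetical.

```python
import random

def train_test_split(items, test_fraction=0.25, seed=42):
    """Shuffle a copy of the items and split off a held-out test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(true_labels, predicted_labels):
    """Fraction of posts whose predicted topic matches the true topic."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical labels for 20 held-out posts: 19 predicted correctly, 1 wrong
true_labels = ["loans"] * 10 + ["creditcards"] * 10
predicted = ["loans"] * 10 + ["creditcards"] * 9 + ["loans"]
print(accuracy(true_labels, predicted))  # 0.95
```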

The resultant machine learning model could additionally be adapted for post filtering, post identification, etc. Further analysis could also be done to identify associative keywords for various topics.
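One possible way to approach the keyword analysis, shown here only as a sketch, is to rank words by how much more likely they are in one topic than in the other (a smoothed log-odds ratio). The function `top_keywords` and the toy posts are assumptions for illustration and were not part of the project.

```python
import math
import re
from collections import Counter

def count_words(posts):
    """Count word occurrences across a list of posts."""
    return Counter(w for p in posts for w in re.findall(r"[a-z']+", p.lower()))

def top_keywords(posts_a, posts_b, k=3):
    """Return the k words most associated with topic A relative to topic B."""
    count_a, count_b = count_words(posts_a), count_words(posts_b)
    vocab = set(count_a) | set(count_b)
    total_a, total_b = sum(count_a.values()), sum(count_b.values())

    def log_odds(word):
        # Add-one smoothing keeps the ratio finite for words missing in one topic
        p_a = (count_a[word] + 1) / (total_a + len(vocab))
        p_b = (count_b[word] + 1) / (total_b + len(vocab))
        return math.log(p_a / p_b)

    # Sort by descending log-odds, breaking ties alphabetically
    return sorted(vocab, key=lambda w: (-log_odds(w), w))[:k]

# Hypothetical toy posts, not the scraped Reddit data
loan_posts = ["loan interest rate loan", "student loan payment"]
card_posts = ["credit card rewards", "credit card fee card"]

print(top_keywords(loan_posts, card_posts, k=1))
print(top_keywords(card_posts, loan_posts, k=2))
```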

Project Links:

Please view these in sequence to understand the project in its totality:

  1. Presentation to get a high level overview
  2. Data extraction from Reddit
  3. Exploratory Data Analysis
  4. Machine Learning model application and evaluation

Written by Antony Paulson Chazhoor

Data Engineer @ View | Data Scientist | Problem Solver | Solution oriented insight builder
