NLP-Text Classification
Explore different methods
Text classification is one of the most important NLP tasks. I won’t go into the details of what text classification is.
We’ll discuss how to create multiclass text classification models using different methods.
I haven’t gone into much detail on each method, because the article would become very long. Instead, I’ve shared notebook links for reference.
Requirement
Create a text classification API that accepts a news article, or a sentence from a news article, and classifies its sentiment as POSITIVE, NEGATIVE or NEUTRAL.
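To make the requirement concrete, here is a minimal sketch of the input/output contract such an API could expose. The function name `classify_text`, the `predict_scores` callable, the label order and the response shape are all assumptions made for illustration; they are not taken from the actual project, and the dummy model only stands in for a trained one.

```python
from typing import Callable, Dict, Sequence

# Label order is an assumption; a real model defines its own class order.
LABELS = ("POSITIVE", "NEGATIVE", "NEUTRAL")


def classify_text(
    text: str, predict_scores: Callable[[str], Sequence[float]]
) -> Dict[str, object]:
    """Validate the input and map model scores to one of the three labels."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    scores = list(predict_scores(text.strip()))
    if len(scores) != len(LABELS):
        raise ValueError("expected one score per label")
    # Pick the index of the highest score.
    best = max(range(len(LABELS)), key=lambda i: scores[i])
    return {"label": LABELS[best], "scores": dict(zip(LABELS, scores))}


# Hypothetical stand-in for a trained model: always favours NEUTRAL.
def dummy_model(text: str) -> Sequence[float]:
    return [0.2, 0.1, 0.7]


result = classify_text("Markets closed flat on Tuesday.", dummy_model)
print(result["label"])
```

Any of the models described below could be plugged in as `predict_scores`, as long as it returns one score per label.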
Dataset
The data was collected using my project news_api and annotated using doccano.
Model Building
Input vectors used: TF-IDF, word embeddings from DistilBERT, word embeddings from a sentence transformer
Model techniques used: custom ML using scikit-learn, custom neural network using Keras, custom neural network using PyTorch, custom neural network using the Hugging Face Transformers Trainer API
Using combinations of the above, I’ve created 7 different models:
Notebooks:
1. Text Classification using TF-IDF and PyCaret
2. Text Classification using TF-IDF and custom Machine Learning
3. Text Classification using TF-IDF and custom Neural Network using Keras
4. Text Classification using DistilBERT embeddings and custom Machine Learning
5. Text Classification using DistilBERT embeddings and Neural Network using PyTorch
6. Text Classification using DistilBERT embeddings and Neural Network using Hugging Face Trainer API
7. Text Classification using sentence transformer embeddings and custom Neural Network using PyTorch
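As a rough illustration of the first input representation, here is a small pure-Python sketch of TF-IDF weighting using the textbook formula tf * log(N / df). This is not the code from the notebooks: the function name `tfidf`, the whitespace tokenizer and the toy sentences are assumptions for illustration. Library implementations such as scikit-learn's `TfidfVectorizer` apply IDF smoothing and L2 normalization by default, so their numbers will differ.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy sentences invented for this example.
docs = [
    "profits rise sharply",
    "profits fall sharply",
    "markets stay flat",
]
vectors = tfidf(docs)
print(vectors[0])
```

In the first document, "rise" gets a higher weight than "profits" or "sharply" because it appears in only one of the three documents. That is the signal a downstream classifier can use to separate the classes.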
I’ve created a complete end-to-end project for text classification, from model building to deployment. The project is production ready. You can refer to it here.
The main challenges I’ve solved in this project:
- Create model building code for different methods, so anyone can pick a method to start with.
- Create production-ready code for text classification.
If you liked the article or have any suggestions/comments, please share them below!
Let’s connect and discuss on LinkedIn