How to handle Emoji ‘😄’ & Emoticon ‘ :-) ’ in text preprocessing?

Handling emojis and emoticons while performing text preprocessing tasks in Natural Language Processing (NLP)

Paritosh Mahto
Geek Culture


Image By Paritosh Mahto

Natural Language Processing (NLP) is currently one of the most active research areas in machine learning. A few applications of NLP are sentiment analysis, chatbots and virtual assistants, text classification and extraction, and auto-correction. For all of these applications, the first task we have to perform on the given text data is text analysis.

Text preprocessing is one of the crucial steps in text analysis. It covers many tasks on the text data, for example text cleaning, lowercasing, tokenization, stemming, and lemmatization. In the text cleaning task, we remove stop words, special characters, emojis, emoticons, punctuation, and URLs from the raw text data, and correct spelling mistakes.

After all these tasks, we have pre-processed text that is more predictable and easier to analyze, and machine learning algorithms perform better on it as a result.
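To make the cleaning tasks listed above a bit more concrete, here is a minimal sketch using only the Python standard library. The function name `clean_text`, the tiny `STOP_WORDS` set, and the URL pattern are assumptions made for illustration; a real pipeline would likely use a fuller stop-word list and a proper tokenizer.

```python
import re
import string
from typing import List

# Tiny illustrative stop-word list (an assumption; real lists are much larger).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "at"}

# Simple URL pattern (an assumption; it will not catch every possible URL).
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> List[str]:
    """Lowercase, drop URLs and ASCII punctuation, tokenize, remove stop words."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)      # remove URLs before punctuation
    text = text.translate(PUNCT_TABLE)     # strip ASCII punctuation
    tokens = text.split()                  # whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    sample = "The movie is GREAT!!! Read the review at https://example.com/review"
    print(clean_text(sample))  # ['movie', 'great', 'read', 'review']
```

Note that stripping punctuation this way also destroys an emoticon such as `:-)`, while an emoji such as 😄 passes through untouched because it is not ASCII punctuation. That is one reason emojis and emoticons need dedicated handling.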

Nowadays, a lot of text data contains emojis and emoticons because of fast-growing digital communication. Handling text that contains emojis and emoticons is becoming a recurring task. Let's understand…
