Twitter Emotion Dataset — Providing valuable insights into human behavior, communication patterns, and societal trends

Mohamad Mahmood
Published in Lexiconia
3 min read · Jun 23, 2024

Twitter is one of the most widely used social media platforms, with a large and active user base. Analyzing emotional content on social media platforms like Twitter can provide valuable insights into human behavior, communication patterns, and societal trends.

Some Twitter emotion datasets available on the Internet:

Twitter Emotion Corpus (TEC Dataset 2012)

TEC is a dataset of tweets annotated with emotional categories. It was developed in 2012 and contains over 21,000 tweets labeled with one of the following emotions: anger, disgust, fear, joy, sadness, and surprise. Each tweet in the dataset is annotated with a single emotional category label.

https://aclanthology.org/S12-1033/

Smile Project (Smile Dataset 2016)

This dataset was collected and annotated for the SMILE project (http://www.culturesmile.org). It consists of tweets mentioning 13 Twitter handles associated with British museums, gathered between May 2013 and June 2015, and was created to classify emotions expressed on Twitter towards arts and cultural experiences in museums. It contains 3,085 tweets labeled with 5 emotions: anger, disgust, happiness, surprise, and sadness. See the paper “SMILE: Twitter Emotion Classification using Domain Adaptation” for more details of the dataset.

https://www.kaggle.com/datasets/ashkhagan/smile-twitter-emotion-dataset

CrowdFlower (CrowdFlower Dataset 2016)

Created by CrowdFlower in 2016, The Emotion in Text Dataset consists of English-language tweets labelled with emotion. Categories: empty, sadness, enthusiasm, neutral, worry, love, fun, hate, happiness, relief, boredom, surprise, anger. It contains about 40,000 tweets in CSV file format.

https://metatext.io/datasets/the-emotion-in-text
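Since the CrowdFlower data ships as a CSV file with one emotion label per tweet, a quick first step is to look at how the labels are distributed. The following is a minimal sketch only: the column names `sentiment` and `content` are assumptions, and the rows are made-up stand-ins for the real file.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample standing in for the real CSV file.
# The column names "sentiment" and "content" are assumptions for this sketch.
SAMPLE_CSV = """sentiment,content
worry,I hope the exam goes ok tomorrow
happiness,Best concert ever!
sadness,Missing my old friends today
worry,Not sure I locked the door
neutral,Heading to the office now
"""


def label_distribution(csv_file, label_column="sentiment"):
    """Count how many rows carry each emotion label."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


counts = label_distribution(io.StringIO(SAMPLE_CSV))
for label, n in counts.most_common():
    print(f"{label}: {n}")
```

To run this on the downloaded file, pass an open file handle instead of the `io.StringIO` sample and adjust `label_column` to whatever header the file actually uses.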

Cleaned Balanced Emotional Tweets (CBET Dataset 2017)

The paper associated with this dataset proposes several lexical and learning-based methods to classify the emotion of test tweets. Its best-performing method for the task is a set of Naïve Bayes classifiers, each corresponding to one emotion, using unigrams as features.

https://www.semanticscholar.org/paper/Lexical-and-Learning-based-Emotion-Mining-from-Text-Shahraki-Zaiane/da0708bf99942992bcbfe1919d6755bc8168d46e
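To make the idea of "one Naïve Bayes classifier per emotion, with unigram features" concrete, here is a small pure-Python sketch. It is not the authors' implementation: the whitespace tokenizer, the add-one smoothing, the rule of picking the emotion with the highest log-odds, and the toy tweets are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization; each token is one unigram feature."""
    return text.lower().split()


class BinaryNaiveBayes:
    """Multinomial Naive Bayes for one emotion versus all other emotions."""

    def fit(self, docs, is_positive):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(is_positive)
        for tokens, flag in zip(docs, is_positive):
            self.word_counts[flag].update(tokens)
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def _log_score(self, tokens, flag):
        # Smoothed class prior (assumption: add-one smoothing).
        total_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[flag] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[flag].values())
        for tok in tokens:
            if tok in self.vocab:  # unseen words are skipped
                count = self.word_counts[flag][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def log_odds(self, tokens):
        """Positive values favour this classifier's emotion."""
        return self._log_score(tokens, True) - self._log_score(tokens, False)


def train_per_emotion(tweets, labels):
    """Train one binary classifier for each emotion found in the labels."""
    docs = [tokenize(t) for t in tweets]
    return {
        emotion: BinaryNaiveBayes().fit(docs, [lab == emotion for lab in labels])
        for emotion in sorted(set(labels))
    }


def predict(models, tweet):
    """Pick the emotion whose classifier is most confident (an assumption)."""
    tokens = tokenize(tweet)
    return max(models, key=lambda emotion: models[emotion].log_odds(tokens))


# Made-up toy data, not taken from CBET.
tweets = [
    "i am so happy today",
    "happy and excited about the trip",
    "this makes me so angry",
    "i am angry at this delay",
    "i feel sad and lonely",
    "such a sad gloomy day",
]
labels = ["joy", "joy", "anger", "anger", "sadness", "sadness"]

models = train_per_emotion(tweets, labels)
print(predict(models, "so angry right now"))
```

Training a separate binary model per emotion keeps each classifier simple and lets the set of emotions grow without retraining the others, at the cost of needing a rule for combining their outputs.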

Workshop on Computational Approaches to Subjectivity, Sentiment, and Social Media (WASSA Dataset 2017)

The 8th Workshop on Computational Approaches to Subjectivity, Sentiment, and Social Media Analysis (WASSA 2017) was held in conjunction with EMNLP 2017. This workshop focused on research related to automatic Subjectivity and Sentiment Analysis (SSA) within the context of affective computing and natural language processing (NLP).

https://aclanthology.org/events/wassa-2017/

SemEval-2018 Task 1: Affect in Tweets

The SemEval-2018 Task 1: Affect in Tweets includes an array of subtasks on inferring the affectual state of a person from their tweet. For each task, the authors created labeled data from English, Arabic, and Spanish tweets. The individual tasks are: 1. emotion intensity regression, 2. emotion intensity ordinal classification, 3. valence (sentiment) regression, 4. valence ordinal classification, and 5. emotion classification.

SemEval-2018 Task 1: Affect in Tweets — ACL Anthology
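The five subtasks listed above differ mainly in what a system has to output for each tweet. The sketch below records that difference as a small Python structure. The `OutputKind` names and the Python types used in the check are assumptions for illustration, not part of the task definition; in particular, treating the emotion classification output as a set of labels is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class OutputKind(Enum):
    REAL_SCORE = "real-valued score"     # regression subtasks
    ORDINAL_CLASS = "ordered class"      # ordinal classification subtasks
    EMOTION_LABELS = "emotion label(s)"  # emotion classification subtask


@dataclass(frozen=True)
class Subtask:
    number: int
    name: str
    output: OutputKind


SUBTASKS = [
    Subtask(1, "emotion intensity regression", OutputKind.REAL_SCORE),
    Subtask(2, "emotion intensity ordinal classification", OutputKind.ORDINAL_CLASS),
    Subtask(3, "valence (sentiment) regression", OutputKind.REAL_SCORE),
    Subtask(4, "valence ordinal classification", OutputKind.ORDINAL_CLASS),
    Subtask(5, "emotion classification", OutputKind.EMOTION_LABELS),
]


def is_valid_prediction(subtask, prediction):
    """Check that a prediction has the Python type this sketch assumes."""
    if subtask.output is OutputKind.REAL_SCORE:
        return isinstance(prediction, float)
    if subtask.output is OutputKind.ORDINAL_CLASS:
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(prediction, int) and not isinstance(prediction, bool)
    return isinstance(prediction, (set, frozenset)) and all(
        isinstance(label, str) for label in prediction
    )


for task in SUBTASKS:
    print(f"{task.number}. {task.name}: {task.output.value}")
```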

🤓
