Sentiment Track: Latest Trends from EMNLP 2022

RISHABH TRIPATHI
The Observe.AI Tech Blog
6 min read · Feb 3, 2023

EMNLP 2022 spotlighted plenty of innovative work in the sentiment track, including work in the multimodal and multilingual spaces.

A couple of the sentiment analysis works presented demonstrated the training of auxiliary tasks that complement each other. One such work is discussed below:

→ UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition

  1. Hu et al. presented their work titled “UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition” and demonstrated that sentiment and emotion classification tasks, when performed separately, do not exploit the complementary knowledge shared between them.
  2. UniMSE represents a scheme leveraging T5 to perform unified multimodal sentiment analysis and emotion recognition.
  3. Inter-modal contrastive learning is carried out to minimize intra-class variance and maximize inter-class variance (a minimal sketch of such a loss is given after this list).
  4. The paper offers a psychological perspective to show that jointly modeling sentiment and emotion is feasible and reasonable, and the joint model produces SOTA results.
  5. However, the generation of universal labels only considers textual modality, without considering acoustic and visual modalities.
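To make item 3 concrete, here is a minimal sketch of a supervised contrastive loss of the kind described, written in PyTorch; the tensor names, shapes, and temperature are assumptions for illustration rather than details taken from the UniMSE paper.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Pull together samples that share a label (low intra-class variance)
    and push apart samples that do not (high inter-class variance).
    `features`: (batch, dim) fused multimodal embeddings; `labels`: (batch,)."""
    features = F.normalize(features, dim=-1)
    sim = features @ features.T / temperature                 # pairwise similarities
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=features.device)
    sim = sim.masked_fill(self_mask, float("-inf"))           # ignore self-similarity
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Average the log-probability over each anchor's positive pairs.
    loss = -log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()

# Example: four fused clips, two sentiment classes.
feats = torch.randn(4, 8)
labels = torch.tensor([0, 0, 1, 1])
print(supervised_contrastive_loss(feats, labels))
```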

Traditionally, most applications related to sentiment analysis were built on aspect-based algorithms. Hence, there is a need to understand how much relevance aspect-based sentiment analysis carries today. Does it still fulfill use cases across industries? One such work was presented by Fu et al. at the event.

→ Entity-level Sentiment Analysis in Contact Center Telephone Conversations

  1. Fu et al. from Dialpad presented their work titled “Entity-level Sentiment Analysis in Contact Center Telephone Conversations”. The proposed work demonstrated entity-level sentiment analysis in contact center telephone conversations by leveraging the relationship between entity words and opinion words.
  2. For instance, “I work at Google and I love it a lot” contains Google as the entity word and love as the opinion word.
  3. The work proposed two approaches:
    CNN-based model with heuristics: a general sentiment analysis model classifies the sentiment of a given utterance and extracts the keywords that cause that sentiment, treating these keywords as opinion-word candidates. A set of linguistic heuristics then identifies the opinion words associated with the entities mentioned in the input.
    DistilBERT-based model: performance can be further improved by fine-tuning DistilBERT on sentiment datasets such as SST and using the fine-tuned checkpoint to perform sentiment-entity classification (a toy end-to-end version is sketched after this list).
  4. Moreover, observations showed that BERT-based models can effectively model relationships between words. However, these models struggle to model such relationships when the input contains too many words.
  5. Furthermore, the NER component of the DistilBERT-based model has some limitations in detecting product- and organization-type entities. It is biased towards detecting the entities that appear more frequently in the training data and misses rare entities.
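To make the two approaches tangible, the toy snippet below is a stand-in rather than the authors' code: it pairs an off-the-shelf SST-2 DistilBERT checkpoint from Hugging Face with a deliberately naive rule (assign the utterance-level sentiment to every detected organization) in place of the paper's linguistic heuristics.

```python
from transformers import pipeline

# Off-the-shelf checkpoints (assumed here; the authors fine-tune their own).
sentiment = pipeline("sentiment-analysis",
                     model="distilbert-base-uncased-finetuned-sst-2-english")
ner = pipeline("ner", aggregation_strategy="simple")  # default CoNLL NER model

def entity_level_sentiment(utterance):
    """Naive stand-in for the paper's heuristics: give every ORG entity
    the sentiment predicted for the whole utterance."""
    polarity = sentiment(utterance)[0]                 # {'label': ..., 'score': ...}
    entities = [e["word"] for e in ner(utterance) if e["entity_group"] == "ORG"]
    return {ent: polarity["label"] for ent in entities}

print(entity_level_sentiment("I work at Google and I love it a lot"))
# e.g. {'Google': 'POSITIVE'}
```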

Some interesting research among the papers presented at the event demonstrated various ways to perform document-level sentiment classification. We discuss one such paper:

→ Semantic Simplification for Sentiment Classification

  1. Jiang et al. presented their work titled “Semantic Simplification for Sentiment Classification”, which demonstrates sentiment classification at the document level by leveraging Abstract Meaning Representation (AMR).
  2. At the document level, identifying sentiment polarity becomes difficult because documents carry complex semantics.
  3. The procedure reduces semantic complexity by representing the document as an AMR graph obtained with a sequence-to-sequence (S2S) AMR parser.
  4. The AMR-based sentence representation is transformed into a simplified graph, which is then converted back into a sequence representing the simplified clause (see the sketch after this list).
  5. The idea is to keep the opinion the same but the semantics simpler.
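As a rough illustration of the parse-simplify-generate loop in items 3 and 4, the sketch below assumes the open-source amrlib library for AMR parsing and generation; simplify() is a hypothetical placeholder for the paper's actual graph-simplification step.

```python
import amrlib

stog = amrlib.load_stog_model()   # sequence-to-sequence AMR parser (text -> graph)
gtos = amrlib.load_gtos_model()   # AMR-to-text generator (graph -> text)

def simplify(amr_graph: str) -> str:
    """Placeholder: the paper prunes the graph so the opinion is preserved
    while modifiers and nested clauses are dropped; here it is a no-op."""
    return amr_graph

document = ["The plot, despite a few slow stretches, is ultimately rewarding."]
graphs = stog.parse_sents(document)             # sentences -> AMR graphs
simple_graphs = [simplify(g) for g in graphs]   # semantic simplification
simplified, _ = gtos.generate(simple_graphs)    # graphs -> simplified clauses
print(simplified)  # the simplified clauses are then fed to a sentiment classifier
```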

While multiple works presented entirely new methods, it is interesting to note that a few works at the event focused on innovation in training rather than invention of architecture: they brought innovative strategies for how to train a machine learning model instead of looking for a completely new architectural approach. One such work was observed in the sentiment track by Ranjan et al.

→ Progressive Sentiment Analysis for Code-Switched Text Data

  1. Ranjan et al. presented their work titled “Progressive Sentiment Analysis for Code-Switched Text Data” and demonstrated sentiment analysis for code-switched data using a progressive training paradigm, with the assumption that there are two languages:
    — S: source language: gold-labeled and resource-rich
    — T: target language: unlabeled, code-switched data
  2. The motivation is to leverage transfer learning to move from resource-rich language to code-switched data.
  3. The procedure starts with pretraining on source language data and then multiple buckets are created for the code-switched data.
  4. Using the model trained on source data, predictions are made on the samples in the first bucket. The predictions on the first bucket, together with the source language data, are then used to re-train the model, which makes predictions on the data in the second bucket. The process continues until the model has been trained on the complete source and target language data. At each step, a fixed fraction of data points from each class is sampled from the bucket (a toy version of this loop is sketched after the list).
  5. While the proposed approach shows significant improvements over various baselines, it is worth noting that if the numbers of samples in the buckets are very disproportionate, progressive learning might not yield a significant improvement.
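A toy version of that progressive loop is sketched below, using scikit-learn's LogisticRegression as a stand-in for the actual sentiment model; the function name, the per-class sampling fraction, and the feature representation are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def progressive_train(X_src, y_src, target_buckets, frac_per_class=0.5):
    """X_src/y_src: labeled source-language features and labels (2-D / 1-D arrays).
    target_buckets: list of 2-D feature arrays of unlabeled code-switched data."""
    X_train, y_train = X_src.copy(), y_src.copy()
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    for X_bucket in target_buckets:
        pseudo = model.predict(X_bucket)             # pseudo-label the next bucket
        keep = []
        for cls in np.unique(pseudo):
            idx = np.where(pseudo == cls)[0]
            n = max(1, int(frac_per_class * len(idx)))
            keep.extend(np.random.choice(idx, n, replace=False))
        # Grow the training set with a fixed fraction per class, then re-train.
        X_train = np.vstack([X_train, X_bucket[keep]])
        y_train = np.concatenate([y_train, pseudo[keep]])
        model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return model
```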

Some research works focused on robustifying language models for a given task. One such work from the sentiment track is the research carried out by Raedt et al.

→ Robustifying Sentiment Classification by Maximally Exploiting Few Counterfactuals

  1. Raedt et al. presented their work titled “Robustifying Sentiment Classification by Maximally Exploiting Few Counterfactuals”, which demonstrates that sentiment classification can be made more robust by maximally exploiting a small number of counterfactuals.
  2. Working from the assumption that counterfactual samples used during training improve the out-of-distribution (OOD) generalization of classifiers, the authors decided to exploit a limited number of counterfactuals, since generating counterfactuals manually can be expensive.
  3. In the work, a few data points are sampled and their counterfactuals are generated such that the polarity changes. For instance, the sample “one of the worst ever scenes” signals negative sentiment, but its counterfactually revised version, “one of the wildest ever scenes”, portrays positive sentiment (a minimal augmentation sketch follows this list).
  4. The proposed approach improves the robustness of the model.
  5. However, the models produce counterfactual samples directly in the encoding vector space, so the samples cannot easily be interpreted. Although one could train a decoder to reconstruct the inputs, the interpretability of such counterfactual samples remains questionable.
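As a rough illustration of item 3 (and only of the straightforward text-level baseline, not the paper's embedding-space method), a handful of manually revised counterfactuals can simply be mixed into the training set; the sentences and labels below are illustrative, not from the paper's data.

```python
# Original labeled samples (0 = negative, 1 = positive).
train = [
    ("one of the worst ever scenes", 0),
    ("the plot drags and the acting is flat", 0),
]
# Manually revised counterfactuals: minimal edits that flip the polarity.
counterfactuals = [
    ("one of the wildest ever scenes", 1),
    ("the plot races along and the acting is sharp", 1),
]
augmented = train + counterfactuals  # fine-tune the sentiment classifier on this mix
```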

How can we train a language model to understand a specific task at multiple hierarchical levels? The answer to this question was beautifully demonstrated by Fan et al. in their work.

→ Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis

  1. Fan et al. presented their work titled “Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis”, demonstrating a sentiment-aware language model built via pre-training at both the word level and the sentence level.
  2. The proposed work makes the assumption that most PLMs (Pre-trained Language Models) are sub-optimal for sentiment analysis, as they capture sentiment at the word level without actually considering sentiment at the sentence level.
  3. SentiWSP, a novel sentiment-aware PLM, is proposed, combining word-level and sentence-level sentiment.
  4. At the word-level pre-training stage, a generator-discriminator network is trained to enhance the PLM’s knowledge of word-level sentiment. Furthermore, sentence-level pre-training is carried out to strengthen the discriminator by optimizing a contrastive loss, which enhances the learning of sentiment at the sentence level.
  5. The proposed work achieves SOTA over various sentence-level and aspect-level sentiment classification benchmarks.
  6. The generator-discriminator network trains an MLM with a masked word in the input sentence, where the masked word is the word carrying the sentiment. The generator replaces the masked word while the discriminator detects the replacement. The trained discriminator is then used to perform sentiment classification via optimizing a contrastive loss (a toy version of this word-level stage is sketched after this list).
  7. However, SentiWSP, like most current state-of-the-art pre-training models, requires relatively large computational resources.
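A toy version of the word-level stage in item 6 is sketched below, using Hugging Face models as the generator and discriminator; the checkpoint names, the example sentence, and the single-mask setup are assumptions, not the paper's configuration.

```python
import torch
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          AutoModelForTokenClassification)

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
generator = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")
discriminator = AutoModelForTokenClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)        # 0 = original, 1 = replaced

text = "the movie was [MASK] from start to finish"  # the sentiment word is masked
inputs = tok(text, return_tensors="pt")
with torch.no_grad():
    logits = generator(**inputs).logits
mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
replacement = logits[0, mask_pos].argmax()          # generator fills in the mask

corrupted = inputs.input_ids.clone()
corrupted[0, mask_pos] = replacement
labels = torch.zeros_like(corrupted)
labels[0, mask_pos] = 1                             # mark the replaced position
# The discriminator learns to spot which token was replaced.
loss = discriminator(input_ids=corrupted, labels=labels).loss
loss.backward()
```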
