Neural Approaches to Advanced Sentiment Analysis

Ruidan He, Wenya Wang and Daniel Dahlmeier (Machine Learning Singapore)

SAP AI Research
Dec 4, 2017


Sentiment analysis has grown into one of the most active research areas in natural language processing because of the increasing volume of opinionated data recorded in digital form. It has a wide range of applications across business and social domains, helping both companies and individuals make better use of opinionated information for decision making. Recently, more advanced tasks such as aspect-based sentiment analysis (ABSA) have been gaining popularity. ABSA is based on the idea that an opinion consists of a sentiment and a target; an opinion whose target is not identified is of limited use. ABSA therefore aims to discover sentiments towards entities and/or their aspects. For example, in the restaurant review “I have to say they have one of the fastest delivery times in the city”, the aspect term is “delivery times”, on which a positive opinion is expressed. Usually, the first step of ABSA is aspect extraction, which extracts aspect terms from the input text. In this post, we summarise our recent research on building effective neural models for aspect extraction under both supervised and unsupervised settings. This work was done under the SAP Industry Ph.D. program in collaboration with the National University of Singapore and Nanyang Technological University.

Aspect extraction tackled as a sequence labelling problem

In this setting, the task is to extract the aspect words and opinion words that appear explicitly in each sentence. Taking the example above, given the review sentence “I have to say they have one of the fastest delivery times in the city”, the task is to identify delivery times as an aspect term and fastest as an opinion term. This knowledge is useful for building structured opinion summaries, which offer a clear view of the main topics/aspects and their associated opinion distributions across a large amount of text. Figure 1 shows an example of structured opinion summaries for two digital cameras.

Figure 1: Visualization of an opinion comparison between two digital cameras

Since the targets we extract may consist of multiple words, we apply the BIO labelling scheme, i.e., each word in a sentence is assigned one of five labels: “BA” (beginning of aspect), “IA” (inside of aspect), “BO” (beginning of opinion), “IO” (inside of opinion) and “O” (other). In this way, the task is formulated as a supervised sequence labelling problem.
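To make the labelling scheme concrete, here is a small Python sketch (our own illustration, not code from the papers) that hand-labels the running example and decodes the label sequence back into aspect and opinion phrases:

```python
# Toy illustration of the five-label BIO scheme described above.
# Tokens and labels for the running example (hand-labelled here).
tokens = ["I", "have", "to", "say", "they", "have", "one", "of", "the",
          "fastest", "delivery", "times", "in", "the", "city"]
labels = ["O", "O", "O", "O", "O", "O", "O", "O", "O",
          "BO", "BA", "IA", "O", "O", "O"]

def decode_spans(tokens, labels):
    """Collect labelled (B*, I*) runs back into aspect/opinion phrases."""
    spans, current, kind = [], [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B"):            # a new aspect/opinion span begins
            if current:
                spans.append((kind, " ".join(current)))
            current, kind = [tok], ("aspect" if lab == "BA" else "opinion")
        elif lab.startswith("I") and current:
            current.append(tok)            # continue the open span
        else:
            if current:
                spans.append((kind, " ".join(current)))
            current, kind = [], None
    if current:
        spans.append((kind, " ".join(current)))
    return spans

print(decode_spans(tokens, labels))
# [('opinion', 'fastest'), ('aspect', 'delivery times')]
```

A sequence labelling model then only has to predict one label per token; recovering the multi-word phrases is the mechanical decoding step shown above.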

We approach the problem by focusing on the syntactic dependency relations among the words within each sentence. The motivation is that certain syntactic relations hold between aspect words and opinion words, which should help to identify one from the other. For example, as shown in Figure 2, fish burger and tastes are ground-truth aspect terms, with best and fresh as their respective opinion terms. Given tastes as an aspect term, fresh can be extracted as an opinion term through a direct relation; and given burger as an aspect term, tastes can be extracted as another aspect term through an indirect relation. Based on this observation, we build a dependency-tree-based recursive neural network that computes a high-level representation for each word, incorporating its dependency relations with the other words (a toy sketch of this composition follows Figure 2). Besides this, we also capture sequential context interactions through a graphical model, the conditional random field (CRF). Combining both models in a joint structure trained end-to-end, we obtain promising results compared to existing methods. This work was published at EMNLP’16 as Recursive neural conditional random fields for aspect-based sentiment analysis.

Figure 2: A dependency example for sentiment analysis
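To give a feel for the recursive composition, the following toy numpy sketch propagates information up a small dependency tree. It is an illustration under simplifying assumptions, not the published implementation: in the paper the composition is conditioned on the dependency relation type, whereas here a single shared child matrix keeps the sketch short, and all dimensions and matrices are made up for the example.

```python
import numpy as np

d = 50
rng = np.random.default_rng(0)
W_emb = rng.normal(scale=0.1, size=(d, d))    # transforms a word's own embedding
W_child = rng.normal(scale=0.1, size=(d, d))  # transforms each child's hidden state

def compose(node, embeddings, children):
    """Post-order traversal: compose children first, then the parent node,
    so each word's representation absorbs signal from its syntactic
    dependents (e.g. an aspect word from a related opinion word)."""
    h = W_emb @ embeddings[node]
    for child in children.get(node, []):
        h = h + W_child @ compose(child, embeddings, children)
    return np.tanh(h)

# Toy dependency tree for "tastes very fresh":
# 'fresh' depends on 'tastes', 'very' depends on 'fresh'.
embeddings = {w: rng.normal(size=d) for w in ["tastes", "very", "fresh"]}
children = {"tastes": ["fresh"], "fresh": ["very"]}
h_root = compose("tastes", embeddings, children)  # representation of 'tastes'
```

In the full model, the per-word representations produced this way feed into the CRF layer, which scores whole label sequences rather than labelling each word independently.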

Focusing on the same task, we published another paper, Coupled multi-layer attentions for co-extraction of aspect and opinion words, at AAAI’17. This work advances the previous method by replacing pre-processed dependency relations with an automatically learned attention mechanism. One limitation of the previous method is that pre-generated dependency relations are prone to errors, in particular when parsing informal text, and an incorrect syntactic structure can harm the learning process. Hence, we propose an end-to-end attention model that learns the interactions among words automatically. Figure 3 illustrates the model architecture. The coupled attentions are an aspect attention and an opinion attention; they interact during learning to exploit the correlation between aspect words and opinion words. We use the attentions to select the words within each sentence that are most relevant for aspect extraction and opinion extraction, respectively. This model needs no linguistic resources and yet achieves higher scores than the dependency-based model.

Figure 3: Illustration of the coupled attention model
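The following heavily simplified numpy sketch illustrates the coupling idea only, not the exact AAAI’17 formulation: each token is scored against both an aspect prototype and an opinion prototype, so opinion evidence (e.g. “fastest”) can raise the attention on related aspect words. All dimensions, prototypes, and parameter matrices here are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d = 50
rng = np.random.default_rng(1)
H = rng.normal(size=(7, d))      # token representations for one sentence
u_a = rng.normal(size=d)         # aspect prototype vector
u_o = rng.normal(size=d)         # opinion prototype vector
M_aa = rng.normal(scale=0.1, size=(d, d))  # aspect-to-aspect interaction
M_ao = rng.normal(scale=0.1, size=(d, d))  # opinion-to-aspect coupling

# Aspect attention over tokens uses BOTH prototypes -- the "coupling":
scores_a = H @ M_aa @ u_a + H @ M_ao @ u_o
alpha_a = softmax(scores_a)      # high weight = likely aspect word

# The attended summary can update the prototype for the next layer,
# which is what makes the attentions "multi-layer":
u_a_next = alpha_a @ H
```

The opinion attention is built symmetrically with its own pair of matrices, and stacking several such layers lets aspect and opinion evidence reinforce each other without any parser in the loop.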

Aspect extraction tackled as a topic modelling problem

Supervised aspect extraction requires word-level labelled data for training, which is hard to obtain in practice. In contrast, our recent ACL paper An unsupervised neural attention model for aspect extraction approaches the problem in an unsupervised setting using topic modelling. Here, given raw unlabelled text, the objective is to (1) extract a set of aspects (topics), each represented by a ranked list of words whose top entries are regarded as aspect terms; and (2) map each sentence in the corpus to one of the discovered aspects. Figure 4 shows the high-level workflow.

Figure 4: High-level workflow

In the context of understanding product reviews, the output aspects and aspect-relevant sentences could be used to construct a structured review summary. Figure 5 shows an example summary helping users to quickly understand the key information from a large number of reviews.

Figure 5: An example restaurant review summary

One major challenge of this task is that the inferred aspects tend to be incoherent, i.e., an aspect often consists of unrelated or only loosely related aspect terms. Unlike conventional topic models such as variants of latent Dirichlet allocation (LDA) that operate on discrete word types, we propose a simple yet effective neural architecture that substantially improves the coherence of the inferred aspects.

As illustrated in Figure 6, our attention-based aspect extraction (ABAE) model represents words with embeddings, and the goal is to learn a set of aspect embeddings, where each aspect can be interpreted through its nearest words in the embedding space. The model takes a review sentence as input and maps each word to a pre-trained word embedding. We first filter the word embeddings with an attention mechanism that down-weights non-aspect words, and represent the sentence as a weighted sum of the filtered word embeddings. We then approximate this sentence embedding as a linear combination of the aspect embeddings. Training is analogous to an autoencoder: dimension reduction extracts the common factors among the embedded sentences, and each sentence is reconstructed as a weighted sum of aspect embeddings. The attention mechanism de-emphasizes words that are not part of any aspect, allowing the model to focus on aspect words.

Figure 6: Illustration of the ABAE model
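The forward pass can be sketched in a few lines of numpy. The dimensions and random initialisations below are illustrative only, and the plain squared error at the end merely shows what training drives down; the paper actually optimises a max-margin objective against randomly sampled negative sentences, plus an orthogonality regularizer on the aspect matrix, both omitted here.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d, K, n = 200, 14, 9               # embedding size, #aspects, sentence length
rng = np.random.default_rng(2)
E = rng.normal(size=(n, d))        # pre-trained embeddings of the sentence words
M = rng.normal(scale=0.05, size=(d, d))  # attention parameter
W = rng.normal(scale=0.05, size=(K, d))  # aspect classifier
T = rng.normal(size=(K, d))        # aspect embedding matrix (rows = aspects)

y_s = E.mean(axis=0)               # crude sentence summary
a = softmax(E @ M @ y_s)           # attention: down-weight non-aspect words
z_s = a @ E                        # filtered sentence embedding
p_t = softmax(W @ z_s)             # weights over the K aspects
r_s = T.T @ p_t                    # reconstruction from aspect embeddings

recon_error = np.sum((z_s - r_s) ** 2)  # what training pushes down
aspect_id = int(p_t.argmax())           # the sentence's predicted aspect
```

At inference time, the argmax over p_t gives the sentence-to-aspect mapping, and the nearest word embeddings to each row of T give each aspect’s representative words.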

It is worth noting that the neural attention model is trained in an unsupervised setting, where the objective is simply to minimise the reconstruction error. Perhaps surprisingly, we found that the attention mechanism learned under this objective still works very well and is able to focus on informative aspect words. In our experiments, we evaluated the model on two criteria: (1) can it find meaningful and semantically coherent aspects, and (2) can it accurately map an input sentence to one of the discovered aspects. According to our experimental results, our model significantly and consistently outperforms prior topic models on various evaluation tasks.

Although this work focuses on aspect extraction, a specific task within sentiment analysis, it in fact addresses a general topic-modelling problem: extracting the main topics from unlabelled text. Our model could therefore be applied to similar tasks on other types of text.

Aspect extraction is a major step towards fine-grained sentiment analysis and has been formulated in the literature as different tasks, such as sequence labelling or topic modelling. In this blog post, we briefly introduced three of our recent works in this area, each approaching the problem under a different setting. For detailed model explanations and experimental results, please refer to our papers. We hope that our work will inspire future research on aspect extraction and help practitioners in industry build effective systems for advanced sentiment analysis.
