This article is part of the Academic Alibaba series and is based on the paper “Sentiment Classification towards Question-Answering with Hierarchical Matching Network” by Chenlin Shen, Changlong Sun, Jingjing Wang, Yangyang Kang, Shoushan Li, Xiaozhong Liu, Luo Si, Min Zhang, and Guodong Zhou, which was accepted at EMNLP 2018. The full paper can be read here.
In today’s highly connected world, user reviews and comments posted online can make or break a product. Opinion mining (or sentiment analysis, as it’s known in the industry) is therefore used to identify user sentiment toward a product, brand, or service. It works by analyzing online textual data, such as reviews and social media messages, and has been applied in both academia and industry.
In its basic form, sentiment analysis classifies sentiment polarity as either positive or negative, and it can be applied at both the sentence level and the document level. E-commerce platforms, including Amazon and Taobao, have recently introduced a customer question-and-answer feature, in which potential customers ask questions about a product or service and more experienced users provide answers. These user-oriented question-answer (QA) text pairs can carry rich sentiment information and prove more informative, convincing, and reliable than traditional reviews, because the answers are provided by users who have already purchased the target item.
In this work, the Alibaba tech team proposes a novel task, QA-style sentiment analysis, and a method to address it. In particular, the team created a high-quality annotated corpus, with specially designed annotation guidelines, for QA-style sentiment classification. On this basis, the team proposes a three-stage hierarchical matching network to explore deep sentiment information in a QA text pair.
First, both the question and answer texts are segmented into sentences, and a number of [Q-sentence, A-sentence] units are constructed for each QA text pair. Then, a QA bidirectional matching layer learns a matching vector for each [Q-sentence, A-sentence] unit. Finally, a self-matching attention layer characterizes the importance of each generated matching vector. The detailed architecture of the QA bidirectional matching mechanism is illustrated in the full paper.
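The first and last of these stages can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's implementation: the sentence splitter is deliberately naive, every Q-sentence is assumed to be paired with every A-sentence, and the attention step is generic dot-product attention over hand-written vectors that stand in for the learned matching vectors. The names split_sentences, build_units, and attention_pool are hypothetical.

```python
import math
import re
from itertools import product


def split_sentences(text):
    # Naive splitter (an assumption): a sentence ends at '.', '!' or '?'.
    pieces = re.findall(r"[^.!?]+[.!?]*", text)
    return [p.strip() for p in pieces if p.strip()]


def build_units(question, answer):
    # Assumption: every Q-sentence is paired with every A-sentence.
    return list(product(split_sentences(question), split_sentences(answer)))


def attention_pool(vectors, query):
    # Generic dot-product attention, standing in for the self-matching
    # attention layer: score each matching vector against a query vector
    # (learned in a real model), softmax the scores, then take the
    # weighted sum of the vectors.
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return weights, pooled


units = build_units(
    "Is the battery good? Does it heat up?",
    "The battery lasts two days. It gets warm when charging.",
)

# Toy matching vectors, one per unit; a real model would learn these.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
weights, pooled = attention_pool(toy_vectors, query=[1.0, 0.0])
```

Here a two-sentence question and a two-sentence answer yield four units, and the attention weights sum to one, so the pooled vector is a weighted average in which the units judged most important contribute most.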
To test the matching mechanism, the team collected QA text pairs from Taobao, the world’s biggest e-commerce company. The corpus contains over 10,000 QA text pairs from each of the beauty, shoe, and electronics domains. Four sentiment-related categories were defined: positive, negative, conflict (both positive and negative sentiment), and neutral (neither positive nor negative sentiment).
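The four categories follow directly from whether positive and negative sentiment are each present in a QA pair. A minimal sketch of that label scheme is shown below; the Sentiment enum and the categorize helper are hypothetical names used only for illustration, and the two boolean flags are assumed to come from a human annotator's judgment.

```python
from enum import Enum


class Sentiment(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    CONFLICT = "conflict"  # both positive and negative sentiment
    NEUTRAL = "neutral"    # neither positive nor negative sentiment


def categorize(has_positive, has_negative):
    # Map the two presence flags onto the four categories.
    if has_positive and has_negative:
        return Sentiment.CONFLICT
    if has_positive:
        return Sentiment.POSITIVE
    if has_negative:
        return Sentiment.NEGATIVE
    return Sentiment.NEUTRAL
```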
To ensure high annotation agreement, the team also proposes annotation guidelines in two main groups: guidelines that distinguish the neutral and non-neutral categories, and guidelines that distinguish the positive and negative categories.
Experimental results, compared against a number of state-of-the-art baselines, demonstrate the effectiveness of the proposed approach for QA-style sentiment classification. Specifically, approaches based on a matching strategy perform well when the question and the answer contain different types of information, which is a challenge unique to QA-style sentiment mining. Furthermore, compared with other approaches, the proposed method performs well on conflict instances.
Moving forward, the tech team at Alibaba aims to investigate other network structures to explore deeper information within each QA text pair. The team also plans to evaluate the effectiveness of the proposed QA-style sentiment classification approach in other languages.