Active Learning and Visual Analytics for Stance Classification with ALVA
The content described in this article was originally published in the October 2017 issue of ACM TiiS (Vol 7, Issue 3).
Besides scientific papers, ads, and retweets, there is a lot of textual content on the Internet in which people express their thoughts, opinions, and emotions. It is natural for us to involve subjectivity in our spoken and written language. When sufficiently large amounts of such data become available, as has happened in the last five to ten years, researchers and practitioners of machine learning and natural language processing (NLP) are able to conduct pretty awesome automatic analyses of these texts.
For example, sentiment analysis has traditionally been applied to customer reviews from platforms such as Amazon to detect positive and negative opinions about products. With the adoption of social media, this method has been used to detect overall trends in public opinion or to let users track their moods over time. It has even been used, rather amusingly, to make stock trading decisions automatically based on the sentiments in tweets by a famous politician.
While improving and applying sentiment analysis techniques is certainly a lot of fun for researchers and valuable in industry, our collaborators and we have been interested in a related but somewhat different type of analysis, namely, stance analysis.
For example, consider the following sentence: “Well, I have to admit that I am not completely sure about this” — there’s something going on here that is tricky to describe in terms of positive/negative sentiments, isn’t it? In NLP, the task of stance classification is usually understood as automatic detection of pro/contra position towards a certain topic or opinion in text, for instance, as in a sentence like “No one would argue with your reasoning about the electoral college issues, Jim”. In linguistics, stance taking is usually defined more broadly than just agreement/disagreement, and it’s been precisely the focus of our research project StaViCTA to investigate the theory behind stance taking in written language, implement computational analyses for stance classification, and provide means for exploratory visual analysis of stance in text data. Our collaborators in linguistics settled on a list of ten stance categories of interest, such as “Concession and Contrariness” and “Uncertainty,” and we were heading towards the implementation of an automatic classifier.
Then we faced some problems.
In order to develop a stance classifier, we needed training data in the form of utterances or sentences labeled with different stance classes. The available labeled data sets included only pro/contra labels, though, rather than our ten stance categories. In addition, our categories of interest were not mutually exclusive: for example, the same text chunk could be labeled with both “Uncertainty” and “Prediction” at the same time. Technically, this meant we had to support a multi-label classification task, and we first had to annotate a training data set with our stance categories (some of which were also very sparse). To make the best use of our limited resources, our collaborators in NLP suggested following the active learning approach: collect some initial labeled data, train the first version of the classifier, let the classifier choose which unlabeled data items to annotate next, annotate them, retrain, select the next batch, and so on.
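The annotate, train, and select cycle described above can be sketched roughly as follows. This is only an illustration, not the classifier or selection strategy used in ALVA: the toy word-count model, the uncertainty-based selection rule, the example utterances, and all function names are assumptions made for this sketch.

```python
from collections import Counter

# Two of the ten stance categories, to keep the toy example small.
CATEGORIES = ["Uncertainty", "Prediction"]

def train(labeled):
    """Toy stand-in for a classifier: per-category word counts (hypothetical model)."""
    model = {}
    for cat in CATEGORIES:
        pos, total = Counter(), Counter()
        for text, cats in labeled:
            words = set(text.lower().split())
            total.update(words)
            if cat in cats:
                pos.update(words)
        model[cat] = (pos, total)
    return model

def predict_proba(model, text):
    """Return {category: estimated probability}, using add-one smoothing."""
    words = set(text.lower().split())
    probs = {}
    for cat, (pos, total) in model.items():
        num = sum(pos[w] for w in words) + 1
        den = sum(total[w] for w in words) + 2
        probs[cat] = num / den
    return probs

def uncertainty_margin(model, text):
    """Distance from 0.5 of the least certain category; smaller means less sure."""
    return min(abs(p - 0.5) for p in predict_proba(model, text).values())

def select_batch(model, pool, batch_size):
    """Pick the unlabeled utterances the model is least sure about."""
    return sorted(pool, key=lambda t: uncertainty_margin(model, t))[:batch_size]

def active_learning(labeled, pool, annotate, rounds, batch_size):
    """Run several train/select/annotate rounds; `annotate` plays the human annotator."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)
        for text in select_batch(model, pool, batch_size):
            labeled.append((text, annotate(text)))  # multi-label: a set of categories
            pool.remove(text)
    return train(labeled), labeled

# Made-up seed data, unlabeled pool, and "human" answers for demonstration only.
seed = [("i am not sure about this", {"Uncertainty"}),
        ("it will probably rain tomorrow", {"Prediction"})]
pool = ["maybe this will work", "i am sure",
        "we leave tomorrow", "not sure it will rain"]
gold = {"maybe this will work": {"Uncertainty", "Prediction"},
        "i am sure": set(),
        "we leave tomorrow": {"Prediction"},
        "not sure it will rain": {"Uncertainty", "Prediction"}}

model, labeled = active_learning(seed, pool, gold.__getitem__, rounds=2, batch_size=1)
```

Each label is a set of categories rather than a single class, which is what makes the task multi-label. In a real setting, the `annotate` callback would be replaced by the annotation interface, and the toy model by an actual classifier.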
With all of these considerations and peculiarities in mind, we decided to develop a comprehensive environment, which we called ALVA, to provide a user interface to the data annotators, interact with the stance classifier according to the active learning approach, and support visual analysis of the annotated data and other metadata generated during the annotation process. For example, our collaborators wanted to track statistics about agreement between several annotators (inter-annotator agreement) or between annotation rounds by the same annotator (intra-annotator agreement), and also to track the performance of the classifier after each active learning round using cross-validation data. Incidentally, we recruited researchers with training in linguistics as annotators; we felt that computer scientists would rarely be the ideal choice for annotating text data with categories like “Hypotheticals” or “Volition” :)
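To give a flavor of what an agreement statistic looks like, here is a small sketch that computes Cohen's kappa for a single stance category between two annotators. This post does not say which agreement measure ALVA reports, so the choice of kappa, the function name, and the example labels are all assumptions for illustration.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of binary (0/1) labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Fraction of utterances on which both annotators made the same decision.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, given how often each annotator says "yes".
    p_a, p_b = sum(labels_a) / n, sum(labels_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return 1.0  # both annotators used the same single label throughout
    return (observed - expected) / (1 - expected)

# Made-up decisions for one category (1 = category present) on eight utterances.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(annotator_1, annotator_2)  # 0.5 for this toy data
```

Because the stance categories are not mutually exclusive, a measure like this would be computed separately per category. The same function applies to intra-annotator agreement if the two sequences come from the same person in different rounds.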
From the point of view of information visualization, the main challenge we faced was visually representing all of the annotations, which could have up to ten stance categories present at the same time. Imagine that you want to use color to represent sentiment classification results for three categories, positive/neutral/negative, which do not occur at the same time. In this case, you would probably select something like green, gray, and red to encode the categories, so three unique colors would be sufficient. In our case, ten non-exclusive stance categories would require a distinct color for each possible combination, that is, up to 2^10 = 1,024 colors, which was obviously not feasible.
Our solution involved designing a new visual representation called CatCombos (which stands for “category combinations”) that focuses on groups of data annotations with the same sets of categories rather than individual annotations. In the screenshot above, the CatCombos representation is used in the top right part of the interface as the main view. The users can quickly detect the existing annotations with multiple stance categories present at the same time and investigate them in detail. For instance, in the screenshot the user has hovered over an annotation (highlighted with an orange dot at the top of the figure) in the group with the categories “Concession and Contrariness”, “Hypotheticals”, and “Need / Requirement” present. It turns out that there is another annotation created for the same utterance with a different set of selected categories. The edge to the other annotation item and the item itself are then highlighted in yellow. The user has also right-clicked on the annotation item to trigger an update in the utterances table view, which is situated under the main CatCombos view. Here, the user can focus on the actual text of the utterance and also see all of the corresponding annotations.
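The core idea of grouping annotations by their exact set of categories, and of linking annotations that refer to the same utterance, can be sketched in a few lines. The record layout (`id`, `utterance`, `categories`) and the function names are hypothetical and do not reflect ALVA's actual data model.

```python
from collections import defaultdict

# Made-up annotations; each one refers to an utterance and carries a set of categories.
annotations = [
    {"id": 1, "utterance": "u1", "categories": {"Uncertainty"}},
    {"id": 2, "utterance": "u2", "categories": {"Uncertainty", "Prediction"}},
    {"id": 3, "utterance": "u3",
     "categories": {"Concession and Contrariness", "Hypotheticals", "Need / Requirement"}},
    {"id": 4, "utterance": "u4", "categories": {"Prediction", "Uncertainty"}},
    {"id": 5, "utterance": "u3", "categories": {"Hypotheticals"}},
]

def group_by_combination(annotations):
    """Map each distinct category combination to the ids of its annotations."""
    groups = defaultdict(list)
    for ann in annotations:
        # frozenset makes the combination hashable and independent of order.
        groups[frozenset(ann["categories"])].append(ann["id"])
    return dict(groups)

def linked_annotations(annotations, ann_id):
    """Ids of other annotations made for the same utterance (the highlighted edges)."""
    target = next(a for a in annotations if a["id"] == ann_id)
    return [a["id"] for a in annotations
            if a["utterance"] == target["utterance"] and a["id"] != ann_id]

groups = group_by_combination(annotations)
others = linked_annotations(annotations, 3)  # annotation 5 shares utterance "u3"
```

Only the combinations that actually occur in the data become groups, so the number of groups to display is bounded by the number of annotations rather than by the 2^10 theoretically possible combinations.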
The interface of ALVA also supports many other visualization and interaction options. For instance, in the screenshot above, the user has clicked a cell in the category co-occurrence matrix at the bottom left of the interface to highlight, in blue, all of the annotations in the main CatCombos view that have both “Concession and Contrariness” and “Uncertainty” present.
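As a rough illustration of what sits behind such a matrix, the sketch below counts how often each pair of categories occurs together and then selects the annotations matching one cell. The example data and the function names are assumptions made for this sketch, not ALVA's implementation.

```python
from collections import Counter
from itertools import combinations

# Made-up category sets, one per annotation.
category_sets = [
    {"Concession and Contrariness", "Uncertainty"},
    {"Uncertainty", "Prediction"},
    {"Concession and Contrariness", "Uncertainty", "Hypotheticals"},
    {"Volition"},
]

def cooccurrence(category_sets):
    """Count annotations per unordered category pair (keys are sorted tuples)."""
    counts = Counter()
    for cats in category_sets:
        for pair in combinations(sorted(cats), 2):
            counts[pair] += 1
    return counts

def annotations_with_both(category_sets, cat_a, cat_b):
    """Indices of annotations containing both categories, i.e. one matrix cell."""
    return [i for i, cats in enumerate(category_sets) if {cat_a, cat_b} <= cats]

counts = cooccurrence(category_sets)
highlighted = annotations_with_both(
    category_sets, "Concession and Contrariness", "Uncertainty")
```

Clicking a cell in the matrix corresponds to the second function: it picks out the annotations to highlight, while the first function provides the value shown in each cell.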
Our collaborators have been using ALVA for quite a long time for annotation, active learning, and visual analysis of the annotated data. While we have applied ALVA only to our stance-related annotation tasks so far, we are looking forward to its future applications for other text annotation tasks at the sentence/utterance level and, perhaps, other levels, too.
You can read more about this work in our article in ACM Transactions on Interactive Intelligent Systems. The full citation is:
Kostiantyn Kucher, Carita Paradis, Magnus Sahlgren, and Andreas Kerren. 2017. Active Learning and Visual Analytics for Stance Classification with ALVA. ACM Trans. Interact. Intell. Syst. 7, 3, Article 14 (October 2017), 31 pages. DOI: https://doi.org/10.1145/3132169