Every Colour You Are: Stance Prediction and Turnaround in Controversial Issues
This post describes the ACM Web Science paper titled “Every Colour You Are: Stance Prediction and Turnaround in Controversial Issues.” It is joint work by Ricardo Baeza-Yates, Mounia Lalmas, and me (@carnby). Slides are available.
The picture above is from the Gabriela Mistral art center on the most important street in Santiago, the capital of Chile. During a recent period of social unrest, known as the Social Explosion (“Estallido Social” in Spanish), protesters used this wall to express their views on several issues. We think that such expressions share similarities with what people express on Twitter: street art is produced by people from specific socio-economic and demographic groups, in the same way that Twitter is used more prominently by people from certain demographic groups.
This particular image is related to abortion rights in the country, which is the focus of study in our work. Abortion is a complicated subject to discuss, given the political circumstances around it and the deeply private matters it involves. Each country has its own legislation on this matter and, therefore, a different context of discussion, not to mention the cultural differences between countries.
We analyzed the discussion in two neighboring Latin American countries: Chile and Argentina. Although they share many cultural similarities, they differ in terms of abortion legislation. Until 2017, Chile had one of the most severe abortion laws in the world: abortion was completely forbidden. In 2018, a free abortion bill was introduced in Argentina, but Congress rejected it. During these years, politicians debated the legislation, but discussion and demonstrations also took place in the streets and on social networks. The following pictures show different protests in defense of and in opposition to abortion rights:
We observe how stance is expressed in physical ways, not just in vocabulary. Moreover, these gatherings were not as big at the beginning of their respective movements. They grew with time, as people decided to update their opinions and make them more explicit.
Our research questions are the following.
- How do people make use of new technologies to express their positions on controversial issues?
- Which demographic and profile factors characterize opinion change?
Both questions have been answered extensively in the study of physical-world manifestations. Here, however, we focus on online manifestations.
This is a visualization of the dataset we use to answer our research questions. It comprises 4 years of discussion (2015–2018):
The black curve represents the tweet volume on a logarithmic scale. Each box at the bottom is a legislative event, with all events up to 2017 being held in Chile, and the rest in Argentina. The peaks of discussion happen around these events.
For every year we show a list of its most associated terms. These include hashtags and accounts related to the legislation, such as #aborto3causales (which means abortion on three grounds) and clandestino (which refers to abortion happening regardless of legal status), as well as some politician names (such as Piñera, who was running for president in Chile at the time). Perhaps the most interesting terms appear in 2018, where you can see the word kerchief (pañuelo in Spanish) and a green heart emoji. It’s not common to see emoji in a word cloud, is it?
This is a schematic diagram of the methodology we apply to the dataset:
First, we preprocess the dataset to define features for every user. These include content features, network features, and profile features based on reported information. Second, we label some users in the dataset using rules and pattern matching. In this way, some users have a country, a gender, an age, and a stance on abortion. With respect to abortion, we consider two stances: in defense of abortion rights, and in opposition to abortion rights. We defined a list of seed keywords to identify users with explicit views.
The next step is to build classifiers for each attribute. We train all classifiers using the labeled profiles and then propagate the labels to the rest of the dataset.
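The seed-keyword labeling step can be sketched roughly as follows. This is only an illustration, not the rule set used in the paper: the two hashtags are the ones mentioned later in this post, while `SEED_PATTERNS` and `label_user` are hypothetical names, and the rule "exactly one stance must match" is an assumption.

```python
import re

# Hypothetical seed patterns. The hashtags appear in this post, but the
# actual seed keyword list from the paper is not reproduced here.
SEED_PATTERNS = {
    "defense": re.compile(r"#seraley", re.IGNORECASE),
    "opposition": re.compile(r"#salvemoslas2vidas", re.IGNORECASE),
}


def label_user(profile_text):
    """Return a stance label, or None if no seed or conflicting seeds match."""
    matches = [
        stance
        for stance, pattern in SEED_PATTERNS.items()
        if pattern.search(profile_text)
    ]
    # Assumption: only users matching exactly one stance count as explicit.
    return matches[0] if len(matches) == 1 else None


# Toy profile descriptions (made up for illustration).
profiles = {
    "user_a": "Feminista. #SeraLey",
    "user_b": "Padre de tres. #SalvemosLas2Vidas",
    "user_c": "Fan del fútbol",
}
labels = {user: label_user(text) for user, text in profiles.items()}
```

Users that end up with `None` stay unlabeled, so a classifier has to predict their stance later.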
From this setup, we answer RQ1 by analyzing how the stance classifier weighted the different features we identified in previous steps.
To answer RQ2, we identify two periods of relevant legislative events, and apply the classifier in each period. Then we perform regression analysis on the differences in predicted stance between the two periods, using the demographic and content characteristics of each user as regression factors.
Let’s see the results.
The stance classifier achieved a precision of 0.88 in the unlabeled dataset. Interestingly, the stance distribution is not the same in both countries:
We considered three different stances: defense of abortion rights, opposition to abortion rights, and undisclosed, which covers those cases where the classifier was not confident about the prediction.
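One simple way to obtain an undisclosed category from a two-stance classifier is to threshold its predicted probability. The sketch below assumes the classifier outputs a probability of the defense stance. The function name and the `margin` value are hypothetical, and the post does not say how confidence was actually determined.

```python
def stance_from_probability(p_defense, margin=0.2):
    """Map P(defense) to one of three stances.

    `margin` is a hypothetical confidence band: predictions closer than
    `margin` to 0.5 are treated as undisclosed.
    """
    if p_defense >= 0.5 + margin:
        return "defense"
    if p_defense <= 0.5 - margin:
        return "opposition"
    return "undisclosed"


# Toy probabilities (made up for illustration).
predicted = [stance_from_probability(p) for p in (0.95, 0.55, 0.10)]
```

A wider margin yields more undisclosed users and fewer, more confident stance assignments.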
This set of bar charts summarizes the most important stance classifier features in the dataset, as well as their frequency and mean importance by type of feature:
Verbs such as “to decide” and “to kill”, each commonly associated with one of the stances, have high importance.
However, what draws our attention is the usage of colored hearts. Here we see that four features containing these hearts are within the most predictive in the dataset: 💚 and 💙 in tweet content, and the same emoji in the profile name.
Vocabulary related to abortion is where we expect it to be. For instance, the defense stance is strongly associated with women’s rights and decriminalization. The opposition stance is strongly associated with words related to life and death, and also with words about humanity.
Green hearts (💚) are associated with the defense of abortion rights, whereas blue hearts (💙) are associated with opposition to abortion rights. This is consistent with the protest pictures we saw at the beginning of this post.
Try searching for abortion-related hashtags on Twitter, such as #seraley (“it will be law”) and #salvemoslas2vidas (“let’s save the 2 lives”). You will see how people use plain hashtags to label their profiles, but they also use emoji (and their colors) to show adherence to a stance in their names and biographies.
Regarding our question about opinion change, we analyzed two periods in consecutive years that had similar traits: both predated legislative events, first in Chile (May, June, and July 2017), then in Argentina (the same months in 2018). We analyzed 12K users who participated in the discussion in both periods.
The following tables show the transition between stances from one period to another:
In Chile, a majority of users in the opposition and defense stances remained in their previous stances. In Argentina, most users moved toward defense. These differences imply that legislative events may entice people to be more explicit about their views, rather than to change from one stance to the other.
To answer our question, we focused on the difference in stance probabilities rather than on changes between categories. Note that Chile has a distribution centered at zero, whereas Argentina has a skewed distribution with change toward the defense stance. We performed a linear regression over the differences in stance adherence. Here we summarize the main insights:
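A transition table like the ones above can be computed by counting, for every user present in both periods, the pair of stances they held. The following is a minimal sketch with made-up data. `transition_table` and the row normalization are assumptions about presentation, not the paper's code.

```python
from collections import Counter

STANCES = ("defense", "opposition", "undisclosed")


def transition_table(first, second):
    """Row-normalized stance transitions between two periods.

    `first` and `second` map user ids to stances. Only users present in
    both periods are counted.
    """
    shared_users = first.keys() & second.keys()
    counts = Counter((first[u], second[u]) for u in shared_users)
    table = {}
    for src in STANCES:
        row_total = sum(counts[(src, dst)] for dst in STANCES)
        table[src] = {
            dst: counts[(src, dst)] / row_total if row_total else 0.0
            for dst in STANCES
        }
    return table


# Toy stances per period (made up for illustration).
period_1 = {"a": "defense", "b": "undisclosed", "c": "undisclosed", "d": "opposition"}
period_2 = {"a": "defense", "b": "defense", "c": "undisclosed", "d": "opposition", "e": "defense"}
table = transition_table(period_1, period_2)
```

Each row gives the share of users who started in one stance and ended in each of the three. A user seen in only one period (such as "e" here) is ignored.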
- Men are less likely to update their views than women, which may be explained by abortion being a gendered issue.
- Chileans are less likely to update their views, which may be explained by the second event happening in Argentina, not Chile.
- Older people change their opinion less than younger people.
- People in opposition to abortion rights were more likely to update their opinion.
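To make the regression setup more concrete, here is a deliberately simplified sketch: an ordinary least squares fit with a single binary predictor on toy data. The analysis in the paper used several demographic and content factors at once. The data, the variable names, and the choice of the signed change in the probability of the defense stance as the dependent variable are all assumptions made for illustration.

```python
def ols_slope_intercept(xs, ys):
    """Least squares fit of y = intercept + slope * x for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy users (made up): (is_man, P(defense) in period 1, P(defense) in period 2).
users = [
    (1, 0.50, 0.55),
    (1, 0.40, 0.45),
    (0, 0.50, 0.70),
    (0, 0.30, 0.50),
]
is_man = [u[0] for u in users]
# Signed change in stance probability; positive means movement toward defense.
change = [u[2] - u[1] for u in users]

slope, intercept = ols_slope_intercept(is_man, change)
```

With a single binary predictor, the intercept is the mean change of the reference group (here, users with `is_man == 0`), and the slope is the difference in mean change between the two groups. In this toy data the slope is negative, meaning the men changed less than the other users.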
Next we discuss what we have learned from these results.
First, we observed that colored hearts acquired a new meaning in this particular discussion, without any relationship to previous meanings.
An implication for computer scientists is that there are new ways of expression that have high predictive power. Even though discussion happens mainly in text, you may need to look at visual features too. These features may be embedded in text, just like this 💜.
Second, social scientists know that women have been using color for expressing their political views since the time of the suffragettes. Now, they also know that physical manifestations find online equivalents.
Third, we found that change of opinion can be measured in terms of magnitude and direction for several demographic groups. This has implications for policy makers, as there are events or interventions that may trigger these changes, and it may be of value to measure their extent.
To conclude, understanding how people express themselves and how they react to ongoing events has implications for policy making and physical manifestation. This is important because the abortion debate has not ended, and new debates are constantly emerging or re-emerging.
Thank you for reading this post!