Analysis: Sexism, homophobia, and anti-Western narratives on Russian social media

Examining the main themes of Orthodox militant group Sorok Sorokov’s online posts on VKontakte in Russia

@DFRLab
DFRLab
13 min read · Jun 3, 2020

Logo of VKontakte on the screen of a payment terminal in this picture illustration taken May 16, 2017. (Source: Valentyn Ogirenko/Reuters via Reuters Connect)

As part of our effort to broaden expertise and understanding of information ecosystems around the world, the DFRLab is publishing this external contribution. The views and assessments in this open-source analysis do not necessarily represent those of the DFRLab.

This article presents the preliminary findings of a content analysis of sexist, homophobic, and transphobic language used on Russian-language social media platforms by pro-Kremlin, far-right organizations, paramilitaries, pro-government militias, and Orthodox vigilante groups. Using data science and machine-learning tools, the project analyzed gender-related hate speech on the two most popular social media platforms in Russia: VKontakte (VK) and YouTube.

This particular study used the activities of the Sorok Sorokov (Forty Times Forty) movement on Russian social media platform VK as a case study. According to Marlene Laruelle, a prominent researcher of nationalist and conservative ideologies in Russia, Sorok Sorokov is an Orthodox militia of the Russian Patriarchate (the office of the Russian patriarch). It is a particularly well-connected movement, nurtured from inside the Patriarchate and with the support of influential figures like Tikhon Shevkunov, rumored to be Putin’s confessor, and closely connected to the security services. Sorok Sorokov is also supported by the Moscow municipality and has direct access to the head of the Russian Orthodox Church, Patriarch Kirill of Moscow.

More broadly, this research is part of an emerging Queer Data Science movement, Gayta Science, and similar efforts to empower queer communities, trans and/or nonbinary people, women, and other groups marginalized in supremacist patriarchy-based societies.

VKontakte as a data source for hate speech

No study of hate speech and extremism on social media is complete if it ignores VK, which is the second most popular platform in Russia by volume of users after YouTube. It is used by more than 80 percent of Russia’s active social media users. The Russian social media platform hosts a concentration of extremist groups representing a wide range of ideological stances, from old-school supremacist, imperialist, and chauvinist organizations to more sophisticated New Right, Red-Brown, Third Position, and Querfront movements.

VK’s extremist base is also broadly international. After being deplatformed by Facebook, many U.S. white nationalist and alt-right communities moved to VK, which maintains comparatively lax content moderation standards, including against hateful speech. More than 100 nationalist groups on VK have members from the United States, Germany, Sweden, and other Western countries, and membership in the most popular of these groups numbers in the thousands.

Thus, a VK-focused analysis provides a great opportunity to examine the manifestations of hate speech online, as well as the techniques of spreading Russian influence abroad, especially among conservative-minded populations.

Methodology and tools

The researchers first examined statistical trends in the 10,000 most recent public posts by Sorok Sorokov. These posts were extracted from VKontakte at 3:38 p.m. EET on March 17, 2020, and stored on the project’s website at Harvard Dataverse. The data are stored in unpublished form, so anyone who would like to access them for research purposes should contact the author at andrejs.berdnikovs@fulbrightmail.org.

The next step was an automated content analysis of Sorok Sorokov’s language and rhetoric related to the Russian domestic violence bill (законопроект «О профилактике семейно-бытового насилия в Российской Федерации» (СБН)) pushed forward by State Duma Deputy Oksana Pushkina. This bill is far from ideal, and some commentators have argued it prioritizes preserving the family over the health and safety of women. But even this relatively modest attempt at reform triggered a negative reaction from conservative groups and institutions in Russia, including the Russian Orthodox Church. Moreover, despite the fact that the bill did not address sexual orientation issues, verbal attacks against the initiators of the bill by Sorok Sorokov and other Orthodox vigilante groups were full of homophobia and transphobia.

The researchers used Python and its tools and libraries (requests, pandas, NumPy, Matplotlib, Seaborn, Natural Language Toolkit, among others) to extract and explore the data using basic natural language processing techniques, such as tokenization, stop word removal, word counting, phrase matching, word co-occurrences, and regular expressions.

It is noteworthy that Natural Language Toolkit (NLTK), as well as spaCy and TextBlob, work almost perfectly for English but not for Russian. For this reason, while cleaning the text, the researchers used an original list of stop words in addition to those provided by NLTK. To find necessary text extracts, they used words instead of stems, because stemming with the NLTK package is not ideal for Russian. Finally, while searching for relevant sentences and text fragments using keywords, they combined different techniques to overcome the difficulties of working with the Russian language and raw social media data.
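As a minimal sketch of the cleaning step described above (not the researchers' actual code), the snippet below tokenizes Russian text with a regular expression, lowercases it, and drops stop words. Both stop-word sets are tiny, hypothetical placeholders: the study combined NLTK's Russian list with its own additions, and neither list is reproduced in this article.

```python
import re

# Placeholder for a standard stop-word list (hypothetical and heavily
# shortened; the study used the list shipped with NLTK).
BASE_STOP_WORDS = {"и", "в", "на", "не", "что", "это", "к", "по"}
# Placeholder for project-specific additions (hypothetical examples).
CUSTOM_STOP_WORDS = {"также", "который", "https", "vk", "com"}
STOP_WORDS = BASE_STOP_WORDS | CUSTOM_STOP_WORDS

# One token is a run of Cyrillic or Latin letters; digits and punctuation are dropped.
TOKEN_RE = re.compile(r"[а-яёa-z]+")


def clean_tokens(text):
    """Lowercase the text, split it into word tokens, and remove stop words."""
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


# Invented sample sentence, used only to exercise the function.
sample = "Это закон против семьи, и мы также против: https://vk.com"
print(clean_tokens(sample))
```

Keeping whole word forms, as here, avoids relying on a stemmer, at the cost of counting different declensions of the same word separately.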

Statistical trends in the most recent 10,000 posts by Sorok Sorokov

The first step was to extract and load relevant data into pandas, a popular data analysis and manipulation tool built on top of the Python programming language.

Loading the data in pandas. (Source: Andrejs Berdnikovs)
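The loading step might look roughly like the sketch below. It assumes the posts arrive as nested JSON-like records, which would explain the dotted column name likes.count mentioned later in the article; the records themselves and the reposts.count and comments.count names are illustrative assumptions, not the study's data.

```python
import pandas as pd

# Hypothetical sample of raw post records; real platform output has more fields.
raw_posts = [
    {"id": 1, "text": "первый пост", "likes": {"count": 120},
     "reposts": {"count": 8}, "comments": {"count": 15}},
    {"id": 2, "text": "второй пост", "likes": {"count": 2716},
     "reposts": {"count": 310}, "comments": {"count": 402}},
]

# json_normalize flattens nested dictionaries into dotted column names,
# e.g. {"likes": {"count": 120}} becomes the column "likes.count".
posts = pd.json_normalize(raw_posts)
print(posts[["id", "likes.count", "reposts.count", "comments.count"]])
```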

This allowed for analysis of key performance indicators for each post, such as the number of comments, likes, and reposts.

Performance indicators for each post in the dataset. (Source: Andrejs Berdnikovs)

The vast majority of posts had between roughly 35 and 250 likes. The average number of likes per post was approximately 168. This is fairly typical for a VK group of this size (slightly more than 28,750 members) that mainly shares text-based, non-entertainment content rather than videos or pictures. For comparison, Russian-language YouTube groups of a similar size that distribute highly entertaining videos usually garner more likes.

A histogram showing the number of likes by the number of posts. (Source: Andrejs Berdnikovs)
Descriptive statistics for the likes.count column. (Source: Andrejs Berdnikovs)
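Descriptive statistics of this kind can be produced with a single pandas call. The sketch below uses invented like counts, so its numbers do not match the study's; only the column name likes.count comes from the article.

```python
import pandas as pd

# Made-up like counts for illustration only; not taken from the dataset.
likes = pd.Series([35, 60, 90, 120, 168, 200, 250, 1000, 2716], name="likes.count")

# describe() returns count, mean, std, min, quartiles, and max.
summary = likes.describe()
print(summary)

# Share of posts inside the 35-250 range discussed in the text
# (between() includes both endpoints by default).
in_range = likes.between(35, 250).mean()
print(f"Share of posts with 35-250 likes: {in_range:.0%}")
```

Comparing the mean with the median (the 50% row) is a quick way to see how strongly a few high-engagement posts pull the average upward.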

At the same time, some of Sorok Sorokov’s posts aroused much greater interest: quite a few posts received between 1,000 and 2,716 likes (the likes.count column statistics show that 2,716 is the maximum). Although these posts are hard to distinguish in the histogram above, a kernel density estimate plot combined with a histogram and rug plot highlights them better. Such plots are useful for visualizing the distribution of data over a continuous interval and for showing where values are concentrated. The rug plot draws a small bar along the x-axis for each post’s number of likes.

Kernel density plot showing the number of likes. (Source: Andrejs Berdnikovs)
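To make the idea of a kernel density estimate concrete, the sketch below computes a Gaussian KDE by hand with NumPy. The article's plots were presumably drawn with a plotting library such as Seaborn; this is only an illustration of the underlying calculation, with invented like counts and an arbitrarily chosen bandwidth.

```python
import numpy as np


def gaussian_kde(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at each grid point."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance between every grid point and every sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)


# Invented like counts: a main cluster plus two high-engagement outliers.
likes = [40, 80, 120, 150, 170, 200, 240, 1100, 2716]
grid = np.linspace(-500, 3500, 801)
density = gaussian_kde(likes, grid, bandwidth=100.0)  # bandwidth is an assumption

# The "rug" of a rug plot is simply the raw values marked along the x-axis.
rug_positions = sorted(likes)
print(f"Density peaks near {grid[density.argmax()]:.0f} likes")
```

The outliers contribute small, isolated bumps far to the right of the main peak, which is why a density curve with a rug makes them easier to spot than a coarse histogram does.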

Similar plots were created for reposts and comments, revealing the same trend.

Kernel density plots for the number of reposts (left) and the number of comments (right). (Source: Andrejs Berdnikovs)
Descriptive statistics for both columns from previous image. (Source: Andrejs Berdnikovs)

Sorok Sorokov’s most commented-on post was about the Russian domestic violence bill.

Top 10 most commented on posts in the dataset. (Source: Andrejs Berdnikovs)
The most commented on post discussed Russia’s domestic violence bill. (Source: Andrejs Berdnikovs)
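A ranking like this can be obtained by sorting on the comment count. The sketch below uses an invented table and assumes a comments.count column named by analogy with likes.count; it takes the top three rows rather than the top ten only to keep the example short.

```python
import pandas as pd

# Invented records for illustration; the column name "comments.count" is an
# assumption modeled on the "likes.count" column mentioned in the article.
posts = pd.DataFrame({
    "id": [101, 102, 103, 104, 105],
    "comments.count": [12, 480, 35, 7, 96],
    "text": ["post a", "post b", "post c", "post d", "post e"],
})

# nlargest returns the n rows with the highest values in the given column,
# already sorted in descending order.
top = posts.nlargest(3, "comments.count")
print(top[["id", "comments.count"]])
```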

Translation of the post above:

An Important Survey! On October 21, 2019, the State Duma of the Russian Federation held public hearings on the need for a bill according to which family-household relations in the matter of crime prevention will be regulated separately. That is, it is planned to create separate institutions that regulate family relations between family members. That is, they will get the authority to monitor families and invade families under the pretext of preventing “family” violence. As a result of these hearings, a post was written in the official group of the State Duma with a survey on this issue. And all would be fine, but in a strange way this poll is anonymous. https://vk.com/wall-138347372_555472 And right there it was attacked by bot farms of the defenders of this law who are supported by Soros funds and others like them. Therefore, we have decided to deanonymize voters according to their attitude to this anti-family law and have decided to launch an open poll and invite everyone to participate in an open, rather than anonymous, survey. The Duma group admins should be ashamed of manipulating such an important institution as the family through an anonymous survey.

Initial observations suggested that the activity of Sorok Sorokov’s online community was in line with trends in other Russian ultra-right online groups. The overwhelming majority of these groups’ posts had a relatively modest, but rather stable, number of likes, reposts, and comments. A similar trend can be found for social media pages of state institutions, as well as groups that rely more on textual posts rather than engaging images, videos, or multimedia.

At the same time, some of Sorok Sorokov’s posts caused quite a reaction among its online community members: many of these more controversial posts discussed the domestic violence bill, as well as other issues, such as religion, Western identity politics, gender roles, Russia’s relations with the West, etc. Future research will pay more attention to relations and correlations between post content and community reaction, including by using machine-learning methods and the bag-of-words model.

Automated content analysis of domestic violence bill-related posts

The researchers then conducted an automated content analysis of Sorok Sorokov’s language and rhetoric regarding the Russian domestic violence bill.

In November 2019, there was a heated confrontation between Pushkina, the co-author of the bill, and Orthodox fundamentalists. This confrontation unfurled both online and offline. The Sorok Sorokov movement was especially active in bullying Pushkina and other Russian women’s rights activists. At the end of November 2019, Pushkina submitted a complaint to Russia’s interior affairs minister, asking the country’s national police agency to investigate threats she and others had received. A couple of days later, Sorok Sorokov appealed to the Russian Investigative Committee, the Federal Security Service (FSB), the Prosecutor General’s Office, and the Ministry of Internal Affairs with a call to start criminal proceedings against Pushkina and her lawyer Konstantin Dobrynin on suspicion of false denunciation and slander, as well as incitement to hatred and enmity.

The 15 most frequently used word forms in Sorok Sorokov’s rhetoric on the bill, displayed in the frequency distribution below, correspond to 12 distinct words, which can be translated as follows: (1) November, (2) СБН (a Russian abbreviation for семейно-бытовое насилие, “domestic violence”), (3) law, (4) against, (5) family, (6) coordinates, (7) children, (8) human being, (9) violence, (10) time, (11) God, (12) Russia. Three of these 12 words (law, human being, and God) appear twice in the top 15, each time in a different declension. This is because, given the Natural Language Toolkit’s limitations in handling Russian, the content analysis was conducted on whole words instead of stems.

Frequency distribution of the most commonly used words. (Source: Andrejs Berdnikovs)
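The effect of counting whole words rather than stems can be illustrated with a small sketch. The token list is invented, and the mapping from surface forms to dictionary forms is a hypothetical, hand-made stand-in; the study itself did not merge declensions.

```python
from collections import Counter

# Invented, already-cleaned tokens. "закон"/"закона" and "бог"/"бога" are
# different declensions of the same words ("law" and "God").
tokens = ["закон", "семья", "закона", "против", "закон",
          "бог", "бога", "семья", "закон", "сбн"]

# Frequency distribution over surface forms: declensions are counted separately.
freq = Counter(tokens)
for word, count in freq.most_common(3):
    print(word, count)

# Hypothetical hand-made mapping from surface forms to a dictionary form,
# shown only to illustrate how several forms collapse into one word.
LEMMAS = {"закона": "закон", "бога": "бог"}
merged = Counter()
for word, count in freq.items():
    merged[LEMMAS.get(word, word)] += count
print(merged.most_common(3))
```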

After identifying the most common words, the researchers used some of them as keywords to find relevant 11-word-long extracts, each consisting of (1) the keyword, (2) the five words before the keyword, and (3) the five words after the keyword. In total, 535 relevant extracts were found.

Some of the extracts found using the keyword “сбн,” a Russian-language abbreviation for семейно-бытовое насилие (domestic violence). (Source: Andrejs Berdnikovs)
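The extraction of an 11-word window around a keyword can be sketched as follows. The function name and the placeholder tokens are hypothetical; the article does not show the researchers' implementation.

```python
def keyword_windows(tokens, keyword, width=5):
    """Return the window of up to `width` tokens on each side of every match."""
    windows = []
    for i, token in enumerate(tokens):
        if token == keyword:
            start = max(0, i - width)  # do not run past the start of the text
            windows.append(tokens[start:i + width + 1])
    return windows


# Placeholder tokens w1..w13 with the keyword in the seventh position.
tokens = [f"w{i}" for i in range(1, 14)]
tokens[6] = "сбн"

windows = keyword_windows(tokens, "сбн")
print(windows)
```

Matches close to the beginning or end of the text yield windows shorter than 11 words, so a real pipeline has to decide whether to keep or discard them.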

Because these 11-word-long text excerpts were extracted from the cleaned text, in which stop words and other non-relevant words had been filtered out, they cannot be translated into meaningful English.

From this point onward, for the convenience of an English-speaking reader, examples consist of partially cleaned text, representing full sentences that can be translated into English. It should be noted, however, that most methods in automatic content analysis are more effective if applied to fully cleaned text.

Results

During the analysis of the extracts, four major themes were identified:

1) Rhetorical militancy;

2) Manipulative rhetoric;

3) Anti-Western narrative; and

4) Sexist, homophobic, and transphobic language.

To explore these themes in more detail, the researchers relied on a two-step coding approach, updating the list of keywords to search for new text fragments related to the identified themes. Moreover, to grasp all of the nuances fully, they used a partially cleaned text that includes not only the Cyrillic letters, but also numeric characters and punctuation. This allows for the analysis of complete expressions, as well as translation of relevant text excerpts into English.
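A keyword-driven search for theme-related sentences in partially cleaned text could look like the sketch below. The keyword lists, the sample text, and the use of prefix matching to catch different declensions are all assumptions made for illustration; the article only states that several techniques were combined.

```python
import re

# Hypothetical keyword prefixes per theme; the study's lists are not published here.
THEME_KEYWORDS = {
    "rhetorical militancy": ["сопротивлен", "геноцид"],
    "anti-Western narrative": ["сша", "сорос"],
}


def compile_patterns(theme_keywords):
    """Build one regex per theme that matches any word starting with a keyword."""
    return {
        theme: re.compile(r"\b(?:" + "|".join(re.escape(k) for k in kws) + r")\w*")
        for theme, kws in theme_keywords.items()
    }


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_theme_sentences(text, theme_keywords):
    """Map each theme to the full sentences that contain one of its keywords."""
    patterns = compile_patterns(theme_keywords)
    hits = {theme: [] for theme in theme_keywords}
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for theme, pattern in patterns.items():
            if pattern.search(lowered):
                hits[theme].append(sentence)
    return hits


# Invented sample text that keeps digits and punctuation, as partially cleaned text would.
text = ("23 ноября пройдет акция сопротивления! "
        "Автор закона побывал в США. Погода сегодня хорошая.")
hits = find_theme_sentences(text, THEME_KEYWORDS)
print(hits)
```

In a two-step approach, sentences found in a first pass would be read for new recurring terms, which are then added to the keyword lists before a second pass.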

Text excerpts that match one of the four identified themes, rhetorical militancy. (Source: Andrejs Berdnikovs)

An approximate translation of these text excerpts is as follows:

‘Thanks to all those who helped the Sorok Sorokov Movement in organizing a popular action of resistance to the domestic violence law!’

‘On November 23, we will take part in the ‘For Family!’ action resisting the genocide of the Russian people!’

‘On November 23, everyone who considers family a value must take part in our action of resistance!’

‘More than 30 cities will also take part in one way or another in the action of resistance to the anti-family and anti-people domestic violence bill, which will destroy our families and drag Russia into Sodom’

‘Let’s go out to the streets in our cities on November 23 to take action against the adoption of this monstrous, anti-family law promoted by feminists and homosexuals, such as Alan Eroh’

‘Attention! Civil resistance rally ‘For Family!’. No genocide of the Russian people! The Sorok Sorokov Movement announces the all-Russian rally ‘For Family!’ scheduled for November 23, at 1:00 p.m.!’

‘On November 23, in all cities of Russia we should take part in actions of resistance to this monstrous and fascist domestic violence law’

‘But in the near future, we will announce a large-scale popular action of resistance to this anti-people and fascist domestic violence law’

These post fragments reveal a sharp militant rhetoric, describing the domestic violence bill as “anti-people and fascist” and calling followers to take part in the action “resisting the genocide of the Russian people.”

This confrontational stance is supported by manipulative rhetoric exploiting the same stereotypes that are routinely used by the Kremlin’s propaganda arms for popular mobilization purposes. Referencing World War II is standard practice in this regard, and it is clearly expressed in some of Sorok Sorokov’s statements, such as “we must act as a broad popular front against this liberal fascism, which wants to accomplish what neither Napoleon nor Hitler could accomplish: to destroy the RUSSIAN PEOPLE!”

An excerpt that matches the theme of manipulative rhetoric, as partially translated above. (Source: Andrejs Berdnikovs)

Sorok Sorokov’s posts are also full of anti-Western propaganda and misinformation. One excerpt alleged that “the United States [was] the author of this law.” Anti-Western paranoia also surfaced in public calls to investigate who paid for Pushkina’s long stay in the United States in the 1990s, as well as to check which “homosexual ideological organizations” she engaged with and “with whom from the U.S. intelligence agencies or other organizations engaged in subversive activities against Russia did she meet at those times or later.”

The texts also heavily featured sexist, homophobic, and transphobic language. To capture the range of relevant content in Russian media, one has to use as keywords not only standard terms such as феминистки (“feminists”), гомосексуалисты (“homosexuals”), etc., but also more widespread pejoratives, such as либерал-фашизм (“liberal fascism”), содомиты (“sodomites”), сатанисты (“Satanists”), гомосеки (“faggots”), and many other words that are not always easy to translate into English.

Future research could develop a “sexist/homophobic/transphobic glossary” (keywords, phrases, concepts, pejoratives, etc.) for the Russian language that could be used in further research on Russian media. While many similar lists exist in English, no such glossary has been developed for the Russian language yet.

Conclusions and implications

There is a significant amount of research examining the use of gender stereotypes as a basis of political legitimation in contemporary Russia. For example, arguing that gender norms are most easily invoked as tools of authority-building when there exists widespread popular acceptance of misogyny and homophobia, Valerie Sperling studied the ways in which sexism and homophobia were reflected in Russia’s public sphere. She found that masculinity played a crucial role in legitimizing Vladimir Putin’s regime. Sperling, as well as scholars such as Heleen Zorgdrager and Hanna Stähle, have also paid considerable attention to the growing role of the Russian Orthodox Church as a powerful force in producing homophobic discourse and reinforcing traditional gender roles in Russia.

This article deals with a less explored topic — sexist, homophobic, and transphobic language used by an Orthodox vigilante militia group. While a more detailed comparative study is needed, the rhetoric of these groups appears to possess some defined characteristics, while lacking others. For example, Sorok Sorokov’s online posts do not make use of the prison slang and inmate jargon that can be found even in Putin’s official addresses. In this slang’s place, Sorok Sorokov’s messages to Russian society are full of religious invocation: the word “God” is one of the most frequently used. Despite its stylistic distinctness, however, the thematic focus of Sorok Sorokov coheres with the framework of standard Kremlin propaganda.

This analysis identified four major themes in Sorok Sorokov’s online posts: rhetorical militancy, manipulative rhetoric, anti-Western narrative, and sexist, homophobic, and transphobic language. A more in-depth analysis may identify a much larger number of topics and subtopics. Advanced methods in automated content analysis make it possible to do this quickly and on very large volumes of textual data.

This project hopes to contribute to the development of Queer Data Science, “a data science that is not controlling, eliminationist, assimilatory” and that refrains from reducing humanity down to what can be counted. Queer Data Science belongs firmly within the burgeoning “Data Science for Social Good” community, and the researchers strongly believe that data should be used for public benefit, including empowering the most marginalized and vulnerable among us.

Andrejs Berdnikovs, Ph.D., is a social movement scholar, queer data science enthusiast, data analyst, and editor at the European Journalism Observatory (EJO).

Pavel Marozau is a social media activist, data analyst and founder of ARU TV.

The project is part of a collaboration between ARU TV and the research community that has been under way since 2017 and has so far focused on various social media-related issues, such as the use of political humor by Russian speakers in Estonia and Latvia to resist the Kremlin’s propaganda.

@AtlanticCouncil's Digital Forensic Research Lab. Catalyzing a global network of digital forensic researchers, following conflicts in real time.