We think the Russian Trolls are still out there: attempts to identify them with ML
In the last two years, evidence has emerged that accounts controlled by the Russian Internet Research Agency (IRA) are operating on Twitter, likely with the goal of interfering with American democratic processes.
With the US 2018 election on our minds, we have built machine learning (ML) tools to label accounts still operating today as possible Russian trolls. Looking simultaneously at their behavior and language use, our machine learning model examines whether accounts operate in ways highly similar to the earlier group of trolls released by 538. While this is early work, we believe based on these findings that many new Russian troll accounts are likely active on Twitter, and currently working to shape American political discourse.
Summary of quantitative findings
We have initially run the model — which we describe in some detail below — on the mentions of 11 high-profile journalists on Twitter. The model predicted 249 of the 11,843 mentioning accounts as suspected Russian trolls, an average of 25 per journalist. 10 of the 11 journalists had at least one suspected Russian troll trying to get their attention in their recent mentions. Due to noise in our prediction process (more details below), after a process of human review, we believe that 50–70% of these model-flagged accounts are highly likely to be Russian trolls (i.e., 13–18 per journalist).
Examples of ML-flagged accounts
For any given account, the ML model looks at a variety of historical behaviors associated with the account. It then outputs a probability corresponding to the model’s judgment that this account is a Russian troll.
In a striking coincidence, the New York Times yesterday ran a story and podcast about Alexander Malkevich, the founder of the Russian website USAReally and a suspected actor in Russian online propaganda.
Our model had, of course, no knowledge of this story, but processed the account because of the reply above. The model predicted @McCevich (his account, or one purporting to be him) as a troll with a probability of 100%. We believe this provides face validity for the model’s decision boundary.
This account was made in July 2017 and has tweeted over 18K tweets since then. It tweets very frequently to well-known news accounts (approximately 17 tweets/hour).
Not all accounts tweet incredibly often. @Rafael54356577, however, often repeats similar messages in reply to many different actors on Twitter.
The accounts aren’t simply bots.
Our colleagues at Indiana University have done years of work identifying bots on Twitter using statistical methods. An interesting, natural question is: “Are the accounts flagged as Russian trolls just bot accounts?” The IU researchers provide a tool to the public called Botometer that will assess an account for bot characteristics.
While the ML-flagged accounts do exhibit some bot-like characteristics, it’s not a dominant behavior. This would seem essential both to defeat bot-detection software presumably already running on Twitter and to lend authenticity in conversations. One working hypothesis is that the Russian trolls might be software-assisted human workers.
Next, we briefly discuss the data and models we brought to bear on this problem. Since this is early work, we are only providing an outline of the technical work.
Troll dataset. As part of the Mueller investigation, Twitter identified specific accounts controlled by the IRA; the news organization 538 obtained the list, and released data on these 2,848 Twitter accounts, all of whom were active during 2015–2017.
Distractor (“non-troll”) dataset. We assembled a set of approximately 170K “distractor accounts” to serve as counterexamples to the ones in the 538 dataset. These are randomly selected Twitter accounts that had at some point tweeted from the United States, and had tweeted at least 5 times over their lifetime on Twitter between 2012 and 2017. The list of these randomly selected accounts was provided by David Jurgens. In other words, they serve as “normal, everyday” accounts. Our model’s task is to distinguish the 2.8K known Russian troll accounts from this much larger set of 170K distractor accounts.
The model is a relatively simple logistic regression (for interpretability) built on a bag-of-words representation.
- Behavioral features: For example, the model looks at tweet frequency, the rate of retweeting, following count, etc.
- Language features: In addition, the model examines both the distribution of languages present on an account’s timeline (e.g., English + Hungarian + Italian), as well as specific term use from a dictionary of 5,000 highly important terms (e.g., “Trump”, “police”, “protest”, “MAGA”). These words were drawn from data, not imposed by the researchers a priori. In total, the model learned from over 17GB of quantitative and textual data.
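To make the two feature families concrete, here is a minimal sketch of how an account might be turned into a combined behavioral + bag-of-words feature vector. The feature names, the toy account, and the four-term vocabulary are illustrative stand-ins, not the actual 5,000-term dictionary or pipeline used in this work.

```python
# Sketch: combine behavioral features with bag-of-words term counts for one
# account. All names and the vocabulary below are illustrative assumptions.

VOCAB = ["trump", "police", "protest", "maga"]  # stand-in for the 5,000-term dictionary

def featurize(account):
    # Behavioral features: tweet frequency, retweet rate, following count
    behavioral = [
        account["tweets_per_hour"],
        account["retweet_rate"],
        account["following_count"],
    ]
    # Bag-of-words features: how often each vocabulary term appears
    # across the account's timeline (case-insensitive)
    words = " ".join(account["tweets"]).lower().split()
    bow = [words.count(term) for term in VOCAB]
    return behavioral + bow

account = {
    "tweets_per_hour": 17.0,
    "retweet_rate": 0.6,
    "following_count": 1200,
    "tweets": ["MAGA MAGA rally tonight", "police at the protest"],
}
print(featurize(account))  # [17.0, 0.6, 1200, 0, 1, 1, 2]
```

A logistic regression trained on vectors like these yields a per-account probability, and its learned weights stay interpretable: each behavioral feature and each term gets a coefficient one can inspect.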
Cross-validation. Cross-validation partitions a dataset into complementary subsets, training the model on one subset (the training set) and validating it on the other (the testing set). Using 10-fold cross-validation (1/10 of the dataset is used as the testing set and the remaining 9/10 as training data, repeated 10 times with the resulting scores averaged), we see the model is highly accurate — it predicts the correct answer 99.5% of the time. However, this isn’t the most instructive metric given the imbalance in the data.
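The splitting logic behind 10-fold cross-validation can be sketched in a few lines. This is a generic illustration of the procedure, not the authors’ exact pipeline:

```python
# Sketch: 10-fold cross-validation splitting. Each fold serves once as the
# test set while the other nine folds form the training set; the model is
# trained and scored 10 times and the scores are averaged.

def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    fold_size = n_samples // k
    for i in range(k):
        test = indices[i * fold_size:(i + 1) * fold_size]
        train = indices[:i * fold_size] + indices[(i + 1) * fold_size:]
        yield train, test

folds = list(k_fold_indices(100, k=10))
print(len(folds))        # 10 folds
print(len(folds[0][1]))  # 10 test samples in each fold
print(len(folds[0][0]))  # 90 training samples in each fold
```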
Instead, you could ask: When the model makes a prediction of “likely Russian troll,” how often is it correct? This is known as “precision” in the field of machine learning. In our training data (the 538 dataset), the model is correct in this case 80% of the time.
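A small back-of-the-envelope calculation, using the dataset sizes from the text, shows why precision is the more instructive metric here:

```python
# Sketch: why precision matters more than raw accuracy on imbalanced data.
# With ~2.8K trolls vs ~170K distractors, a useless classifier that never
# predicts "troll" is still ~98% accurate.

n_trolls, n_distractors = 2848, 170000
total = n_trolls + n_distractors

# Accuracy of an "always not-troll" baseline: high, yet it finds zero trolls
baseline_accuracy = n_distractors / total
print(round(baseline_accuracy, 3))  # 0.984

# Precision asks instead: of the accounts flagged as trolls, what share
# really are trolls?
def precision(true_positives, false_positives):
    return true_positives / (true_positives + false_positives)

# e.g., if 100 flagged accounts include 80 real trolls, precision is 0.8
print(precision(80, 20))  # 0.8
```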
Predicting out-of-sample accounts
We expected the performance shown above to drop when predicting “out-of-sample” (that is, on new, unseen accounts in the real world), and it did. The unseen accounts were the 11,843 accounts mentioning 11 high-profile journalists on Twitter. The model predicted 249 of the 11,843 to be trolls. 10 of the 11 journalists had at least one suspected Russian troll trying to get their attention in their recent mentions. We believe after human review that 50–70% of these model-flagged accounts are highly likely to be Russian trolls (i.e., 13–18 per journalist). This is a large range, reflecting the early and fluid nature of this project.
This is early work, and there are many things we do not know. How widespread is the problem? Who do the suspected Russian trolls try to communicate with? What are their goals? How long do they last on the site before Twitter deactivates them, if ever? What networks are these accounts embedded in, and how does that differentiate them from other users?
We would love feedback on this work. As journalists often use tweets in their practice, we would be happy to run this model one-off on their mentions in the run-up to the election. Please feel free to contact Eric Gilbert by email (available on linked site) to answer any further questions about the work.
Eric Gilbert is the John Derby Evans Associate Professor in the School of Information, and Professor of EECS, at the University of Michigan.
David Jurgens is an Assistant Professor in the School of Information and EECS at the University of Michigan.
Libby Hemphill is an Associate Professor in the School of Information, a Research Associate Professor at ISR, and the Director of the Resource Center for Minority Data at ICPSR, all at Michigan.
Eshwar Chandrasekharan is a PhD student in the School of Interactive Computing at Georgia Tech. He is advised by Eric Gilbert.
Jane Im is a PhD student in the School of Information at the University of Michigan. She is co-advised by David Jurgens and Eric Gilbert.