Data-Driven Propaganda as a Subset of Adversarial Examples

When Chaos Isn’t Merely Chaos

Matt Brockman
May 27, 2018

During the 2016 Presidential Election, the Russians conducted an information warfare attack against the American population. That’s not debated; Facebook has the receipts and Mueller has the indictments. What is debated is whether or not the propaganda campaign had an impact on the overall vote and the resulting policy decisions.

One of the barriers to having a meaningful discussion about the impact of the 2016 propaganda is that we have no framework for how to evaluate the news before or after the propaganda campaign. We know we had a bunch of right-wing, left-wing, and centrist news organizations beforehand (along with our friends on social media), and we have a bunch of right-wing, left-wing, and centrist news organizations afterward (along with our friends on social media). The situations before and afterward look the same.

This parallels a relatively new area of research in machine learning called adversarial examples. Adversarial examples are a problem for machine learning (a subset of artificial intelligence) in which a computer correctly interprets data under normal conditions, but an attacker can make slight, purposeful changes to the input data that cause the computer to interpret it differently. Likewise, in the election, the propaganda attempted to make purposeful changes to the beliefs of certain individuals, with the possibility of changing the overall outcome.

Understanding the nature of the problem is the first step to formulating a real discussion. There’s no silver bullet that will solve this, and we’ve got about five months until the next election.

Creating a Model of the World

Back in 1945, the Nobel Prize-winning economist Friedrich Hayek looked at how knowledge informs decision making in The Use of Knowledge in Society. Hayek argued that there’s a lot of knowledge in the world, but it’s distributed across individuals. In other words, there’s a reason doctors go to school for years for their medical degrees, historians for their degrees, mathematicians for their degrees, and so forth. That knowledge is specialized, and most people know only a little outside their specialty.

We can think of everything there is to know about the world as a large array of k items, where k is the total number of things that people know. For instance, that it rained in some city on a certain day is one bit of knowledge; that the earth is round is another. We can model this as a finite array of k bits of knowledge.

Of course, we’re also wrong about things. So instead of looking only at knowledge, we can look at people’s beliefs and model them the same way, as a finite array of beliefs.

We can generate this array for everyone’s beliefs. Accessing those beliefs can be hard. Luckily, large advertisers and data brokers collect all this data. If you can get hold of it, you can create really large matrices of individuals’ beliefs.
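To make the arrays described above a bit more concrete, here is a minimal Python sketch. The items, the names (`ITEMS`, `knowledge`, `beliefs`, `population`), and every value are invented for illustration; real data sets would be far larger and messier.

```python
# Hypothetical items that could be known or believed (k = 4 here).
ITEMS = [
    "it rained in some city on a certain day",
    "the earth is round",
    "candidate A supports policy X",
    "candidate B supports policy Y",
]

# Knowledge: 1 if the item is true, 0 if not (values invented for illustration).
knowledge = [1, 1, 1, 0]

# One person's beliefs about the same items; beliefs can be wrong.
beliefs = [1, 1, 0, 0]

# A population is a matrix: one row of beliefs per individual.
population = [
    [1, 1, 0, 0],  # person 0
    [0, 1, 1, 0],  # person 1
    [1, 1, 1, 1],  # person 2
]

def mistaken_items(beliefs, knowledge):
    """Return the indices where a belief disagrees with the knowledge array."""
    return [i for i, (b, k) in enumerate(zip(beliefs, knowledge)) if b != k]

def share_believing(population, item_index):
    """Return the fraction of the population that believes a given item."""
    return sum(row[item_index] for row in population) / len(population)

print(mistaken_items(beliefs, knowledge))
print(share_believing(population, 1))
```

Each column of `population` corresponds to one item, so questions like "what share of people believe item 1?" become simple column sums.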

Facebook uses this sort of data to train models to predict user behavior for advertisers. Netflix uses this data to train models to suggest better films for you to watch. The government uses this data to predict if you’re going to commit a crime or not. It has lots of uses!

In a democracy, one of the areas where these individual beliefs come into play is elections. The government doesn’t collect all of people’s beliefs, just a subset. People get together, write down their preferences on a piece of paper, and all that data ends up deciding who the population’s representatives are going to be. Those representatives then make policy. The system as a whole takes the preferences of the population as its input and spits out policies as its output.
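As a toy illustration of "preferences in, policy out", here is a short Python sketch. The plurality rule, the candidate names, and the `POLICY_OF` mapping are all assumptions made for the example; real electoral systems are far more complicated.

```python
from collections import Counter

def elect(ballots):
    """Return the candidate with the most votes (plurality rule, assumed here)."""
    counts = Counter(ballots)
    winner, _ = counts.most_common(1)[0]
    return winner

# Hypothetical mapping from the winning representative to the policy enacted.
POLICY_OF = {"A": "policy X", "B": "policy Y"}

def system(ballots):
    """The whole system as one function: preferences in, policy out."""
    return POLICY_OF[elect(ballots)]

ballots = ["A", "B", "A", "B", "A"]
print(system(ballots))
```

Treating the election as a function is what lets us later ask which inputs an attacker would need to change to get a different output.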

People try to modify the results of the system all the time. There are legitimate ways to influence government actions by changing the beliefs of individuals. People use political action committees to raise money to influence elections and future policy. They use lobbyists to persuade elected politicians and help produce legislation. Republicans and Democrats alike use population data to try to prod voters to the polls. Sometimes these efforts work, sometimes they don’t.

But there are less legitimate ways to manipulate the process. For instance, in the 2016 election, the Russians used fake accounts to spread both ads and organic content with targeted messages to potential voters. This sort of propaganda is called black propaganda, because the people behind it hide their identities and do not identify the propaganda as such.

Looking at the Flow of Information in Elections

In theory, voters select representatives in an election by deciding who will best represent their interests. The representatives make a bunch of policy decisions, which results in a new state of the world. After some amount of time, the voters then vote for their representatives again.

Voters vote for policy makers based on their preferences and the state of the world, then the policy makers make policy that changes the state of the world, and the process continues.

One problem is that people don’t directly experience the world; they interpret the world from sources like the Internet and social interactions. The propagandist can attempt to use information to change people’s perceptions of the state of the world. They can do this through a variety of means, including computational propaganda (using bots on social media), fake news, leaked documents, rhetoric, and so forth. This creates a new perceived state of the world that flips the preferences of certain voters. Those flips, while individually insignificant, can add up to different policy decisions.

Perceptions of the world are modified by propaganda, creating new perceptions of the world, which voters use to elect policy makers, who create policy.
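The flow above can be sketched in a few lines of Python. Everything here is hypothetical: the single `economy_good` fact, the voter rule in `vote`, and the idea that a message simply overwrites a perception are simplifications chosen to show where propaganda enters the loop, not a claim about how real voters behave.

```python
def perceive(world_state, messages):
    """A voter's perceived state: the real state, overwritten by any messages received."""
    perceived = dict(world_state)
    perceived.update(messages)
    return perceived

def vote(perceived):
    """Hypothetical voter rule: back the incumbent only if the economy looks good."""
    return "incumbent" if perceived["economy_good"] else "challenger"

world = {"economy_good": True}

# Without propaganda, the voter perceives the world as it is.
honest_vote = vote(perceive(world, {}))

# With propaganda, the world is unchanged but the perception is not.
swayed_vote = vote(perceive(world, {"economy_good": False}))

print(honest_vote, swayed_vote)
```

Note that `world` itself never changes; only the perceived copy does, which is why the before and after can look the same from the outside.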

We can see that propaganda tries to impact our beliefs. We don’t know how. When I see one of the Facebook ads that Congress posted from the 2016 Elections, I have no idea how that influences voters. Sure, I can say that it looks like it’s causing divisions in some population, but I don’t understand how the overall sum of the ads (much less the organic content or influence from other vectors) actually impacts the overall system.

We have not developed tools to understand the impact of the propaganda. Luckily, there’s some ongoing research on an analogous problem in machine learning: adversarial examples.

Adversarial Examples

The first use of the term ‘adversarial examples’ occurred in a 2013 paper by researchers at Google, NYU, and the University of Montreal called Intriguing properties of neural networks (Szegedy et al., 2014). Szegedy et al. were playing with a form of machine learning called neural networks. Neural networks take in a matrix, do a bunch of matrix manipulations, and spit out an answer. It turns out that these neural networks can do a really good job of identifying (or, in computer science terms, classifying) things like images.

A digital image is represented as a big matrix, where each pixel on your screen is some combination of red, green, and blue (you may have heard the term ‘RGB’ before when talking about monitors or your TV screen).

This can be represented as a matrix of RGB values, like this:

Example RGB (Red, Green, Blue) values for coordinates on a screen
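For readers who prefer code, a tiny image can be written out by hand. This is only a sketch: the 2x2 size and the colors are made up, and `flatten` is a hypothetical helper showing the single long list of numbers that a classifier typically consumes.

```python
# A 2x2 image: each pixel is a (red, green, blue) triple with values from 0 to 255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red pixel, green pixel
    [(0, 0, 255), (255, 255, 255)],  # blue pixel, white pixel
]

def flatten(image):
    """Turn the grid of pixels into one long list of channel values."""
    return [channel for row in image for pixel in row for channel in pixel]

print(image[0][1])          # the pixel at row 0, column 1
print(len(flatten(image)))  # 2 rows * 2 columns * 3 channels
```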

What Szegedy et al. found (see the image below) was that they could build a machine learning classifier that did a really good job of classifying (or identifying) images (left column). However, they could then change some of the pixels (the changes are shown in the center column), resulting in an image (right column) that was pretty much indistinguishable to a human from the original.

Except the computer would start mis-classifying the image after the change.

Images on the left were properly identified by a neural network. Images on the right, modified by adding the pixels from the middle column, were improperly identified by the neural network. From Szegedy et al. (2014).
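The effect can be reproduced on a toy scale. The sketch below is not the procedure Szegedy et al. used on neural networks; it swaps in a much simpler stand-in, a hand-made linear classifier attacked by nudging every input slightly against the sign of its weight. The weights, the "cat" and "dog" labels, and the pixel values are all invented for illustration.

```python
def score(weights, pixels):
    """Linear score: positive means 'cat', otherwise 'dog' (labels are hypothetical)."""
    return sum(w * p for w, p in zip(weights, pixels))

def classify(weights, pixels):
    return "cat" if score(weights, pixels) > 0 else "dog"

def sign(w):
    """Return 1, 0, or -1 depending on the sign of w."""
    return (w > 0) - (w < 0)

def perturb(weights, pixels, epsilon):
    """Nudge every pixel by at most epsilon in the direction that lowers the score."""
    return [p - epsilon * sign(w) for w, p in zip(weights, pixels)]

# Toy model and toy "image" with 100 intensities scaled to the range 0 to 1.
weights = [1.0, -1.0] * 50
pixels = [0.52, 0.5] * 50

adversarial = perturb(weights, pixels, epsilon=0.02)

print(classify(weights, pixels))       # original classification
print(classify(weights, adversarial))  # classification after the nudge
print(max(abs(a - p) for a, p in zip(adversarial, pixels)))  # largest single change
```

No single value moves by more than 0.02 on a 0 to 1 scale, yet the many tiny changes add up across 100 inputs and flip the label. That accumulation across many inputs is the intuition worth carrying over to populations.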

These adversarial examples are problematic for machine learning. From a safety perspective, if a prankster can modify road signs, they can make self-driving cars start behaving oddly without people understanding why. From a financial perspective, actors can try to use adversarial examples to trick financial trading algorithms into making certain trades.

Now, at first it may seem that this is a problem limited to computers. After all, the modifications to the images above likely aren’t tricking you, the reader.

However, this is the same process we saw above with propaganda. There was an initial state of the voter preferences, propaganda changing some of those preferences, a modified state of voter preferences, selection of representatives based on the modified preferences, and finally policy decisions being made.

Data-Driven Propaganda as a Subset of Adversarial Examples

What we saw with adversarial examples is that the attacker just needs to identify some specific elements of the overall matrix to change in order for the system to behave differently. The Russians and other actors have the data to model populations based on the information collected from data breaches that have been occurring for the past decade.

When a propagandist is trying to covertly change the behavior of a population, we have a situation analogous to the problem of adversarial examples. Using population data, the propagandist identifies key beliefs to change within the population that will change the overall behavior of the population, without having to change the majority of individual beliefs. The result can understandably look like random, chaotic messaging that merely amplifies divisions, but it is targeted at creating real policy changes in the same way the targeted pixel modifications changed the algorithm’s classification.
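Here is a minimal sketch of that targeting logic, under heavy assumptions: each voter is reduced to a single hypothetical "margin" (positive leans toward option A, negative toward B), the outcome is decided by simple majority, and a targeted message is assumed to flip whoever it reaches. None of this describes any real campaign; it only shows why an attacker with population data would go after the few weakly committed voters rather than the majority.

```python
def outcome(margins):
    """Majority rule: option 'A' wins if more voters lean positive than not."""
    for_a = sum(1 for m in margins if m > 0)
    return "A" if for_a > len(margins) - for_a else "B"

def cheapest_flips(margins):
    """Flip the most weakly committed A voters, one at a time, until A no longer wins."""
    supporters = sorted(
        (i for i, m in enumerate(margins) if m > 0),
        key=lambda i: margins[i],
    )
    shifted = list(margins)
    targets = []
    for i in supporters:
        if outcome(shifted) != "A":
            break
        shifted[i] = -shifted[i]  # the targeted message flips this voter's preference
        targets.append(i)
    return targets, shifted

# Hypothetical population of nine voters (values invented for illustration).
margins = [0.9, 0.1, -0.8, 0.05, -0.6, 0.7, -0.9, 0.8, -0.3]

targets, shifted = cheapest_flips(margins)
print(outcome(margins), outcome(shifted), targets)
```

In this toy population, changing one voter out of nine is enough to change the result, while the other eight preferences are untouched.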

One problem for talking about the impact of all of this is that we don’t know the full state of beliefs before or after the propaganda, so we don’t even have a place to start. Another problem is that even if we had those states, views evolve over time as the world changes, so it’s difficult to separate propaganda effects from non-propaganda effects. Finally, it’s not just the Russians trying to influence beliefs: political campaigns, interest groups, and foreign actors are all trying to influence beliefs, so picking out specifically *Russian* propaganda effects from the rest is a difficult problem in and of itself.

One difference between the way that we understand adversarial examples with images and with voter preferences is that it is easier to generate examples for images. With image recognition neural networks, attackers can run examples through the networks to find the minimal changes to an image required to cause a misclassification (there’s a bit of math involved; see Towards Evaluating the Robustness of Neural Networks, 2016, by Carlini and Wagner from Berkeley for some different ways of doing that).
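To give a feel for what "finding the minimal change" means, here is a deliberately simple sketch. It is not one of the methods from Carlini and Wagner; it bisects on the size of a sign-based nudge against a toy linear model, and it assumes the starting upper bound `high` is already large enough for the attack to succeed. The model, the data, and all the names are invented for illustration.

```python
def score(weights, pixels):
    """Linear score of a toy classifier: positive is one class, otherwise the other."""
    return sum(w * p for w, p in zip(weights, pixels))

def attack(weights, pixels, epsilon):
    """Shift each pixel by epsilon against the sign of its weight."""
    return [p - epsilon * ((w > 0) - (w < 0)) for w, p in zip(weights, pixels)]

def smallest_epsilon(weights, pixels, high=1.0, steps=30):
    """Bisect for the smallest per-pixel change that pushes the score to zero or below."""
    low = 0.0
    for _ in range(steps):
        mid = (low + high) / 2
        if score(weights, attack(weights, pixels, mid)) <= 0:
            high = mid  # attack succeeded, try a smaller change
        else:
            low = mid   # attack failed, a larger change is needed
    return high

weights = [1.0, -1.0] * 50
pixels = [0.52, 0.5] * 50

eps = smallest_epsilon(weights, pixels)
print(eps)
```

The attacker here gets to query the model as often as it likes, which is exactly the luxury a propagandist working on a real population does not have.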

With propaganda, it’s a bit harder to generate the examples. The gap may exist simply because people generally don’t put a lot of effort into undermining democracy, or if they do, they don’t publish. There is propaganda that simply doesn’t do a very good job of changing people’s minds. For instance, I could go and create a fake ad campaign right now, and it probably wouldn’t change any preferences unless I got lucky. Even then, whatever views I shifted probably wouldn’t shift policy. That would be the equivalent of changing a few pixels in a picture and hoping it tricks a computer.

What this article is specifically referring to is propaganda campaigns driven by data, where the propagandist has the set of preferences of a population, predicts future policy decisions, and then uses messaging generated from the population data to selectively shift enough views that the predicted policy decisions change.

This requires a predictive model of the population and of the interaction between population preferences and policy. In this case, shifting a single preference in the population is the equivalent of changing a single pixel in an image. If too many preferences get changed, policy merely reflects the desires of the population; if too many pixels get changed, the image itself is changed and is no longer misclassified. More importantly, once the propagandist starts trying to manipulate large numbers of preferences, the propagandist has to compete with campaigns and political action committees that are throwing around a lot of money to persuade people to their own beliefs.

Adversarial examples are an unsolved attack that can have real world effects on complex systems. We don’t want attackers to be able to trick our trading algorithms or self-driving cars. Likewise, we don’t want attackers to be able to abuse our democratic processes and create unwanted policies. Identifying the problem is only the first step to having a meaningful discussion about it.


An online platform for thought-provoking, critical, and contextual articles on politics, society, and policy.

Written by Matt Brockman. Grad student computationally studying framing in the media.


