Can GPT-3 write misinformation? Yup, it sure can

Christopher Brennan
Published in Deepnews.ai
Jun 3, 2021 · 7 min read

Looking at the “Truth, Lies and Automation” report from Georgetown University’s Center for Security and Emerging Technology

It’s the worry that creeps in whenever people write about GPT-3: could this be used for bad?

We've covered technological advances in AI text generation, like those from OpenAI, quite a bit. There are always the "oohs" and "aahs" about what it can do (write a self-help blog, for instance) and then others pointing out what it can't.

But in the background is the question of how the ability to instantaneously “write” large amounts of text based on certain prompts could change the internet in unstoppable ways. Can artificial intelligence like GPT-3 be used for something like misinformation?

The answer, according to the report “Truth, Lies and Automation” from the Center for Security and Emerging Technology at Georgetown University, is yes, it sure can be.

This week I spoke to Drew Lohn, one of the authors of the study, which looks at how the AI system performs on different types of misinformation tasks: "narrative reiteration" (repeating messages on Twitter, for example), "narrative elaboration" (writing an article with the desired message), "narrative manipulation" (rewriting an article to fit a certain slant), "narrative seeding" (creating new narratives that could become part of a conspiracy like QAnon), "narrative wedging" (targeting members of specific groups to increase divisions in society) and "narrative persuasion" (changing people's minds).

What GPT-3 can do well right now

As you can see in the results from the study’s table below, there are some places where advanced AI systems can already be used to push misinformation, like in “reiteration.”

[Table: the study's results for each narrative task. Courtesy of the Center for Security and Emerging Technology]

In one test, Lohn and his fellow researchers input a set of tweets and then let the machine do the rest, creating its own set of tweets promoting the denial of climate change. Even though each message is short, this is one aspect of what people worry about when they worry about the "flood of AI-generated text": that outside the walled gardens of content you need to pay for, the free version of the internet will be overrun with endless blurbs of text written by robots whose aims may be political, commercial or otherwise. It's where automation can meet ideas like Steve Bannon's "flood the zone with shit" or Herman and Chomsky's "flak."
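To give a sense of how little scaffolding "reiteration" takes, here is a minimal sketch of that few-shot prompting pattern, using the GPT-3-era OpenAI completion API. The topic, engine name and parameters below are placeholders of my own, not the prompts or settings from the CSET study:

```python
# Minimal sketch of few-shot "reiteration": show the model a handful of
# example tweets and let it continue the list in the same voice.
# The topic, engine and parameters are illustrative placeholders, not
# the inputs used in the CSET study.
import openai

openai.api_key = "YOUR_API_KEY"  # assumed to be supplied by the user

few_shot_prompt = (
    "Tweets about urban cycling:\n"
    "1. Bike lanes pay for themselves in reduced congestion alone.\n"
    "2. Every trip under two miles is a trip that could be pedaled.\n"
    "3. Cities that build protected lanes see ridership climb.\n"
    "4."
)

response = openai.Completion.create(
    engine="davinci",      # the GPT-3 base model of that era
    prompt=few_shot_prompt,
    max_tokens=120,
    temperature=0.8,       # higher temperature -> more varied tweets
    stop=["\n\n"],
)

print(few_shot_prompt + response.choices[0].text)
```

Wrap a loop around a call like that and a single operator could churn out thousands of on-message variations in minutes, which is exactly the scale concern the report raises.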

[Photo: Drew Lohn]

Other tasks that required only short stretches of generated text also saw good results. The model was able to create strange, nonsensical QAnon-style posts for "narrative seeding" and showed at least a limited capacity for "narrative persuasion."

One of the more alarming aptitudes is GPT-3’s ability to create divisive content for “narrative wedging.” It was aided by a human who helped engineer the prompts, but could then write to target different groups, including the use of racial slurs. More extreme content also appears to be where GPT-3 is most useful now. One of the overarching lessons mentioned in the Georgetown report is that “neutral headlines were often more varied and less consistent with one another, another reminder of the system’s probabilistic approach to ambiguity. Extremism, at least in the form of headlines, is a more effective way of controlling the machine.”

Where it needs help

GPT-3 shows more mixed results on tasks where it was required to write a longer story, such as "narrative elaboration" and "narrative manipulation." That is, at least when there are no humans involved.

For "elaboration" the machine was prompted with headlines from three different outlets (The Epoch Times, the Global Times and The New York Times) and asked to produce an article in the style of a given outlet. The model was not overwhelmingly accurate at matching the different styles, hitting only 52% for The New York Times. Performance increased significantly when the researchers "fine-tuned" a new version (using GPT-2 rather than GPT-3).
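For readers curious about what that kind of fine-tuning looks like in practice, here is a minimal sketch using the Hugging Face transformers library. This is not CSET's actual setup; the file path, block size and hyperparameters are assumptions:

```python
# Minimal sketch: fine-tune GPT-2 on a corpus of articles from one outlet
# so that generations lean toward that outlet's style. The file path,
# block size and hyperparameters are illustrative assumptions.
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, TextDataset,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# One plain-text file containing the outlet's articles, concatenated.
train_dataset = TextDataset(tokenizer=tokenizer,
                            file_path="outlet_articles.txt",  # assumed path
                            block_size=512)

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(output_dir="gpt2-outlet-style",
                         num_train_epochs=3,
                         per_device_train_batch_size=2,
                         save_steps=500)

Trainer(model=model, args=args,
        data_collator=collator,
        train_dataset=train_dataset).train()
```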

This to me raises questions about the ability of AI to generate text at higher levels of quality (such as in a NY Times article) instead of just generating some sort of text to fill a space. Another problem not addressed at length in the paper is the lack of logical flow that is a hallmark of GPT-3 output when it extends beyond a few paragraphs. The connection between the paragraphs it is writing can break down, which feels odd to a careful human reader.

Lohn is unsure how many misinformation actors would want to create longform pieces, or exactly how many of those pieces they would really need for their aims. However, I think that one of the true risks of automated misinformation is not just the generation of a flood of crappy tweets, which a person with a certain degree of media literacy will ignore, but the generation of different sorts of content, from tweets to fake "in-depth" articles that pass a reader's smell test, sucking them into a desired worldview and making it easier to feed them more misinformation.

One way toward this goal would also be "narrative manipulation," where GPT-3 was asked to summarize and rewrite an Associated Press story with different slants. However, to me at least, the output is not so much a repurposing of the information in the original story as a newly generated story (often vague) on a similar theme.

Lohn did not have an answer about the potential limitations of GPT-3 for generating quality news copy, and he cautioned that the distinction between the different outlets in this study was judged using CSET's own classifier. For more sophisticated misinformation operations, I imagine that, in addition to "fine-tuning," metrics that measure the quality of an article (such as Deepnews) could act as a sort of filter on the output of the model. OpenAI has already created such a system, called CLIP, to filter the images that its image generator DALL-E produces.
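As a rough illustration of that generate-then-filter idea, here is a sketch of the pattern: produce many candidate texts, score each with a quality model, and keep only the best. Both helper functions are hypothetical stand-ins, not a real Deepnews or OpenAI API:

```python
# Sketch of a generate-then-filter loop. `generate_candidates` and
# `quality_score` are hypothetical stand-ins for a text generator and
# a quality model; neither is a real Deepnews or OpenAI API.
from typing import Callable, List


def filter_generations(prompt: str,
                       generate_candidates: Callable[[str, int], List[str]],
                       quality_score: Callable[[str], float],
                       n_candidates: int = 20,
                       threshold: float = 0.7) -> List[str]:
    """Generate n_candidates texts for a prompt and keep those whose
    quality score clears the threshold, best first."""
    candidates = generate_candidates(prompt, n_candidates)
    scored = [(quality_score(text), text) for text in candidates]
    scored.sort(reverse=True)  # highest-scoring candidates first
    return [text for score, text in scored if score >= threshold]
```

The same loop works whether the scorer is judging journalistic quality, stylistic fit or anything else a classifier can be trained to measure, which is what makes it attractive to both defenders and bad actors.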

Another issue Lohn sees with GPT-3 is the errors it makes when its training data leaves it without a full understanding of the task in front of it. The data was compiled before the pandemic, for example, so the model does not have a firm grasp of "COVID." Lohn also speculates that when the model is presented with a prompt it does not immediately categorize as a "news" context, it may produce text that is more fictional, including factually wrong information. Exactly how those categorizations happen remains unclear.

“I tried to get it to do super verifiable facts, science stuff, and it could. It could continue on with the three laws of thermodynamics, no problem. But when I tried to give it word problems to solve, for some of them it would do okay and for some of them it would frame the problem right and then get the answer wrong,” Lohn said.

What happens now?

The Georgetown study is one of the first in-depth looks we will have at how machines can start shaping the information we see and read, and I imagine we will see other models put through their paces on similar tests of the different "narrative" tasks.

But the other question about GPT-3 and misinformation is not just what is possible now, but who is going to be able to use these abilities going forward.

Lohn and his colleagues' work focuses largely on the ability of nation-states, such as U.S. adversaries China and Russia, to use GPT-3 for misinformation. They would have the resources to develop something, fine-tune it to certain needs, and hire people to put it into action for the tasks that still need human intervention.

But the kind of technology we are getting a glimpse of now is also going to become much more available. Microsoft bought exclusive access to GPT-3 and recently unveiled its first features built on it. More commercial applications are on the way, and Lohn believes it will be difficult to put filters on exactly what sort of content people create once they have access to the technology.

He also cautions that open-source knowledge of large language models is likely to become available soon to those who want it. Training such a large model is currently out of reach for most because it requires technologically sophisticated parallelization: splitting the model and its training across many machines. However, Chinese researchers say they will open up their tools for that splitting.
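To make that splitting concrete, here is a toy PyTorch sketch that places different layers of one model on different GPUs, so a model too large for a single device can still run. Real large-language-model training relies on far more elaborate tensor and pipeline parallelism (the kind of tooling Lohn is talking about); this only shows the basic idea:

```python
# Toy illustration of model splitting: the first half of the network
# lives on GPU 0, the second half on GPU 1, and activations are shipped
# between devices at the split point. Requires a machine with two GPUs.
import torch
import torch.nn as nn


class TwoDeviceMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.part2 = nn.Sequential(nn.Linear(4096, 1024)).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # Move the intermediate activations to the second device.
        return self.part2(x.to("cuda:1"))


model = TwoDeviceMLP()
out = model(torch.randn(8, 1024))
```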

"So if you've got the model and the parallelization all open source, after a couple of months or years people make it easy to use. And then anybody, well, not quite anybody, but almost anybody, could be able to play with it," Lohn said.

The seeming inevitability of these tools casts a harsh spotlight on the infrastructure we currently have in place to share information, and on whether it needs changes to avoid helping create harm.

Text generation means that the bad guys won't just write a few random messages and A/B test which ones get more play. They will create thousands of messages and test all of them until they are optimized for impact. It is up to us to use all the tools we can, from models that can spot AI-generated text to metrics for journalistic quality, to limit that impact. At the core is rethinking an internet where "relevance" is built on engagement and virality.
