New Study by Global Pulse Highlights Risks of AI-Generated Texts, Creates Fake UN Speeches

Pulse Lab Jakarta
Jun 25 · 3 min read

Automated text generation is being broadly applied in many domains, from marketing to robotics, and is used to create chatbots, product reviews and even poetry. The ability to synthesize text, however, presents serious risks, even as the technology required to build generative models becomes increasingly accessible.

To build the speech generator, Global Pulse researchers first created a training dataset for the machine learning algorithms from English-language transcripts of speeches given by high-level representatives at the UN General Assembly between 1970 and 2015. The goal was to train a language model that could be used to generate text on topics ranging from general issues such as climate change, to the UN Secretary-General’s remarks, to inflammatory and discriminatory speech.

“We used ‘off-the-shelf’ techniques to prove the ease with which such a powerful model can be created. The model is a type of Recurrent Neural Network (RNN) commonly used in situations when you want to predict the next element in a sequence, given previous elements. Being a ‘neural network’ means that it ‘learns’ how to make such predictions; the more examples you can give it the better it becomes,” said Joseph Bullock, AI researcher at UN Global Pulse.
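The mechanism Bullock describes, predicting the next element in a sequence from the previous ones, can be sketched in a few lines. The vocabulary, hidden size and weights below are illustrative placeholders, not the study's actual model: the weights are random rather than learned, so this only shows how a recurrent network carries a hidden state forward and scores the next token.

```python
import numpy as np

np.random.seed(0)

vocab = ["<s>", "climate", "change", "is", "a", "global", "challenge"]
V, H = len(vocab), 16  # vocabulary size, hidden size

# Randomly initialised weights; a trained model would learn these from data.
Wxh = np.random.randn(H, V) * 0.1   # input -> hidden
Whh = np.random.randn(H, H) * 0.1   # hidden -> hidden (the recurrence)
Why = np.random.randn(V, H) * 0.1   # hidden -> output logits

def step(h, token_id):
    """One RNN step: update the hidden state, then score every possible next token."""
    x = np.zeros(V)
    x[token_id] = 1.0                            # one-hot encoding of the input token
    h = np.tanh(Wxh @ x + Whh @ h)               # new hidden state summarises the sequence so far
    logits = Why @ h
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax: probability of each next token
    return h, probs

h = np.zeros(H)
for word in ["<s>", "climate", "change"]:        # feed the sequence seen so far
    h, probs = step(h, vocab.index(word))

print("predicted next word:", vocab[int(probs.argmax())])
```

The more example sequences such a network is trained on, the better its next-token probabilities become, which is the "learning" the quote refers to.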

The AI model was trained in 13 hours at a cost of $7.80 in cloud computing resources. To generate text, the model was seeded with the beginning of a sentence (these ‘seeds’ are highlighted in bold in the examples below).
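The seed-then-generate loop works autoregressively: the model extends the seed one word at a time, feeding each sampled word back in as context. As a minimal, self-contained illustration of that loop, the sketch below uses a tiny bigram model over a toy corpus (the study itself used an RNN trained on UNGA transcripts; the corpus and function names here are invented for this example).

```python
import random
from collections import defaultdict

random.seed(42)

# Toy corpus standing in for the UN General Assembly transcripts (illustration only).
corpus = (
    "climate change is a global challenge . "
    "climate change demands global action . "
    "the assembly demands urgent action ."
).split()

# Record which words follow each word (a bigram model; the sampling
# loop below is the same idea regardless of the underlying model).
nxt = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    nxt[a].append(b)

def generate(seed, length=8):
    """Extend the seed word by word until `length` words or a dead end."""
    out = seed.split()
    while len(out) < length and nxt[out[-1]]:
        out.append(random.choice(nxt[out[-1]]))  # sample a likely next word
    return " ".join(out)

print(generate("climate change"))
```

Because each continuation is sampled from frequencies in the training text, the output echoes the corpus's style, which is why a model trained on formal UN speeches reproduces their cadence.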

The study showed that, for general political topics, the model matched the style and cadence of real UN speeches most of the time. With minimal editing by a person, AI-generated paragraphs of this kind could easily be made indistinguishable from an official speech.

The model performed less accurately, succeeding around half of the time, when producing inflammatory remarks, such as statements about immigration or racism. This can be attributed to the formal register of the dataset, which contains little inflammatory language.

The study was intended to raise awareness about the dangers of AI text generation to peace and political stability, and to suggest recommendations for those in the scientific and policy spheres working to address these challenges.

“With this study, we wanted to bring attention to the availability of AI technology that can be used to spread disinformation, impersonate, or even write hateful and politically inflammatory speech. As a society, we need to establish safeguards against these threats at multiple levels, starting with increased awareness of the risks,” said Dr. Miguel Luengo-Oroz, chief data scientist at UN Global Pulse. “We need to develop technological solutions that can assess the veracity of human communication, and we need to create laws and regulations to prevent threats to human rights.”

The study will be presented at the AI for Social Good workshop on 15 June 2019 during the International Conference on Machine Learning (ICML) taking place in California, USA.

Download the study here.


Pulse Lab Jakarta

Harnessing data for development, translating insights for social innovation
