The First AI Worm Morris II: Exploiting Generative AI Ecosystems with Adversarial Self-Replicating Prompts
This paper introduces Morris II, a worm designed to target Generative AI (GenAI) ecosystems by using adversarial self-replicating prompts. The study demonstrates that attackers can insert prompts into inputs that, when processed by GenAI models, prompt the model to replicate the input as output (replication) and engage in malicious activities (payload). These inputs also compel the agent to deliver them to new agents by exploiting the connectivity within the GenAI ecosystem. Morris II is tested against GenAI-powered email assistants for spamming and exfiltrating personal data, under black-box and white-box accesses, using text and images as input data. The worm is evaluated against three different GenAI models (Gemini Pro, ChatGPT 4.0, and LLaVA), considering factors such as propagation rate, replication, and malicious activity.
Introduction:
Generative Artificial Intelligence (GenAI) stands at the forefront of artificial intelligence advancements, revolutionizing content creation through its autonomous generation capabilities. Powered by sophisticated machine learning methodologies, often relying on deep neural networks, GenAI can process and produce an extensive range of content forms, including text, images, audio, and videos. This versatility has propelled its integration across diverse industries, from creative arts to chatbots and finance. Companies are keen to harness GenAI’s ability to create realistic and contextually relevant outputs, leading to its seamless integration into existing products and platforms. This integration aims to automate content generation, reduce user interactions, and streamline complex tasks, enhancing efficiency and user experience.
The widespread adoption of GenAI has given rise to ecosystems of GenAI-powered agents, both semi and fully autonomous. These agents interact with remote or local GenAI services, acquiring advanced AI capabilities crucial for contextual understanding and decision-making with minimal or no user intervention. This evolution presents new challenges, particularly in ensuring the security and responsible integration of GenAI capabilities in real-world scenarios.
Security risks associated with GenAI-powered agents have become a growing concern. Various research studies have highlighted potential attack vectors targeting GenAI models, such as dialog poisoning, jailbreaking, and privacy-leakage attacks. Dialog poisoning attacks aim to manipulate GenAI models to orchestrate phishing and spread false information. Jailbreaking attacks seek to bypass built-in safeguards, while privacy-leakage attacks intend to expose training data or prompts. Understanding these risks is crucial as GenAI becomes more deeply integrated into everyday applications, transforming personal assistants and email applications into interconnected networks of semi or fully automated GenAI-powered agents.
In response to these challenges, researchers have developed Morris II, a zero-click worm designed to target GenAI ecosystems. Named after the original Morris Worm, which appeared 36 years ago, Morris II replicates by exploiting GenAI services used by GenAI-powered agents, propagating to new agents within the ecosystem. This worm can be used for various malicious activities, including spamming users, spreading propaganda, exfiltrating personal data, and executing phishing attacks. Additionally, researchers have explored the concept of adversarial self-replicating prompts, which behave as worms within GenAI ecosystems. These prompts trigger GenAI models to output malicious content, replicate themselves, and propagate to new hosts, posing significant security risks to GenAI-powered applications.
Addressing these security risks requires a comprehensive understanding of the GenAI ecosystem’s vulnerabilities and the development of robust security measures. As GenAI continues to evolve and integrate into various industries, ensuring the security and responsible deployment of GenAI-powered agents will be paramount. Collaborative efforts between researchers, developers, and policymakers will be essential in mitigating these risks and fostering the safe and ethical integration of GenAI capabilities into our daily lives.
About this AI worm:
Computer worms are a type of malware that spreads independently across computer networks, exploiting vulnerabilities in operating systems, network protocols, or applications. Unlike viruses, worms do not need a host program to attach themselves to. Once a computer is infected, the worm can create copies of itself and distribute them to other connected systems, rapidly expanding its reach. Worms can carry malicious payloads, such as deleting files, encrypting files, stealing sensitive information, or performing DoS attacks. Their ability to autonomously spread makes them challenging to contain.
The evolution of cyber threats has been shaped by computer worms, dating back to the Creeper worm in the 1970s, the first instance of self-replicating malware. The Morris Worm in 1988 highlighted the potential for widespread damage. Subsequent decades saw the rise of sophisticated worms targeting various systems, such as the ILOVEYOU worm in 2000, the Stuxnet worm in 2010, the Mirai worm in 2016, and the WannaCry worm in 2017, causing significant financial losses and affecting PCs, servers, laptops, IoT devices, and industrial control systems.
Recent research has focused on attacks against GenAI models, investigating attack vectors, outcomes of attacks, and types of inputs that can be exploited. These studies have shown methods to inject prompts directly and indirectly into GenAI models, jailbreak the models, leak training data or prompts, and poison dialogues with users. Our work introduces the concept of a GenAI Worm, specifically Morris II, the first malware for GenAI-powered applications and ecosystems. This differs from previous studies that mainly focused on security and privacy risks associated with GenAI models.
you will get to know more aboout this in research paper:https://arxiv.org/pdf/2403.02817.pdf