Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications

📅 2024-03-05

🏛️ arXiv.org

📈 Citations: 11

✨ Influential: 1

🤖 AI Summary

This work exposes a zero-click worm threat against generative AI applications that communicate through Retrieval-Augmented Generation (RAG). An adversary crafts a self-replicating adversarial prompt that triggers a cascade of indirect prompt injections, a chain reaction the authors call Morris-II. Each affected application performs malicious actions and compromises the RAG of further applications, so confidential user data is extracted over multiple hops without any user interaction. As a countermeasure, the authors propose Virtual Donkey, a guardrail intended to detect and prevent Morris-II propagation with minimal latency. Evaluated in an ecosystem of GenAI-powered email assistants, it attains a true positive rate of 1.0 and a false positive rate of 0.015, and it remains robust against out-of-distribution (OOD) worms built from unseen jailbreaking commands, a different email dataset, and other worm use cases.

📝 Abstract

In this paper, we show that when the communication between GenAI-powered applications relies on RAG-based inference, an attacker can initiate a computer worm-like chain reaction that we call Morris-II. This is done by crafting an adversarial self-replicating prompt that triggers a cascade of indirect prompt injections within the ecosystem and forces each affected application to perform malicious actions and compromise the RAG of additional applications. We evaluate the performance of the worm in creating a chain of confidential user data extraction within a GenAI ecosystem of GenAI-powered email assistants and analyze how the performance of the worm is affected by the size of the context, the adversarial self-replicating prompt used, the type and size of the embedding algorithm employed, and the number of hops in the propagation. Finally, we introduce the Virtual Donkey, a guardrail intended to detect and prevent the propagation of Morris-II with minimal latency, high accuracy, and a low false-positive rate. We evaluate the guardrail's performance and show that it yields a perfect true-positive rate of 1.0 with a false-positive rate of 0.015, and is robust against out-of-distribution worms, consisting of unseen jailbreaking commands, a different email dataset, and various worm use cases.
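
To make the reported detection metrics concrete, the sketch below shows how a true-positive rate and a false-positive rate are computed from a detector's confusion counts. The counts are hypothetical, chosen only so that the arithmetic reproduces the reported 1.0 and 0.015; they are not the paper's evaluation data. The `ConfusionCounts` class and both functions are illustrative names, not part of Virtual Donkey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary detector (worm = positive class)."""
    true_positives: int   # worm emails flagged as worms
    false_negatives: int  # worm emails that slipped through
    false_positives: int  # benign emails wrongly flagged
    true_negatives: int   # benign emails correctly passed


def true_positive_rate(c: ConfusionCounts) -> float:
    """TPR = TP / (TP + FN): share of worms that were caught."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def false_positive_rate(c: ConfusionCounts) -> float:
    """FPR = FP / (FP + TN): share of benign emails wrongly flagged."""
    return c.false_positives / (c.false_positives + c.true_negatives)


# Hypothetical counts (assumption): 200 worm emails, 200 benign emails.
counts = ConfusionCounts(
    true_positives=200,
    false_negatives=0,
    false_positives=3,
    true_negatives=197,
)

tpr = true_positive_rate(counts)
fpr = false_positive_rate(counts)
print(f"TPR = {tpr:.3f}, FPR = {fpr:.3f}")
```

Under these assumed counts, a false-positive rate of 0.015 means roughly 3 benign emails in every 200 would be flagged, while a true-positive rate of 1.0 means no worm email in the test set was missed.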

Problem

Research questions and friction points this paper is trying to address.

AI Security

Self-replicating Malware

RAG System Defense

Innovation

Methods, ideas, or system contributions that make the work stand out.

Morris-II Attack

Virtual Donkey Tool

RAG System Security
