RunawayEvil: Jailbreaking the Image-to-Video Generative Models

📅 2025-12-07
🤖 AI Summary
This work exposes jailbreak vulnerabilities in image-to-video (I2V) generative models and proposes RunawayEvil, presented as the first multimodal jailbreak framework for I2V models with dynamic, self-evolving capability. The framework follows a three-tier "Strategy-Tactic-Action" architecture that combines reinforcement learning for strategy customization, LLM-driven strategy exploration, coordinated generation of text jailbreak instructions and image tampering guidelines, and automated execution and evaluation of the resulting attacks, forming a closed loop that requires no human intervention. Its key contribution is bringing self-evolution mechanisms into I2V security evaluation, substantially improving attack adaptability and cross-dataset generalization. In evaluations on I2V models including Open-Sora 2.0 and CogVideoX, the framework achieves state-of-the-art attack success rates; on COCO2017 it outperforms existing methods by 58.5% to 79%.

📝 Abstract
Image-to-Video (I2V) generation synthesizes dynamic visual content from image and text inputs, providing significant creative control. However, the security of such multimodal systems, particularly their vulnerability to jailbreak attacks, remains critically underexplored. To bridge this gap, we propose RunawayEvil, the first multimodal jailbreak framework for I2V models with dynamic evolutionary capability. Built on a "Strategy-Tactic-Action" paradigm, our framework exhibits self-amplifying attack through three core components: (1) Strategy-Aware Command Unit that enables the attack to self-evolve its strategies through reinforcement learning-driven strategy customization and LLM-based strategy exploration; (2) Multimodal Tactical Planning Unit that generates coordinated text jailbreak instructions and image tampering guidelines based on the selected strategies; (3) Tactical Action Unit that executes and evaluates the multimodal coordinated attacks. This self-evolving architecture allows the framework to continuously adapt and intensify its attack strategies without human intervention. Extensive experiments demonstrate RunawayEvil achieves state-of-the-art attack success rates on commercial I2V models, such as Open-Sora 2.0 and CogVideoX. Specifically, RunawayEvil outperforms existing methods by 58.5 to 79 percent on COCO2017. This work provides a critical tool for vulnerability analysis of I2V models, thereby laying a foundation for more robust video generation systems.
Problem

Research questions and friction points this paper is trying to address.

Proposes a multimodal jailbreak framework for I2V models
Addresses security vulnerabilities in image-to-video generation systems
Enables self-evolving attacks without human intervention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-evolving jailbreak framework with dynamic evolutionary capability
Strategy-Tactic-Action paradigm enabling self-amplifying multimodal attacks
Reinforcement learning and LLM-driven strategy customization for continuous adaptation
Authors

Songping Wang (PRLab, Nanjing University)
Rufan Qian (PRLab, Nanjing University)
Yueming Lyu (PRLab, Nanjing University)
Qinglong Liu (PRLab, Nanjing University)
Linzhuang Zou (PRLab, Nanjing University)
Jie Qin (Professor, Nanjing University of Aeronautics and Astronautics). Research interests: Computer Vision, Machine Learning, Pattern Recognition
Songhua Liu (Shanghai Jiao Tong University). Research interests: Computer Vision, Machine Learning
Caifeng Shan (Philips Research). Research interests: Computer Vision, Pattern Recognition, Machine Learning, Image/Video Analysis