BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation

📅 2025-04-23

📈 Citations: 0

✨ Influential: 0

career value

209K/year

🤖 AI Summary

This work identifies a previously overlooked backdoor vulnerability in text-to-video (T2V) generation models, stemming from redundant spatiotemporal information—such as unprompted backgrounds or secondary objects—that attackers can exploit covertly. To address this, we propose the first adversarial backdoor attack framework tailored for T2V models. Our method employs two core strategies: (1) spatiotemporal feature composition encoding and dynamic redundancy element transformation, enabling cross-frame temporal stealthy triggering that evades frame-level spatial moderation; and (2) prompt-aligned adversarial target injection with temporal robustness optimization, ensuring generated videos remain semantically faithful to the input prompt and visually natural. Evaluated on multiple state-of-the-art T2V models, our attack achieves >92% success rate, induces negligible video degradation (FVD increase <1.5), and preserves original model performance—effectively bypassing existing frame-wise content moderation systems.

Technology Category

Application Category

📝 Abstract

Text-to-video (T2V) generative models have rapidly advanced and found widespread applications across fields like entertainment, education, and marketing. However, the adversarial vulnerabilities of these models remain rarely explored. We observe that in T2V generation tasks, the generated videos often contain substantial redundant information not explicitly specified in the text prompts, such as environmental elements, secondary objects, and additional details, providing opportunities for malicious attackers to embed hidden harmful content. Exploiting this inherent redundancy, we introduce BadVideo, the first backdoor attack framework tailored for T2V generation. Our attack focuses on designing target adversarial outputs through two key strategies: (1) Spatio-Temporal Composition, which combines different spatiotemporal features to encode malicious information; (2) Dynamic Element Transformation, which introduces transformations in redundant elements over time to convey malicious information. Based on these strategies, the attacker's malicious target seamlessly integrates with the user's textual instructions, providing high stealthiness. Moreover, by exploiting the temporal dimension of videos, our attack successfully evades traditional content moderation systems that primarily analyze spatial information within individual frames. Extensive experiments demonstrate that BadVideo achieves high attack success rates while preserving original semantics and maintaining excellent performance on clean inputs. Overall, our work reveals the adversarial vulnerability of T2V models, calling attention to potential risks and misuse. Our project page is at https://wrt2000.github.io/BadVideo2025/.

Problem

Research questions and friction points this paper is trying to address.

Explores adversarial vulnerabilities in text-to-video models

Introduces stealthy backdoor attack via spatio-temporal redundancy

Evades content moderation by exploiting temporal video dimensions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Spatio-Temporal Composition for hidden content

Dynamic Element Transformation over time

Exploits video redundancy for stealthy attacks

🔎 Similar Papers

Dormant: Defending against Pose-driven Human Image Animation