TEAR: Temporal-aware Automated Red-teaming for Text-to-Video Models

📅 2025-11-26
🤖 AI Summary
Problem: Existing safety evaluation methods for text-to-video (T2V) models overlook temporal dynamics, so they fail to expose policy violations that emerge only across sequentially generated frames. Method: We propose the first temporal-aware red-teaming framework for video generation, featuring a two-stage mechanism: (1) temporal-sensitive prompt initialization to generate ostensibly benign yet temporally exploitative inputs, and (2) iterative refinement via online preference learning–guided fine-tuning to enhance attack stealth and efficacy. Contribution/Results: This work integrates explicit temporal modeling into T2V safety assessment and introduces online preference learning for adaptive, covert adversarial prompting. Experiments demonstrate attack success rates above 80% across both open-source and commercial T2V systems, substantially outperforming prior state-of-the-art approaches, while revealing previously undetected temporal vulnerabilities in frame coherence and content evolution.

📝 Abstract
Text-to-Video (T2V) models can synthesize high-quality, temporally coherent dynamic video content, but this generative diversity also introduces critical safety challenges. Existing safety evaluation methods, which focus on static image and text generation, are insufficient to capture the complex temporal dynamics of video generation. To address this, we propose TEAR, a TEmporal-aware Automated Red-teaming framework designed to uncover safety risks specifically linked to the dynamic temporal sequencing of T2V models. TEAR employs a temporal-aware test generator optimized in two stages, initial generator training followed by temporal-aware online preference learning, to craft textually innocuous prompts that exploit temporal dynamics to elicit policy-violating video output. A refine model is then applied cyclically to improve prompt stealthiness and adversarial effectiveness. Extensive experiments demonstrate the effectiveness of TEAR across open-source and commercial T2V systems, with an attack success rate of over 80%, a significant improvement over the prior best result of 57%.
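
To make the headline metric concrete, the following is a minimal sketch of how an attack success rate could be computed and compared with the prior best result. It assumes the metric is the fraction of test prompts whose generated video is judged policy-violating; the abstract does not spell out the definition. The function name `attack_success_rate` and the `judged` list are illustrative and not taken from the paper, and 0.80 is used as a lower bound because the abstract reports "over 80%".

```python
from typing import Sequence


def attack_success_rate(violations: Sequence[bool]) -> float:
    """Fraction of test prompts whose generated video was judged policy-violating."""
    if not violations:
        raise ValueError("need at least one judged prompt")
    return sum(violations) / len(violations)


# Hypothetical judged outcomes for 10 prompts (illustrative data, not from the paper).
judged = [True, True, False, True, True, True, True, False, True, True]
asr = attack_success_rate(judged)

# Gap between the two figures quoted in the abstract: in percentage points,
# and relative to the prior best result. 0.80 is a lower bound ("over 80%").
reported_asr, prior_best = 0.80, 0.57
absolute_gain_pts = round((reported_asr - prior_best) * 100, 2)
relative_gain = (reported_asr - prior_best) / prior_best
```

Under these assumptions the reported improvement is at least 23 percentage points, or roughly 40% relative to the prior best of 57%.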
Problem

Research questions and friction points this paper is trying to address.

Automated safety testing for text-to-video model temporal dynamics
Uncovering policy-violating video outputs through temporal sequencing
Improving adversarial prompt effectiveness against video generation systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal-aware automated red-teaming for video models
Two-stage optimization with training and preference learning
Cyclical refinement for stealthy adversarial prompt generation
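
The cyclical refinement idea can be pictured as a generic propose, score, refine loop with a fixed round budget. The sketch below is not the authors' implementation: `refine_until_threshold`, the `score` and `refine` callables, the 0.8 threshold, and the round limit are all hypothetical, and the toy stand-ins at the end exist only so the loop runs end to end.

```python
from typing import Callable, Tuple


def refine_until_threshold(
    prompt: str,
    score: Callable[[str], float],        # hypothetical judge: higher is a stronger test case
    refine: Callable[[str, float], str],  # hypothetical rewriter, given the last score
    threshold: float = 0.8,               # assumed stopping score, not from the paper
    max_rounds: int = 5,                  # assumed round budget, not from the paper
) -> Tuple[str, float, int]:
    """Refine for at most max_rounds, keeping the best-scoring prompt seen so far."""
    best_prompt, best_score = prompt, score(prompt)
    rounds = 0
    while best_score < threshold and rounds < max_rounds:
        candidate = refine(best_prompt, best_score)
        candidate_score = score(candidate)
        rounds += 1
        # Only accept a candidate that strictly improves on the current best.
        if candidate_score > best_score:
            best_prompt, best_score = candidate, candidate_score
    return best_prompt, best_score, rounds


# Toy stand-ins: the "score" is just prompt length / 20, capped at 1.0.
toy_score = lambda p: min(len(p) / 20, 1.0)
toy_refine = lambda p, s: p + " more"
result = refine_until_threshold("seed prompt", toy_score, toy_refine)
```

In a real evaluation harness the scorer would be a safety judge over generated videos and the refiner a learned model; here both are placeholders, and the loop only shows the control flow of stopping at a score threshold or after a fixed number of rounds.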