Φ-Noise: Training-Free Temporal Video Conditioning via Phase-Based Noise Manipulation

📅 2026-05-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes a training-free motion-control method for video diffusion models, avoiding the additional training and computational overhead that temporal conditioning typically requires. It injects low-frequency phase information from a reference video directly into the diffusion noise latents, which gives control over both the appearance and the dynamics of the generated video without modifying the model architecture or the inference pipeline. The work is described as the first to use a frequency-domain, phase-based mechanism for noise manipulation, which simplifies the control pipeline. Across several applications, the method achieves results that are competitive with, and in some cases superior to, those of more complex conditioning approaches.

📝 Abstract

Latent video diffusion models generate videos by progressively transforming Gaussian noise into realistic samples conditioned on text or visual inputs. However, existing conditioning methods often require additional training and computational overhead. Motivated by recent findings on the importance of frequency components in generative models, we propose a simple, training-free approach for motion-conditioned video generation by injecting low-frequency phase information from a reference video directly into the diffusion noise latents. Our method transfers motion cues without modifying the model architecture or inference pipeline. Using several applications, we demonstrate effective control over both appearance and dynamics in generated videos, while achieving competitive or superior results compared to more complex conditioning approaches.
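To make the idea of injecting low-frequency phase into the noise latents concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the transform, the frequency mask, or how magnitudes are handled. The sketch assumes a 3D FFT over the (time, height, width) axes, a box-shaped low-pass mask, and that the noise keeps its own magnitude spectrum. The function name `inject_low_freq_phase` and the `cutoff` parameter are hypothetical.

```python
import numpy as np


def inject_low_freq_phase(noise, ref_latent, cutoff=0.25):
    """Replace the low-frequency phase of `noise` with that of `ref_latent`.

    Hypothetical helper (not from the paper). Both arrays are assumed to
    share a shape ending in (T, H, W), e.g. (C, T, H, W). `cutoff` is in
    cycles per sample, so 0.5 is the Nyquist frequency.
    """
    if noise.shape != ref_latent.shape:
        raise ValueError("noise and ref_latent must have the same shape")

    axes = (-3, -2, -1)  # assumed: spatio-temporal FFT over (T, H, W)
    noise_hat = np.fft.fftn(noise, axes=axes)
    ref_hat = np.fft.fftn(ref_latent, axes=axes)

    # Box-shaped low-pass mask; symmetric in +/- frequency so the mixed
    # spectrum stays conjugate-symmetric and the output stays real.
    t, h, w = noise.shape[-3:]
    ft = np.abs(np.fft.fftfreq(t))[:, None, None]
    fh = np.abs(np.fft.fftfreq(h))[None, :, None]
    fw = np.abs(np.fft.fftfreq(w))[None, None, :]
    low = (ft <= cutoff) & (fh <= cutoff) & (fw <= cutoff)

    # Reference phase inside the band, original noise phase outside it.
    mixed_phase = np.where(low, np.angle(ref_hat), np.angle(noise_hat))

    # Assumption: keep the noise magnitude so its power spectrum is unchanged.
    mixed_hat = np.abs(noise_hat) * np.exp(1j * mixed_phase)
    return np.fft.ifftn(mixed_hat, axes=axes).real


# Usage with random stand-ins for real latents (shapes are illustrative).
rng = np.random.default_rng(0)
noise = rng.standard_normal((4, 8, 16, 16))       # initial Gaussian noise
ref_latent = rng.standard_normal((4, 8, 16, 16))  # stand-in for an encoded reference video
conditioned_noise = inject_low_freq_phase(noise, ref_latent, cutoff=0.25)
```

In this sketch the conditioned noise would simply take the place of the initial Gaussian sample passed to the sampler, which matches the abstract's point that the architecture and inference pipeline stay untouched. Keeping the noise magnitudes is a design choice made here so that the per-frequency energy of the input remains that of the original noise; whether the paper does the same is not stated in the abstract.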

Problem

Research questions and friction points this paper is trying to address.

video diffusion models

motion conditioning

training-free

temporal video generation

noise manipulation

Innovation

Methods, ideas, or system contributions that make the work stand out.

training-free

phase-based noise manipulation

motion-conditioned video generation

latent diffusion models

frequency components

🔎 Similar Papers

MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion

2024-10-10 · arXiv.org · Citations: 0