RDPO: Real Data Preference Optimization for Physics Consistency Video Generation

📅 2025-06-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Video generation has advanced significantly in visual fidelity, yet physical consistency remains severely lacking. Existing preference-based optimization methods rely either on costly human annotations or on unreliable reward models. To address this, we propose RDPO, a label-free preference optimization framework. RDPO reverse-samples real-world videos through a pre-trained generator to construct preference pairs that are discriminable by physical correctness. Through statistical distinguishability analysis and multi-stage iterative training, it distills physical priors from authentic dynamic data without supervision, enhancing motion coherence and physical plausibility. RDPO is the first method to enable physics-aware preference learning without human annotations, requiring only raw real videos. Extensive evaluations across multiple benchmarks and human studies demonstrate that RDPO substantially improves both the physical reasonableness and temporal consistency of generated videos, validating its effectiveness and generalizability.

📝 Abstract
Video generation techniques have achieved remarkable advancements in visual quality, yet faithfully reproducing real-world physics remains elusive. Preference-based model post-training may improve physical consistency, but it requires costly human-annotated datasets or reward models that are not yet feasible. To address these challenges, we present Real Data Preference Optimization (RDPO), an annotation-free framework that distills physical priors directly from real-world videos. Specifically, RDPO reverse-samples real video sequences with a pre-trained generator to automatically build preference pairs that are statistically distinguishable in terms of physical correctness. A multi-stage iterative training schedule then guides the generator to obey physical laws increasingly well. Benefiting from the dynamic information mined from real videos, RDPO significantly improves the action coherence and physical realism of generated videos. Evaluations on multiple benchmarks and human studies demonstrate that RDPO achieves improvements across multiple dimensions. The source code and demonstrations are available at: https://wwenxu.github.io/RDPO/

Problem

Research questions and friction points this paper is trying to address.

Improving physical consistency in video generation
Eliminating need for costly human-annotated datasets
Enhancing action coherence and physical realism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Annotation-free framework distills physical priors
Reverse-samples real videos for preference pairs
Multi-stage training enhances physical realism
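The pipeline in these bullets can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `generator.invert` / `generator.sample` interface is a hypothetical stand-in for reverse sampling, the scalar per-video log-likelihoods are assumed inputs, and the loss shown is the generic DPO objective that RDPO-style preference training would build on.

```python
import math

def build_preference_pair(generator, real_video, noise_level=0.7):
    """Sketch of RDPO's pair construction (hypothetical API):
    reverse-sample a real clip into the generator's latent space,
    then regenerate it. The real clip serves as the preferred
    ("winner") sample; the regeneration, which may deviate from
    real physics, serves as the dispreferred ("loser") sample."""
    latent = generator.invert(real_video, noise_level)  # assumed method
    regenerated = generator.sample(latent)              # assumed method
    return real_video, regenerated

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO objective on scalar (winner, loser) log-likelihoods,
    regularized against a frozen reference model. Lower loss means the
    policy prefers the winner more strongly than the reference does."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At initialization the policy matches the reference, so the margin is zero and the loss equals log 2; as training pushes the policy toward the real (preferred) clips, the margin grows and the loss falls. The multi-stage schedule would repeat pair construction with the updated generator.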
Wenxu Qian
Fudan University, Shopee Inc.
Chaoyue Wang
Artificial Intelligence Generated Content (AIGC); deep learning, computer vision, adversarial learning
Hou Peng
Shopee Inc.
Zhiyu Tan
Fudan University
Hao Li
Fudan University
Anxiang Zeng
Shopee Inc.