Motion-Aware Generative Frame Interpolation

📅 2025-01-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing generative frame interpolation methods suffer from temporal distortions in complex motion scenarios, primarily due to weak motion modeling capacity and insufficient precision in characterizing inter-frame dynamics. To address this, we propose an explicit motion-guided framework: we first identify that intermediate-layer optical flows from a pre-trained flow estimator serve as effective task-specific motion priors; then, we design a dual-path injection paradigm—simultaneously embedding motion cues into both latent and feature spaces—to enable cross-level motion representation fusion. Our method integrates optical flow estimation, differentiable feature warping, multi-scale alignment, and joint training on both real-world and animated video domains. Extensive experiments demonstrate state-of-the-art performance across multiple benchmarks, with significant improvements in PSNR and SSIM, alongside markedly enhanced visual fidelity and motion coherence.

📝 Abstract
Generative frame interpolation, empowered by large-scale pre-trained video generation models, has demonstrated remarkable advantages in complex scenes. However, existing methods heavily rely on the generative model to independently infer the correspondences between input frames, an ability that is inadequately developed during pre-training. In this work, we propose a novel framework, termed Motion-aware Generative frame interpolation (MoG), to significantly enhance the model's motion awareness by integrating explicit motion guidance. Specifically, we investigate two key questions: what can serve as an effective motion guidance, and how we can seamlessly embed this guidance into the generative model. For the first question, we reveal that the intermediate flow from flow-based interpolation models can efficiently provide task-oriented motion guidance. Regarding the second, we first obtain guidance-based representations of intermediate frames by warping the input frames' representations using the guidance, and then integrate them into the model at both latent and feature levels. To demonstrate the versatility of our method, we train MoG on both real-world and animation datasets. Comprehensive evaluations show that MoG significantly outperforms existing methods in both domains, achieving superior video quality and improved fidelity.
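The warping step the abstract describes, obtaining guidance-based representations of intermediate frames by displacing the input frames' representations with an estimated flow, is at its core a backward bilinear warp. The sketch below is an illustrative NumPy implementation of that generic operation, not the paper's actual code; the function name and array layout are assumptions for the example.

```python
import numpy as np

def backward_warp(feat, flow):
    """Bilinearly sample `feat` (C, H, W) at locations displaced by `flow` (2, H, W).

    flow[0] holds horizontal (x) displacements, flow[1] vertical (y).
    Toy stand-in for flow-guided warping of a frame's feature/latent map.
    """
    C, H, W = feat.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    # Source coordinates: where each output position samples the input map.
    sx = np.clip(xs + flow[0], 0, W - 1)
    sy = np.clip(ys + flow[1], 0, H - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = sx - x0, sy - y0
    out = np.empty((C, H, W), dtype=np.float64)
    for c in range(C):
        f = feat[c]
        # Standard bilinear interpolation over the four neighboring samples.
        out[c] = ((1 - wx) * (1 - wy) * f[y0, x0]
                  + wx * (1 - wy) * f[y0, x1]
                  + (1 - wx) * wy * f[y1, x0]
                  + wx * wy * f[y1, x1])
    return out
```

In the paper's setting, `feat` would be a latent or intermediate feature map of an input frame and `flow` the intermediate flow produced by a flow-based interpolation model; the warped result then serves as the motion-guided representation injected into the generator. This simple loop is for clarity only; practical systems use a differentiable, batched sampler (e.g. a grid-sample operator) instead.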
Problem

Research questions and friction points this paper is trying to address.

Frame Interpolation
Complex Scenes
Motion Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Motion-aware Generation
Frame Interpolation
Video Clarity Enhancement
Guozhen Zhang
Nanjing University
Video Frame Interpolation
Yuhan Zhu
Nanjing University, Shanghai AI Lab
Computer Vision, Vision-Language Models, Video Understanding
Yutao Cui
Tencent Hunyuan
Generative Models, Multi-Modal, Object Tracking
Xiaotong Zhao
Platform and Content Group (PCG), Tencent
Kai Ma
Platform and Content Group (PCG), Tencent
Limin Wang
State Key Laboratory for Novel Software Technology, Nanjing University; Shanghai AI Lab