Motion-Aware Generative Frame Interpolation

📅 2025-01-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing generative frame interpolation methods suffer from temporal distortions in complex motion scenarios, primarily due to weak motion modeling capacity and insufficient precision in characterizing inter-frame dynamics. To address this, we propose an explicit motion-guided framework: we first identify that intermediate-layer optical flows from a pre-trained flow estimator serve as effective task-specific motion priors; then, we design a dual-path injection paradigm—simultaneously embedding motion cues into both latent and feature spaces—to enable cross-level motion representation fusion. Our method integrates optical flow estimation, differentiable feature warping, multi-scale alignment, and joint training on both real-world and animated video domains. Extensive experiments demonstrate state-of-the-art performance across multiple benchmarks, with significant improvements in PSNR and SSIM, alongside markedly enhanced visual fidelity and motion coherence.

📝 Abstract
Generative frame interpolation, empowered by large-scale pre-trained video generation models, has demonstrated remarkable advantages in complex scenes. However, existing methods heavily rely on the generative model to independently infer the correspondences between input frames, an ability that is inadequately developed during pre-training. In this work, we propose a novel framework, termed Motion-aware Generative frame interpolation (MoG), to significantly enhance the model's motion awareness by integrating explicit motion guidance. Specifically, we investigate two key questions: what can serve as an effective motion guidance, and how we can seamlessly embed this guidance into the generative model. For the first question, we reveal that the intermediate flow from flow-based interpolation models can efficiently provide task-oriented motion guidance. Regarding the second, we first obtain guidance-based representations of intermediate frames by warping the input frames' representations using the guidance, and then integrate them into the model at both latent and feature levels. To demonstrate the versatility of our method, we train MoG on both real-world and animation datasets. Comprehensive evaluations show that MoG significantly outperforms existing methods in both domains, achieving superior video quality and improved fidelity.
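The warping step the abstract describes, obtaining guidance-based representations of intermediate frames by displacing the input frames' representations with an estimated flow, is at its core a backward bilinear warp. The sketch below is an illustrative NumPy implementation of that generic operation, not the paper's actual code; the function name and array layout are assumptions for the example.

```python
import numpy as np

def backward_warp(feat, flow):
    """Bilinearly sample `feat` (C, H, W) at locations displaced by `flow` (2, H, W).

    flow[0] holds horizontal (x) displacements, flow[1] vertical (y).
    Toy stand-in for flow-guided warping of a frame's feature/latent map.
    """
    C, H, W = feat.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    # Source coordinates: where each output position samples the input map.
    sx = np.clip(xs + flow[0], 0, W - 1)
    sy = np.clip(ys + flow[1], 0, H - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = sx - x0, sy - y0
    out = np.empty((C, H, W), dtype=np.float64)
    for c in range(C):
        f = feat[c]
        # Standard bilinear interpolation over the four neighboring samples.
        out[c] = ((1 - wx) * (1 - wy) * f[y0, x0]
                  + wx * (1 - wy) * f[y0, x1]
                  + (1 - wx) * wy * f[y1, x0]
                  + wx * wy * f[y1, x1])
    return out
```

In the paper's setting, `feat` would be a latent or intermediate feature map of an input frame and `flow` the intermediate flow produced by a flow-based interpolation model; the warped result then serves as the motion-guided representation injected into the generator. This simple loop is for clarity only; practical systems use a differentiable, batched sampler (e.g. a grid-sample operator) instead.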
Problem

Research questions and friction points this paper is trying to address.

Frame Interpolation
Complex Scenes
Motion Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Motion-aware Generation
Frame Interpolation
Video Clarity Enhancement
Guozhen Zhang
Nanjing University
Video Frame Interpolation
Yuhan Zhu
Nanjing University, Shanghai AI Lab
Computer Vision, Vision-Language Models, Video Understanding
Yutao Cui
Tencent Hunyuan
Generative Models, Multi-Modal, Object Tracking
Xiaotong Zhao
Platform and Content Group (PCG), Tencent
Kai Ma
Platform and Content Group (PCG), Tencent
Limin Wang
State Key Laboratory for Novel Software Technology, Nanjing University; Shanghai AI Lab