Discrete Prior-based Temporal-coherent Content Prediction for Blind Face Video Restoration

📅 2025-01-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address spatiotemporal inconsistency and identity instability in blind face video restoration caused by complex, unknown degradations, this paper proposes DP-TempCoh, a discrete-prior-driven, spatiotemporal-aware restoration framework. The method jointly optimizes blind degradation modeling and content synthesis in an end-to-end manner. Key contributions include: (1) a discrete visual and motion prior modeling mechanism; (2) a spatiotemporal-aware content prediction Transformer that integrates discrete priors to guide high-fidelity detail generation; and (3) a motion statistics modulation module that uses cross-frame mean and variance to align generated content with the temporal distribution of real videos. Experiments on both synthetic and real-world degraded datasets are reported to show state-of-the-art performance on PSNR, LPIPS, and face identity consistency, together with improved temporal coherence and perceptual realism.

📝 Abstract
Blind face video restoration aims to restore high-fidelity details from videos subjected to complex and unknown degradations. The task is challenging because a model must handle temporal heterogeneity while keeping face attributes stable. In this paper, we introduce a Discrete Prior-based Temporal-Coherent content prediction transformer, referred to as DP-TempCoh, to address this challenge. Specifically, we incorporate a spatial-temporal-aware content prediction module to synthesize high-quality content from discrete visual priors, conditioned on degraded video tokens. To further enhance the temporal coherence of the predicted content, a motion statistics modulation module adjusts the content based on discrete motion priors, in terms of cross-frame mean and variance. As a result, the statistics of the predicted content match those of real videos over time. Through extensive experiments, we verify the effectiveness of the design elements and demonstrate the superior performance of DP-TempCoh in restoring both synthetically and naturally degraded videos.
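As a rough illustration of what "synthesizing content from discrete visual priors" can mean, the sketch below maps each degraded token to an entry of a small codebook. It is a deliberate simplification: the abstract describes a transformer that predicts content conditioned on degraded video tokens, whereas this example substitutes a plain nearest-neighbour lookup. The names `nearest_code` and `restore_tokens`, the toy codebook, and the 2-D token size are all assumptions made for illustration, not details from the paper.

```python
def nearest_code(token, codebook):
    """Return the index of the codebook entry closest to `token` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(token, codebook[i]))


def restore_tokens(degraded_tokens, codebook):
    """Replace every degraded token with its nearest codebook entry.

    Hypothetical stand-in for a learned predictor: a real model would
    predict the code indices from spatial and temporal context instead.
    """
    indices = [nearest_code(tok, codebook) for tok in degraded_tokens]
    return indices, [codebook[i] for i in indices]


# Toy codebook of "clean" 2-D feature vectors (assumed values).
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
# Noisy tokens standing in for features of a degraded video.
degraded = [[0.1, -0.2], [0.8, 1.3], [0.2, 1.9]]

indices, restored = restore_tokens(degraded, codebook)
```

Because the output is always assembled from codebook entries, the restored features stay within the set of clean patterns the prior encodes, which is the general appeal of a discrete prior.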
Problem

Research questions and friction points this paper is trying to address.

Video Restoration
Facial Detail Recovery
Temporal Coherence
Innovation

Methods, ideas, or system contributions that make the work stand out.

DP-TempCoh
Temporal-Spatial Processing
Motion Statistics Modulation
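
The motion statistics modulation idea (adjusting content by cross-frame mean and variance) can be sketched as a normalise-then-rescale step along the time axis. This is a minimal sketch under stated assumptions, not the paper's module: the function name `modulate_over_time`, the per-channel target statistics, and the list-of-frames layout are hypothetical, and in the paper the target statistics are derived from discrete motion priors rather than passed in by hand.

```python
from statistics import fmean, pstdev


def modulate_over_time(features, target_mean, target_std, eps=1e-6):
    """Match each channel's cross-frame mean and standard deviation to targets.

    features: list of T frames, each a list of C channel values.
    target_mean, target_std: length-C lists (assumed inputs; a stand-in
    for statistics obtained from a motion prior).
    """
    num_channels = len(features[0])
    out = [[0.0] * num_channels for _ in features]
    for c in range(num_channels):
        series = [frame[c] for frame in features]
        mu = fmean(series)        # cross-frame mean of this channel
        sigma = pstdev(series)    # cross-frame standard deviation
        for t, value in enumerate(series):
            normalised = (value - mu) / (sigma + eps)
            out[t][c] = normalised * target_std[c] + target_mean[c]
    return out


# Four frames, two channels; channel 0 fluctuates more than the target allows.
frames = [[1.0, 10.0], [2.0, 10.5], [3.0, 9.5], [6.0, 10.0]]
smoothed = modulate_over_time(frames, target_mean=[0.0, 1.0], target_std=[1.0, 0.5])
```

After the call, each channel of `smoothed` has (up to the small `eps` term) the requested mean and standard deviation across frames, which mirrors the stated goal of making the statistics of predicted content match those of real videos over time.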
Lianxin Xie
School of Computer Science and Engineering, South China University of Technology
Bingbing Zheng
School of Computer Science and Engineering, South China University of Technology
Wen Xue
School of Computer Science and Engineering, South China University of Technology
Yunfei Zhang
School of Computer Science and Engineering, South China University of Technology
Le Jiang
School of Computer Science and Engineering, South China University of Technology
Ruotao Xu
South China University of Technology
Image processing, Sparse coding, Machine learning
Si Wu
School of Computer Science and Engineering, South China University of Technology; Institute of Super Robotics (Huangpu)
Hau-San Wong
Professor, Department of Computer Science, City University of Hong Kong
Artificial Intelligence, Machine learning, Data mining, Bioinformatics