DVFace: Spatio-Temporal Dual-Prior Diffusion for Video Face Restoration

📅 2026-04-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a central challenge in video face restoration: achieving fine-grained detail fidelity, identity preservation, and temporal coherence at the same time while remaining computationally efficient. The authors propose DVFace, a one-step diffusion framework that uses a spatio-temporal dual-codebook design to extract complementary spatial and temporal facial priors from degraded videos. An asymmetric spatio-temporal fusion module then injects these priors into the diffusion backbone according to their distinct roles, all within a single generation step. Experiments on various benchmarks show that DVFace delivers better restoration quality, temporal consistency, and identity preservation than recent methods, while one-step sampling keeps inference efficient.

📝 Abstract

Video face restoration aims to enhance degraded face videos into high-quality results with realistic facial details, stable identity, and temporal coherence. Recent diffusion-based methods have brought strong generative priors to restoration and enabled more realistic detail synthesis. However, existing approaches for face videos still rely heavily on generic diffusion priors and multi-step sampling, which limit both facial adaptation and inference efficiency. These limitations motivate the use of one-step diffusion for video face restoration, yet achieving faithful facial recovery alongside temporally stable outputs remains challenging. In this paper, we propose DVFace, a one-step diffusion framework for real-world video face restoration. Specifically, we introduce a spatio-temporal dual-codebook design to extract complementary spatial and temporal facial priors from degraded videos. We further propose an asymmetric spatio-temporal fusion module to inject these priors into the diffusion backbone according to their distinct roles. Evaluation on various benchmarks shows that DVFace delivers superior restoration quality, temporal consistency, and identity preservation compared to recent methods. Code: https://github.com/zhengchen1999/DVFace.
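
To make the dual-codebook idea concrete, the sketch below shows generic nearest-neighbour vector quantization against two separate codebooks. It is not the DVFace implementation: the abstract does not describe how the codebooks are built or queried. The names `quantize` and `dual_prior_lookup`, the use of one feature vector per frame, and the choice of frame-to-frame differences as the temporal signal are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Map each feature vector to its nearest codebook entry.

    Returns the selected indices and the matching codebook vectors.
    """
    indices = [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]
    return indices, [codebook[i] for i in indices]


def dual_prior_lookup(
    frames: Sequence[Vector],
    spatial_codebook: Sequence[Vector],
    temporal_codebook: Sequence[Vector],
) -> Tuple[List[int], List[int]]:
    """Hypothetical dual lookup (an assumption, not the paper's design).

    Spatial prior: quantize each frame's feature vector.
    Temporal prior: quantize the difference between consecutive frames.
    """
    spatial_idx, _ = quantize(frames, spatial_codebook)
    # Frame-to-frame differences stand in for temporal features here.
    diffs = [
        [b - a for a, b in zip(prev, cur)]
        for prev, cur in zip(frames, frames[1:])
    ]
    temporal_idx, _ = quantize(diffs, temporal_codebook)
    return spatial_idx, temporal_idx


if __name__ == "__main__":
    frames = [[0.1, 0.0], [0.9, 1.0], [1.0, 1.1]]
    spatial_codebook = [[0.0, 0.0], [1.0, 1.0]]
    temporal_codebook = [[0.0, 0.0], [0.5, 0.5]]
    print(dual_prior_lookup(frames, spatial_codebook, temporal_codebook))
```

Keeping the two lookups separate mirrors the abstract's point that spatial and temporal priors play distinct roles. How those priors are then fused asymmetrically into the diffusion backbone is specific to the paper and is not modelled here.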

Problem

Research questions and friction points this paper is trying to address.

video face restoration

temporal coherence

identity preservation

one-step diffusion

facial detail synthesis

Innovation

Methods, ideas, or system contributions that make the work stand out.

one-step diffusion

spatio-temporal dual-codebook

asymmetric fusion