Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model

πŸ“… 2024-10-05
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 3
✨ Influential: 1
πŸ€– AI Summary
To address spurious attribute generation and identity distortion in the restoration of low-quality facial images under real-world conditions, this paper proposes a multimodal controllable reconstruction framework. Methodologically, it introduces a dual-control adapter architecture coupled with a two-stage training strategy that integrates attribute text prompts, high-quality reference images, and explicit identity constraints; it further incorporates negative quality prompting and fine-grained attribute modulation. The authors also construct Reface-HQ, a large-scale reference-face dataset of over 21,000 high-resolution facial images spanning 4,800 identities, to support reference-guided, identity-aware reconstruction. Extensive experiments demonstrate that the method improves detail recovery and identity fidelity under severe degradation, achieving superior visual quality over current state-of-the-art methods while enabling controllable, precise, and perceptually realistic facial reconstruction.

πŸ“ Abstract
We introduce a novel Multi-modal Guided Real-World Face Restoration (MGFR) technique designed to improve the quality of facial image restoration from low-quality inputs. Leveraging a blend of attribute text prompts, high-quality reference images, and identity information, MGFR can mitigate the generation of false facial attributes and identities often associated with generative face restoration methods. By incorporating a dual-control adapter and a two-stage training strategy, our method effectively utilizes multi-modal prior information for targeted restoration tasks. We also present the Reface-HQ dataset, comprising over 21,000 high-resolution facial images across 4800 identities, to address the need for reference face training images. Our approach achieves superior visual quality in restoring facial details under severe degradation and allows for controlled restoration processes, enhancing the accuracy of identity preservation and attribute correction. Including negative quality samples and attribute prompts in the training further refines the model's ability to generate detailed and perceptually accurate images.
Problem

Research questions and friction points this paper is trying to address.

Improve facial image restoration from low-quality inputs
Mitigate the generation of false facial attributes and identities
Enhance identity preservation and attribute correction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal guided diffusion model
Dual-control adapter strategy
Two-stage training approach
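The negative quality prompting mentioned above can be illustrated with an extended classifier-free guidance rule: alongside the usual unconditional and positive-prompt noise predictions, a third prediction conditioned on a negative quality prompt (e.g. "blurry, low quality") is pushed away from. The sketch below is a minimal illustration of this general technique, not the paper's exact formulation; the function name, weights, and per-element list representation are assumptions for clarity.

```python
def guided_noise(eps_uncond, eps_pos, eps_neg, w_pos=4.0, w_neg=1.5):
    """Combine three noise predictions from a diffusion model:

    eps_uncond -- prediction with no conditioning
    eps_pos    -- prediction conditioned on the positive attribute prompt
    eps_neg    -- prediction conditioned on the negative quality prompt

    The result is steered toward the positive prompt (weight w_pos)
    and away from the negative quality prompt (weight w_neg).
    Inputs are flat lists of floats standing in for noise tensors.
    """
    return [
        u + w_pos * (p - u) - w_neg * (n - u)
        for u, p, n in zip(eps_uncond, eps_pos, eps_neg)
    ]


# Toy usage: with u=0, a positive pull of +1 and a negative push from -1,
# the guided prediction moves further toward the positive direction.
print(guided_noise([0.0], [1.0], [-1.0], w_pos=2.0, w_neg=1.0))  # [3.0]
```

In practice the same three forward passes would run through the denoising U-Net at every sampling step, so the extra negative branch costs one additional model evaluation per step.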