🤖 AI Summary
Existing image restoration methods either focus exclusively on facial regions while neglecting degradation in the full body and background, or fail to leverage degradation cues effectively, often yielding blurry results or artifacts. To address this, we propose Face2Scene, a two-stage framework that first employs a reference-based face restoration model to reconstruct a high-quality face and extract a degradation encoding, which is then used to generate multi-scale degradation-aware tokens. These tokens condition a diffusion model to perform single-step, high-fidelity restoration of the entire scene, including human figures and background. This work is the first to treat the restored face as a perceptual “oracle” for estimating global degradation characteristics and to enable precise control of the diffusion process via degradation-aware tokens. Extensive experiments demonstrate that our approach significantly outperforms existing methods, achieving superior restoration quality across multiple metrics while effectively suppressing artifacts.
📝 Abstract
Recent advances in image restoration have enabled high-fidelity recovery of faces from degraded inputs using reference-based face restoration (Ref-FR) models. However, such methods focus solely on facial regions, neglecting degradation across the full scene, including body and background, which limits practical usability. Meanwhile, full-scene restorers often ignore degradation cues entirely, leading to underdetermined predictions and visual artifacts. In this work, we propose Face2Scene, a two-stage restoration framework that leverages the face as a perceptual oracle to estimate degradation and guide the restoration of the entire image. Given a degraded image and one or more identity references, we first apply a Ref-FR model to reconstruct high-quality facial details. From the restored-degraded face pair, we extract a face-derived degradation code that captures degradation attributes (e.g., noise, blur, compression), which is then transformed into multi-scale degradation-aware tokens. These tokens condition a diffusion model to restore the full scene in a single step, including the body and background. Extensive experiments demonstrate that the proposed method consistently outperforms state-of-the-art approaches.
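The two-stage pipeline described in the abstract can be sketched schematically. The sketch below is purely illustrative: the function names (`ref_fr_restore`, `extract_degradation_code`, `to_multiscale_tokens`, `restore_scene`), the code dimension, and the toy residual statistics are assumptions standing in for the paper's learned components, not its actual implementation.

```python
import numpy as np

def ref_fr_restore(degraded_face, references):
    # Stage 1 (hypothetical stand-in): a reference-based face restoration
    # model would reconstruct high-quality facial detail from the degraded
    # face and the identity references. Placeholder: identity pass-through.
    return degraded_face

def extract_degradation_code(restored_face, degraded_face, dim=128):
    # Derive a degradation code from the restored-degraded face pair.
    # Toy version: summary statistics of their residual, repeated to `dim`;
    # the paper's encoder is a learned network capturing noise/blur/compression.
    residual = restored_face - degraded_face
    stats = np.array([residual.mean(), residual.std()])
    return np.resize(stats, dim)

def to_multiscale_tokens(code, scales=(8, 16, 32)):
    # Broadcast the code into degradation-aware token grids at several scales,
    # one (s*s, dim) token set per scale.
    return {s: np.tile(code, (s * s, 1)) for s in scales}

def restore_scene(degraded_image, tokens):
    # Stage 2 (hypothetical stand-in): a single-step diffusion model
    # conditioned on the multi-scale tokens would restore the full scene.
    # Placeholder: returns the input unchanged.
    return degraded_image

# Pipeline wiring: face restoration -> degradation code -> tokens -> scene.
def face2scene_pipeline(degraded_image, degraded_face, references):
    restored_face = ref_fr_restore(degraded_face, references)
    code = extract_degradation_code(restored_face, degraded_face)
    tokens = to_multiscale_tokens(code)
    return restore_scene(degraded_image, tokens)
```

The point of the sketch is the data flow, not the models: the restored face is only ever used (together with its degraded counterpart) to estimate a global degradation code, which is what actually conditions the full-scene restorer.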