🤖 AI Summary
Existing facial super-resolution (FSR) methods suffer from fixed-scale constraints and input-size sensitivity, hindering arbitrary-scale reconstruction. To address this, we propose an implicit neural representation-based framework for scale-arbitrary FSR. Our approach mitigates spectral bias via local frequency estimation, incorporates facial structural priors through global coordinate modulation, and jointly models 2D deep features, local relative coordinates, and continuous scale ratios to predict pixel-wise RGB values. Crucially, it is the first end-to-end framework to fully decouple the input resolution from the scaling factor during reconstruction. Extensive experiments under multi-scale and multi-resolution settings demonstrate significant improvements over state-of-the-art methods, with PSNR and SSIM gains of +1.23 dB and +0.021, respectively. Qualitative and quantitative evaluations consistently validate superior fidelity, geometric consistency, and generalization across diverse scales.
📝 Abstract
Face super-resolution (FSR) is a critical technique for enhancing low-resolution facial images and has significant implications for face-related tasks. However, existing FSR methods are limited by fixed up-sampling scales and sensitivity to input size variations. To address these limitations, this paper introduces an Arbitrary-Resolution and Arbitrary-Scale FSR method with implicit representation networks (ARASFSR), featuring three novel designs. First, ARASFSR employs 2D deep features, local relative coordinates, and up-sampling scale ratios to predict the RGB value of each target pixel, enabling super-resolution at any up-sampling scale. Second, a local frequency estimation module captures high-frequency facial texture information to reduce the spectral bias effect. Lastly, a global coordinate modulation module guides FSR to leverage prior facial structure knowledge and achieve resolution adaptation effectively. Quantitative and qualitative evaluations demonstrate that ARASFSR is more robust than existing state-of-the-art methods when super-resolving facial images across various input sizes and up-sampling scales.
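The core mechanism the abstract describes, predicting each target pixel's RGB from a deep feature, a local relative coordinate, and the up-sampling scale ratio, can be illustrated with a minimal sketch. This is not the authors' code: the encoder is replaced by a random feature map, the decoder is an untrained toy MLP, and nearest-neighbor feature lookup stands in for whatever local ensemble the paper uses; all names (`query_rgb`, `make_mlp`, etc.) are hypothetical. It only shows why such a decoder is scale-arbitrary: the output grid size appears nowhere in the network, only in the query coordinates and cell sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(in_dim, hidden, out_dim, rng):
    """Random-weight 2-layer MLP; stands in for a trained implicit decoder."""
    w1 = rng.normal(0, 0.1, (in_dim, hidden))
    w2 = rng.normal(0, 0.1, (hidden, out_dim))
    return w1, w2

def mlp_forward(x, params):
    w1, w2 = params
    return np.maximum(x @ w1, 0.0) @ w2  # one ReLU hidden layer

def query_rgb(feat, coords, cell, params):
    """
    feat  : (H, W, C) low-resolution feature map from some encoder
    coords: (N, 2) continuous target coordinates in [0, 1)
    cell  : (N, 2) target pixel size (1/out_H, 1/out_W) -- the scale-ratio cue
    """
    H, W, _ = feat.shape
    # Nearest latent code for each query (a trained model would interpolate)
    iy = np.clip((coords[:, 0] * H).astype(int), 0, H - 1)
    ix = np.clip((coords[:, 1] * W).astype(int), 0, W - 1)
    z = feat[iy, ix]                                   # (N, C)
    # Local relative coordinate: offset from the code's center, grid units
    centers = np.stack([(iy + 0.5) / H, (ix + 0.5) / W], axis=1)
    rel = (coords - centers) * np.array([H, W])        # (N, 2)
    inp = np.concatenate([z, rel, cell], axis=1)       # (N, C + 4)
    return mlp_forward(inp, params)                    # (N, 3) RGB

# Query a 16x16 feature map at an arbitrary (non-integer) 3.5x scale
feat = rng.normal(size=(16, 16, 8))
out_h = out_w = 56                                     # 16 * 3.5
ys, xs = np.meshgrid(np.arange(out_h), np.arange(out_w), indexing="ij")
coords = np.stack([(ys.ravel() + 0.5) / out_h,
                   (xs.ravel() + 0.5) / out_w], axis=1)
cell = np.full((coords.shape[0], 2), [1.0 / out_h, 1.0 / out_w])
params = make_mlp(8 + 2 + 2, 32, 3, rng)
rgb = query_rgb(feat, coords, cell, params).reshape(out_h, out_w, 3)
print(rgb.shape)  # (56, 56, 3)
```

Because the decoder consumes per-query coordinates and cell sizes rather than a fixed output grid, the same weights serve any input resolution and any scale, which is the decoupling the paper claims.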