🤖 AI Summary
This work addresses single-image 3D face reconstruction for subjects wearing glasses, where occlusion by eyewear severely degrades geometric and textural fidelity. We propose the first end-to-end framework that jointly performs unsupervised glasses removal and high-fidelity recovery of 3D geometry and texture, without requiring paired data or ground-truth occlusion masks. Methodologically, we introduce a generative keypoint-guided mechanism that decouples glasses-region modeling from facial geometry learning, integrate implicit neural representations (INRs) with differentiable rendering, and incorporate adversarial keypoint prediction, glasses mask distillation, and multi-scale geometric regularization. Evaluated on standard benchmarks including NoW and MICC, our approach achieves state-of-the-art performance, reducing normalized mean error (NME) by 18.7% and significantly improving reconstruction accuracy and visual realism over prior methods. These results indicate strong generalization capability.
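
The summary does not define NME or say which baseline the 18.7% figure is measured against. As a minimal sketch, assuming the common definition (mean Euclidean distance between corresponding predicted and ground-truth points, divided by a normalizing length such as the inter-ocular distance), the metric and a relative reduction could be computed as follows. The function names `nme` and `relative_reduction` are hypothetical and not taken from the paper.

```python
import math


def nme(pred, gt, norm_dist):
    """Normalized mean error under an ASSUMED definition:
    mean Euclidean distance between corresponding points,
    divided by a normalizing length (e.g. inter-ocular distance)."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    if norm_dist <= 0:
        raise ValueError("norm_dist must be positive")
    total = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return total / (len(pred) * norm_dist)


def relative_reduction(baseline_err, new_err):
    """Fractional reduction of an error metric relative to a baseline."""
    if baseline_err <= 0:
        raise ValueError("baseline_err must be positive")
    return (baseline_err - new_err) / baseline_err


# Toy values for illustration only; they are not results from the paper.
toy_pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_gt = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0)]
toy_nme = nme(toy_pred, toy_gt, norm_dist=2.0)
```

Under this reading, a reported 18.7% reduction would correspond to `relative_reduction(baseline_err, new_err)` returning about 0.187 for the paper's baseline and proposed method.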