Geometry-Aware Video Inpainting for Joint Headset Occlusion Removal and Face Reconstruction in Social XR

📅 2025-08-17
🤖 AI Summary
Head-mounted displays (HMDs) occlude the upper face, which degrades video-based facial expression and gaze estimation in social XR and impairs visual communication. To address this, we propose a geometry-aware framework that jointly removes HMD occlusions and reconstructs complete 3D facial geometry from single-view RGB video. The method combines video inpainting with 3D face modeling: dense facial landmarks serve as geometric priors that guide a GAN-based inpainting network; a single occlusion-free reference frame helps preserve identity; and a SynergyNet-based module regresses 3D Morphable Model (3DMM) parameters from the inpainted frames. Dense landmark optimization is applied throughout the pipeline to improve both inpainting quality and geometric fidelity. Experiments show that the framework removes HMDs while maintaining identity and realism, and ablations show that it remains robust across landmark densities, with only minor degradation under sparse configurations.

📝 Abstract
Head-mounted displays (HMDs) are essential for experiencing extended reality (XR) environments and observing virtual content. However, they obscure the upper part of the user's face, complicating external video recording and significantly impacting social XR applications such as teleconferencing, where facial expressions and eye gaze details are crucial for creating an immersive experience. This study introduces a geometry-aware learning-based framework to jointly remove HMD occlusions and reconstruct complete 3D facial geometry from RGB frames captured from a single viewpoint. The method integrates a GAN-based video inpainting network, guided by dense facial landmarks and a single occlusion-free reference frame, to restore missing facial regions while preserving identity. Subsequently, a SynergyNet-based module regresses 3D Morphable Model (3DMM) parameters from the inpainted frames, enabling accurate 3D face reconstruction. Dense landmark optimization is incorporated throughout the pipeline to improve both the inpainting quality and the fidelity of the recovered geometry. Experimental results demonstrate that the proposed framework can successfully remove HMDs from RGB facial videos while maintaining facial identity and realism, producing photorealistic 3D face geometry outputs. Ablation studies further show that the framework remains robust across different landmark densities, with only minor quality degradation under sparse landmark configurations.
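
To make the 3DMM regression step concrete: a 3D Morphable Model typically represents a face mesh as a mean shape plus linear combinations of shape and expression basis vectors, and a regressor such as SynergyNet predicts the coefficients. The sketch below shows only that linear reconstruction. The function name `reconstruct_vertices`, the toy two-vertex model, and the coefficient values are illustrative assumptions, not details taken from the paper.

```python
def reconstruct_vertices(mean_shape, shape_basis, exp_basis, alpha_shape, alpha_exp):
    """Linear 3DMM: S = mean + sum_i alpha_shape[i] * shape_basis[i]
                             + sum_j alpha_exp[j] * exp_basis[j].

    mean_shape and every basis vector are flat lists [x0, y0, z0, x1, ...].
    Returns a list of (x, y, z) vertex tuples.
    """
    flat = list(mean_shape)
    for coeffs, basis in ((alpha_shape, shape_basis), (alpha_exp, exp_basis)):
        for c, vec in zip(coeffs, basis):
            for k, v in enumerate(vec):
                flat[k] += c * v
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Hypothetical toy model: two vertices, one shape and one expression basis vector.
mean_shape = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
shape_basis = [[0.0, 1.0, 0.0, 0.0, 1.0, 0.0]]  # moves both vertices along y
exp_basis = [[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]]    # moves the second vertex along z

# Zero coefficients give back the mean face.
neutral = reconstruct_vertices(mean_shape, shape_basis, exp_basis, [0.0], [0.0])
# Non-zero coefficients (as a regressor would predict) deform the mesh.
deformed = reconstruct_vertices(mean_shape, shape_basis, exp_basis, [2.0], [0.5])
```

In a real model the bases hold tens of thousands of vertices and dozens of coefficients, and pose parameters are applied on top of this linear shape; plain lists are used here only to keep the sketch dependency-free.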
Problem

Research questions and friction points this paper is trying to address.

Remove HMD occlusion in facial videos for social XR
Reconstruct complete 3D facial geometry from RGB frames
Preserve facial identity and realism during inpainting
Innovation

Methods, ideas, or system contributions that make the work stand out.

GAN-based video inpainting for occlusion removal
3D face reconstruction using SynergyNet and 3DMM
Dense landmark optimization enhances inpainting and geometry
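
As a rough illustration of how a landmark-based objective and a landmark-density ablation could be set up, the sketch below computes a mean squared distance between predicted and target 2D landmarks and evaluates it on a dense set and on a strided subset. The loss form, the function names `landmark_loss` and `subsample`, and the toy coordinates are assumptions for illustration; the paper's actual loss terms and landmark sets are not specified in this summary.

```python
def landmark_loss(pred, target):
    """Mean squared Euclidean distance between matching 2D landmarks."""
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equally long")
    total = sum((px - tx) ** 2 + (py - ty) ** 2
                for (px, py), (tx, ty) in zip(pred, target))
    return total / len(pred)


def subsample(landmarks, stride):
    """Keep every `stride`-th landmark to emulate a sparser configuration."""
    return landmarks[::stride]


# Hypothetical toy landmarks (x, y); real dense sets contain hundreds of points.
target = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
pred = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (3.0, 0.0)]

dense_loss = landmark_loss(pred, target)
sparse_loss = landmark_loss(subsample(pred, 2), subsample(target, 2))
```

Applying the same stride to predictions and targets keeps the point correspondences intact, so the dense and sparse losses differ only in how many landmarks contribute.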
Fatemeh Ghorbani Lohesara
Technische Universität Berlin, Communication Systems Group, Department of Telecommunication Systems, Einsteinufer 17, Berlin, Germany, 10587
Karen Eguiazarian
Tampere University, Computational Imaging Group, Department of Computing Sciences, Korkeakoulunkatu 10, Tampere, Finland, 33720
Sebastian Knorr
Professor at HTW Berlin
Computer Vision, 3D Image Processing, Neural Rendering, Free-viewpoint Video, Visual Attention