GRMM: Real-Time High-Fidelity Gaussian Morphable Head Model with Learned Residuals

📅 2025-09-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Traditional 3D Morphable Models (3DMMs) suffer from limited resolution and insufficient geometric detail, while neural volumetric methods struggle with real-time rendering. Existing Gaussian-splatting-based facial models still rely solely on mesh-based 3DMM priors, limiting fine-grained expression modeling, high-fidelity geometry reconstruction, and full-head (including hair) dynamic synthesis.
Method: We propose GRMM, the first full-head, high-fidelity Gaussian 3D morphable model. Our approach introduces an identity-expression-disentangled residual learning framework atop a base 3DMM, jointly optimizing geometric and appearance residuals. We curate the EXPRESS-50 dataset (60 aligned expressions across 50 identities) to support high-precision residual learning, and design a dual-decoder architecture, comprising a coarse decoder for vertex-level mesh deformations and a fine decoder for per-Gaussian appearance parameters, augmented by a lightweight CNN for enhanced rendering quality.
Contribution/Results: Our method achieves state-of-the-art performance in monocular reconstruction, novel-view synthesis, and expression transfer, enabling real-time high-fidelity rendering at 75 FPS.
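The additive residual formulation described above can be written schematically as follows. The notation is illustrative, not taken from the paper: β denotes identity parameters, ψ expression parameters, and the Δ terms the learned residuals.

```latex
% Base 3DMM vertices plus a learned, identity/expression-conditioned
% geometric residual from the coarse decoder:
\mathbf{V}(\beta,\psi)
  = \underbrace{\bar{\mathbf{V}}
      + \mathbf{B}_{\mathrm{id}}\,\beta
      + \mathbf{B}_{\mathrm{exp}}\,\psi}_{\text{base 3DMM}}
  \;+\; \underbrace{\Delta_{\mathrm{geo}}(\beta,\psi)}_{\text{coarse decoder}}

% Per-Gaussian appearance parameters (e.g. color, opacity, scale,
% rotation) produced by the fine decoder for each Gaussian k:
\theta_k = f_{\mathrm{fine}}(\beta,\psi)_k, \qquad k = 1,\dots,K
```

The key design choice is that the base 3DMM keeps control low-dimensional and interpretable, while the residual terms only have to model what the linear basis cannot: wrinkles, fine skin texture, and hairline variation.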

📝 Abstract
3D Morphable Models (3DMMs) enable controllable facial geometry and expression editing for reconstruction, animation, and AR/VR, but traditional PCA-based mesh models are limited in resolution, detail, and photorealism. Neural volumetric methods improve realism but remain too slow for interactive use. Recent Gaussian Splatting (3DGS) based facial models achieve fast, high-quality rendering but still depend solely on a mesh-based 3DMM prior for expression control, limiting their ability to capture fine-grained geometry, expressions, and full-head coverage. We introduce GRMM, the first full-head Gaussian 3D morphable model that augments a base 3DMM with residual geometry and appearance components, additive refinements that recover high-frequency details such as wrinkles, fine skin texture, and hairline variations. GRMM provides disentangled control through low-dimensional, interpretable parameters (e.g., identity shape, facial expressions) while separately modelling residuals that capture subject- and expression-specific detail beyond the base model's capacity. Coarse decoders produce vertex-level mesh deformations, fine decoders represent per-Gaussian appearance, and a lightweight CNN refines rasterised images for enhanced realism, all while maintaining 75 FPS real-time rendering. To learn consistent, high-fidelity residuals, we present EXPRESS-50, the first dataset with 60 aligned expressions across 50 identities, enabling robust disentanglement of identity and expression in Gaussian-based 3DMMs. Across monocular 3D face reconstruction, novel-view synthesis, and expression transfer, GRMM surpasses state-of-the-art methods in fidelity and expression accuracy while delivering interactive real-time performance.
Problem

Research questions and friction points this paper is trying to address.

Overcoming PCA-based 3DMM limitations in resolution and photorealism
Enabling real-time high-fidelity facial detail capture and rendering
Achieving disentangled control of identity and expression parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Augments 3DMM with residual geometry and appearance components
Uses coarse decoders for mesh and fine decoders for Gaussian appearance
Employs lightweight CNN for image refinement and real-time rendering
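The dual-decoder pipeline summarized in the bullets above can be sketched in code. This is a minimal, hypothetical sketch: the MLP stubs, latent dimensions, vertex and Gaussian counts, and the 11-parameter appearance layout are all assumptions for illustration, not the paper's actual architecture, and the CNN refinement and rasterisation stages are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (the paper does not specify these in this summary).
N_VERTS = 5023          # vertex count of a FLAME-like base mesh
N_GAUSS = 5000          # number of Gaussians, kept small for the sketch
DIM_ID, DIM_EXP = 100, 50   # identity / expression latent dimensions

def mlp(x, w1, w2):
    """Tiny two-layer ReLU MLP standing in for a learned decoder."""
    return np.maximum(x @ w1, 0.0) @ w2

# Random weights stand in for learned decoder parameters.
w1_coarse = rng.normal(0.0, 0.01, (DIM_ID + DIM_EXP, 64))
w2_coarse = rng.normal(0.0, 0.01, (64, N_VERTS * 3))
w1_fine = rng.normal(0.0, 0.01, (DIM_ID + DIM_EXP, 64))
# 11 appearance values per Gaussian: color(3)+opacity(1)+scale(3)+rotation(4)
w2_fine = rng.normal(0.0, 0.01, (64, N_GAUSS * 11))

def grmm_forward(beta, psi, base_verts):
    """Sketch of the forward pass: base 3DMM mesh + learned residuals."""
    z = np.concatenate([beta, psi])
    # Coarse decoder: vertex-level geometric residual added to the base mesh.
    dv = mlp(z, w1_coarse, w2_coarse).reshape(N_VERTS, 3)
    verts = base_verts + dv
    # Fine decoder: per-Gaussian appearance parameters for splatting.
    gauss_params = mlp(z, w1_fine, w2_fine).reshape(N_GAUSS, 11)
    return verts, gauss_params

base_verts = rng.normal(0.0, 1.0, (N_VERTS, 3))
verts, gauss = grmm_forward(
    rng.normal(size=DIM_ID), rng.normal(size=DIM_EXP), base_verts
)
```

In the actual method, the rasterised Gaussian image would then pass through the lightweight refinement CNN; keeping that network small is what preserves the reported 75 FPS real-time budget (about 13 ms per frame).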