InteractAvatar: Modeling Hand-Face Interaction in Photorealistic Avatars with Deformable Gaussians

📅 2025-04-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address geometric and photometric distortions in dynamic hand–face interaction modeling for digital humans, this paper proposes the first high-fidelity co-reconstruction and reenactment framework compatible with both monocular and multi-view video input. Methodologically, the authors introduce a dynamic Gaussian hand model coupled with an explicit hand–face interaction module to jointly capture pose-dependent wrinkles, cast shadows, and contact geometry. By integrating template priors, 3D Gaussian splatting, and dynamic refinement, the approach achieves photometrically consistent and geometrically precise co-reconstruction via joint geometry–appearance optimization and multi-view consistency constraints. Experiments demonstrate significant improvements in novel-view synthesis and self/cross-identity reenactment, particularly in preserving fine-grained interaction details. To the authors' knowledge, this is the first method to robustly reconstruct complex hand–face interactions, such as face pinching, forehead touching, and mouth covering, at high visual fidelity, establishing new state-of-the-art performance.

📝 Abstract
With the rising interest from the community in digital avatars, coupled with the importance of expressions and gestures in communication, modeling natural avatar behavior remains an important challenge across many industries such as teleconferencing, gaming, and AR/VR. Human hands are the primary tool for interacting with the environment and essential for realistic human behavior modeling, yet existing 3D hand and head avatar models often overlook the crucial aspect of hand-body interactions, such as between hand and face. We present InteractAvatar, the first model to faithfully capture the photorealistic appearance of dynamic hand and non-rigid hand-face interactions. Our novel Dynamic Gaussian Hand model, combining a template model and 3D Gaussian Splatting as well as a dynamic refinement module, captures pose-dependent changes, e.g. the fine wrinkles and complex shadows that occur during articulation. Importantly, our hand-face interaction module models the subtle geometry and appearance dynamics that underlie common gestures. Through experiments on novel-view synthesis, self-reenactment and cross-identity reenactment, we demonstrate that InteractAvatar can reconstruct hand and hand-face interactions from monocular or multi-view videos with high-fidelity details and be animated with novel poses.
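The abstract's core idea, Gaussians anchored to a posed template hand mesh plus a refinement module that adds pose-dependent corrections for wrinkles and shading, can be sketched roughly as below. This is a minimal illustrative sketch, not the authors' implementation: the class name, the MLP sizes, the 45-dimensional pose vector (a MANO-style assumption), and the way corrections are split into position and color deltas are all assumptions for illustration.

```python
import torch

class DynamicGaussianHand(torch.nn.Module):
    """Illustrative sketch of a template-anchored Gaussian hand:
    canonical per-Gaussian parameters ride on posed mesh anchor points,
    and a small MLP adds pose-dependent refinements (wrinkles, shading)."""

    def __init__(self, n_gaussians=1024, pose_dim=45):
        super().__init__()
        # Canonical per-Gaussian parameters, learned during reconstruction.
        self.offsets = torch.nn.Parameter(torch.zeros(n_gaussians, 3))
        self.log_scales = torch.nn.Parameter(torch.zeros(n_gaussians, 3))
        self.colors = torch.nn.Parameter(torch.rand(n_gaussians, 3))
        # Dynamic refinement: hand pose -> per-Gaussian position/color deltas.
        self.refine = torch.nn.Sequential(
            torch.nn.Linear(pose_dim, 128), torch.nn.ReLU(),
            torch.nn.Linear(128, n_gaussians * 6),
        )

    def forward(self, anchors, pose):
        # anchors: (N, 3) points on the posed template mesh surface
        # (e.g. barycentric samples on an LBS-deformed hand mesh).
        delta = self.refine(pose).view(-1, 6)
        positions = anchors + self.offsets + delta[:, :3]
        # Pose-dependent color correction approximates articulation shading.
        colors = torch.sigmoid(self.colors + delta[:, 3:])
        return positions, torch.exp(self.log_scales), colors
```

In this sketch the returned positions, scales, and colors would feed a standard 3D Gaussian Splatting rasterizer; the paper's hand-face interaction module (contact geometry, cast shadows) is omitted.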
Problem

Research questions and friction points this paper is trying to address.

Modeling photorealistic hand-face interactions in avatars
Capturing dynamic hand gestures with high-fidelity details
Reconstructing interactions from monocular or multiview videos
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Gaussian Hand model with refinement
Hand-face interaction module for gestures
High-fidelity reconstruction from monocular videos