RealityAvatar: Towards Realistic Loose Clothing Modeling in Animatable 3D Gaussian Avatars

📅 2025-04-02
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing methods for single- and multi-view video-driven animatable 3D human modeling struggle to accurately reconstruct the non-rigid dynamics of loose clothing, often yielding geometric distortions, temporal inconsistencies, and oversmoothing. This work proposes an efficient 3D Gaussian Splatting (3DGS)-based framework with two key innovations, a motion trend module and a latent-bone encoder, which explicitly disentangle pose-dependent deformation from temporal evolution and thereby overcome the modeling bottleneck imposed by global pose conditioning. Leveraging multi-view video supervision, the approach achieves geometrically consistent, temporally stable, high-fidelity reconstruction of clothing deformation. On standard benchmarks, the method significantly improves structural fidelity and perceptual quality in non-rigid regions: clothing details appear more realistic, and inter-frame consistency is substantially enhanced.

πŸ“ Abstract
Modeling animatable human avatars from monocular or multi-view videos has been widely studied, with recent approaches leveraging neural radiance fields (NeRFs) or 3D Gaussian Splatting (3DGS) achieving impressive results in novel-view and novel-pose synthesis. However, existing methods often struggle to accurately capture the dynamics of loose clothing, as they primarily rely on global pose conditioning or static per-frame representations, leading to oversmoothing and temporal inconsistencies in non-rigid regions. To address this, we propose RealityAvatar, an efficient framework for high-fidelity digital human modeling, specifically targeting loosely dressed avatars. Our method leverages 3D Gaussian Splatting to capture complex clothing deformations and motion dynamics while ensuring geometric consistency. By incorporating a motion trend module and a latent-bone encoder, we explicitly model pose-dependent deformations and temporal variations in clothing behavior. Extensive experiments on benchmark datasets demonstrate the effectiveness of our approach in capturing fine-grained clothing deformations and motion-driven shape variations. Our method significantly enhances structural fidelity and perceptual quality in dynamic human reconstruction, particularly in non-rigid regions, while achieving better consistency across temporal frames.
Problem

Research questions and friction points this paper is trying to address.

Modeling loose clothing dynamics in 3D avatars
Overcoming oversmoothing in non-rigid regions
Ensuring temporal consistency in clothing deformations
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D Gaussian Splatting for clothing dynamics
Motion trend module for deformations
Latent-bone encoder for temporal consistency
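The page does not describe the motion trend module's internals. As a purely illustrative sketch (not the authors' code), one simple way to capture a "motion trend" is an exponential moving average (EMA) of frame-to-frame pose velocities, giving the deformation model a temporal cue beyond the current global pose; the function name and EMA formulation here are assumptions for illustration only:

```python
# Hypothetical sketch: summarize recent pose motion as an EMA of per-frame
# velocities. Such a temporal signal could condition per-Gaussian deformation
# instead of relying on the current pose alone.

def motion_trend(pose_seq, alpha=0.5):
    """EMA of frame-to-frame pose deltas.

    pose_seq: list of per-frame values for one pose dimension.
    alpha: smoothing factor; higher alpha weights older motion more.
    Returns a scalar trend summarizing the recent motion direction.
    """
    trend = 0.0
    for prev, curr in zip(pose_seq, pose_seq[1:]):
        delta = curr - prev              # per-frame velocity
        trend = alpha * trend + (1 - alpha) * delta
    return trend

# A joint value rising steadily across frames yields a positive trend.
trend = motion_trend([0.0, 0.1, 0.2, 0.3])
```

In a full pipeline this scalar would be a vector over all pose dimensions, concatenated with the current pose before predicting Gaussian offsets; the EMA is just one plausible instantiation of a temporal motion signal.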
Yahui Li
Beijing University of Posts and Telecommunications
Zhi Zeng
Xi'an Jiaotong University
Liming Pang
Beijing University of Posts and Telecommunications
Guixuan Zhang
Beijing University of Posts and Telecommunications
Shuwu Zhang
Beijing University of Posts and Telecommunications