RMAvatar: Photorealistic Human Avatar Reconstruction from Monocular Video Based on Rectified Mesh-embedded Gaussians

📅 2025-01-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of reconstructing high-fidelity, photorealistic, and naturally articulated dressed human avatars from monocular video. Conventional linear blend skinning (LBS) struggles to model complex non-rigid deformations, particularly under clothing. To overcome this, we propose a mesh-embedded Gaussian splatting representation: an explicit triangular mesh governs coarse geometry and motion, while an implicit, differentiable Gaussian splatting renderer captures fine-grained appearance details. We further introduce the first patch-wise Gaussian embedding scheme coupled with a pose-dependent neural correction module, unifying geometric consistency and dynamic detail synthesis. Our method integrates skinned mesh deformation, LBS-based initialization, and neural non-rigid refinement. Evaluated on standard benchmarks, it achieves state-of-the-art performance in rendering quality (PSNR/SSIM/LPIPS) and reconstruction accuracy, significantly enhancing avatar photorealism, motion expressiveness, and generalization across diverse poses and garments.
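The summary contrasts conventional linear blend skinning (LBS) with the paper's mesh-embedded representation. As a reminder of what LBS computes, here is a minimal NumPy sketch of the standard formulation, where each vertex is deformed by a weighted blend of per-bone rigid transforms (function name and array shapes are illustrative, not taken from the paper):

```python
import numpy as np

def linear_blend_skinning(vertices, weights, bone_transforms):
    """Deform rest-pose vertices by blending per-bone rigid transforms.

    vertices:        (V, 3) rest-pose positions
    weights:         (V, B) skinning weights; each row sums to 1
    bone_transforms: (B, 4, 4) homogeneous transforms per bone
    """
    V = vertices.shape[0]
    # Homogeneous coordinates: (V, 4)
    homo = np.concatenate([vertices, np.ones((V, 1))], axis=1)
    # Blend the 4x4 transforms per vertex: (V, 4, 4)
    blended = np.einsum('vb,bij->vij', weights, bone_transforms)
    # Apply each blended transform to its vertex: (V, 4)
    deformed = np.einsum('vij,vj->vi', blended, homo)
    return deformed[:, :3]
```

Because the blended transform is a linear combination of rigid motions, LBS captures only skeleton-driven, quasi-rigid deformation; the cloth wrinkles and pose-dependent details the paper targets fall outside this model, which is what the neural correction module is meant to recover.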

📝 Abstract
We introduce RMAvatar, a novel human avatar representation with Gaussian splatting embedded on a mesh, which learns a clothed avatar from monocular video. We use the explicit mesh geometry to represent the motion and shape of the virtual human, and implicit appearance rendering with Gaussian splatting. Our method consists of two main modules: a Gaussian initialization module and a Gaussian rectification module. We embed Gaussians into triangular faces and control their motion through the mesh, which ensures low-frequency motion and surface deformation of the avatar. Because of the limitations of the LBS formulation, the human skeleton alone struggles to model complex non-rigid transformations. We therefore design a pose-related Gaussian rectification module that learns fine-detailed non-rigid deformations, further improving the realism and expressiveness of the avatar. In extensive experiments on public datasets, RMAvatar achieves state-of-the-art performance in both rendering quality and quantitative evaluations. Please see our project page at https://rm-avatar.github.io.
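The abstract describes embedding Gaussians into triangular faces so that their motion is controlled by the mesh. A minimal sketch of one plausible version of that idea, anchoring each Gaussian center to its face with fixed barycentric coordinates so it rides along with any mesh deformation (names, shapes, and the per-face layout are assumptions, not the paper's implementation):

```python
import numpy as np

def gaussian_centers_from_mesh(verts, faces, barycentric):
    """Place one Gaussian center per face via fixed barycentric coordinates.

    verts:       (V, 3) current (possibly deformed) mesh vertex positions
    faces:       (F, 3) vertex indices of each triangle
    barycentric: (F, 3) barycentric coords of each embedded Gaussian;
                 each row sums to 1
    """
    tri = verts[faces]  # (F, 3, 3): the three corners of each triangle
    # Weighted sum of the corners gives the embedded center: (F, 3)
    centers = np.einsum('fk,fkd->fd', barycentric, tri)
    return centers
```

Because the barycentric coordinates are fixed, re-evaluating this function after the mesh is skinned moves every Gaussian rigidly with its host face, giving the low-frequency, surface-consistent motion the abstract refers to; the rectification module would then add pose-dependent offsets on top.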
Problem

Research questions and friction points this paper is trying to address.

Virtual Human Reconstruction
Single Camera Video
High-Quality Rendering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Splatting
High-fidelity Avatar Reconstruction
Natural Motion Control
Sen Peng
College of Computer Engineering, Jimei University, Xiamen, China
Weixing Xie
National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
Zilong Wang
Department of Computer Science, The University of Texas at Dallas, Richardson, United States
Xiaohu Guo
University of Texas at Dallas
Computer Graphics, Computer Vision, Geometric Computing
Zhonggui Chen
School of Informatics, Xiamen University, Xiamen, China
Baorong Yang
College of Computer Engineering, Jimei University, Xiamen, China
Xiao Dong
Unknown affiliation