RMAvatar: Photorealistic Human Avatar Reconstruction from Monocular Video Based on Rectified Mesh-embedded Gaussians

📅 2025-01-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of reconstructing high-fidelity, photorealistic, and naturally articulated dressed human avatars from monocular video. Conventional linear blend skinning (LBS) struggles to model complex non-rigid deformations, particularly under clothing. To overcome this, we propose a mesh-embedded Gaussian splatting representation: an explicit triangular mesh governs coarse geometry and motion, while an implicit, differentiable Gaussian splatting renderer captures fine-grained appearance details. We further introduce the first patch-wise Gaussian embedding scheme coupled with a pose-dependent neural correction module, unifying geometric consistency and dynamic detail synthesis. Our method integrates skinned mesh deformation, LBS-based initialization, and neural non-rigid refinement. Evaluated on standard benchmarks, it achieves state-of-the-art performance in rendering quality (PSNR/SSIM/LPIPS) and reconstruction accuracy, significantly enhancing avatar photorealism, motion expressiveness, and generalization across diverse poses and garments.
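The summary contrasts conventional linear blend skinning (LBS) with the paper's mesh-embedded representation. As a reminder of what LBS computes, here is a minimal NumPy sketch of the standard formulation, where each vertex is deformed by a weighted blend of per-bone rigid transforms (function name and array shapes are illustrative, not taken from the paper):

```python
import numpy as np

def linear_blend_skinning(vertices, weights, bone_transforms):
    """Deform rest-pose vertices by blending per-bone rigid transforms.

    vertices:        (V, 3) rest-pose positions
    weights:         (V, B) skinning weights; each row sums to 1
    bone_transforms: (B, 4, 4) homogeneous transforms per bone
    """
    V = vertices.shape[0]
    # Homogeneous coordinates: (V, 4)
    homo = np.concatenate([vertices, np.ones((V, 1))], axis=1)
    # Blend the 4x4 transforms per vertex: (V, 4, 4)
    blended = np.einsum('vb,bij->vij', weights, bone_transforms)
    # Apply each blended transform to its vertex: (V, 4)
    deformed = np.einsum('vij,vj->vi', blended, homo)
    return deformed[:, :3]
```

Because the blended transform is a linear combination of rigid motions, LBS captures only skeleton-driven, quasi-rigid deformation; the cloth wrinkles and pose-dependent details the paper targets fall outside this model, which is what the neural correction module is meant to recover.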

📝 Abstract
We introduce RMAvatar, a novel human avatar representation with Gaussian splatting embedded on a mesh, which learns a clothed avatar from monocular video. We use the explicit mesh geometry to represent the motion and shape of the virtual human, and implicit appearance rendering with Gaussian splatting. Our method consists of two main modules: a Gaussian initialization module and a Gaussian rectification module. We embed Gaussians into triangular faces and control their motion through the mesh, which ensures low-frequency motion and surface deformation of the avatar. Because of the limitations of the LBS formulation, the human skeleton alone struggles to model complex non-rigid transformations. We therefore design a pose-related Gaussian rectification module that learns fine-detailed non-rigid deformations, further improving the realism and expressiveness of the avatar. In extensive experiments on public datasets, RMAvatar achieves state-of-the-art performance in both rendering quality and quantitative evaluations. Please see our project page at https://rm-avatar.github.io.
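The abstract describes embedding Gaussians into triangular faces so that their motion is controlled by the mesh. A minimal sketch of one plausible version of that idea, anchoring each Gaussian center to its face with fixed barycentric coordinates so it rides along with any mesh deformation (names, shapes, and the per-face layout are assumptions, not the paper's implementation):

```python
import numpy as np

def gaussian_centers_from_mesh(verts, faces, barycentric):
    """Place one Gaussian center per face via fixed barycentric coordinates.

    verts:       (V, 3) current (possibly deformed) mesh vertex positions
    faces:       (F, 3) vertex indices of each triangle
    barycentric: (F, 3) barycentric coords of each embedded Gaussian;
                 each row sums to 1
    """
    tri = verts[faces]  # (F, 3, 3): the three corners of each triangle
    # Weighted sum of the corners gives the embedded center: (F, 3)
    centers = np.einsum('fk,fkd->fd', barycentric, tri)
    return centers
```

Because the barycentric coordinates are fixed, re-evaluating this function after the mesh is skinned moves every Gaussian rigidly with its host face, giving the low-frequency, surface-consistent motion the abstract refers to; the rectification module would then add pose-dependent offsets on top.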
Problem

Research questions and friction points this paper is trying to address.

Virtual Human Reconstruction
Single Camera Video
High-Quality Rendering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Splatting
High-fidelity Avatar Reconstruction
Natural Motion Control
Sen Peng
College of Computer Engineering, Jimei University, Xiamen, China
Weixing Xie
National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
Zilong Wang
Department of Computer Science, The University of Texas at Dallas, Richardson, United States
Xiaohu Guo
University of Texas at Dallas
Computer Graphics, Computer Vision, Geometric Computing
Zhonggui Chen
School of Informatics, Xiamen University, Xiamen, China
Baorong Yang
College of Computer Engineering, Jimei University, Xiamen, China
Xiao Dong
Unknown affiliation