AI Summary
To address the challenge of real-time, high-fidelity, animatable rendering of clothed digital humans from monocular video, this paper introduces the first 2D Gaussian Splatting (2DGS)-based rendering paradigm. By relocating Gaussian modeling from 3D volumetric space to the 2D image plane, our approach fundamentally avoids errors inherent in 3D geometry reconstruction while preserving dynamic details, photorealism, and computational efficiency. The method integrates SMPL-X skinning, weak supervision from monocular video, and differentiable rasterization. Evaluated on AvatarRex and THuman4.0, it achieves state-of-the-art qualitative and quantitative results: 2.3× faster training and real-time rendering at over 60 FPS, significantly outperforming 3DGS baselines. Our core contribution lies in the co-design of pose-driven rendering and a novel 2D Gaussian representation, enabling robust, efficient, and geometry-free animation of clothed avatars.
Abstract
Real-time rendering of high-fidelity, animatable avatars from monocular videos remains a challenging problem in computer vision and graphics. Over the past few years, Neural Radiance Fields (NeRF) have significantly improved rendering quality but perform poorly at run time because volumetric rendering is inefficient. More recently, methods based on 3D Gaussian Splatting (3DGS) have shown great potential for fast training and real-time rendering, but they still suffer from artifacts caused by inaccurate geometry. To address these problems, we propose 2DGS-Avatar, a novel approach based on 2D Gaussian Splatting (2DGS) for modeling animatable clothed avatars with high fidelity and fast training. Given monocular RGB videos as input, our method generates an avatar that can be driven by poses and rendered in real time. Compared to 3DGS-based methods, 2DGS-Avatar retains the advantages of fast training and rendering while also capturing detailed, dynamic, and photorealistic appearances. We conduct extensive experiments on popular datasets such as AvatarRex and THuman4.0, demonstrating strong performance in both qualitative and quantitative evaluations.