STG-Avatar: Animatable Human Avatars via Spacetime Gaussian

πŸ“… 2025-10-24
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
Existing 3D Gaussian Splatting (3DGS) methods struggle to reconstruct high-fidelity, animatable human avatars from monocular video due to difficulties modeling non-rigid clothing deformation and rapid limb motion. To address this, we propose a rigid–non-rigid coupled deformation framework. Our method integrates spatiotemporal Gaussians (STGs) for dynamic geometry representation, linear blend skinning (LBS) for skeletal articulation, and an optical-flow-guided adaptive densification strategy to refine dynamic regions. Unlike pure 3DGS approaches, ours preserves real-time rendering capability while significantly improving fidelity of dynamic details and skeletal control accuracy. Experiments demonstrate state-of-the-art quantitative performance across multiple public benchmarks (PSNR/SSIM/LPIPS), alongside end-to-end differentiability and real-time interactive animation support.
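The rigid–non-rigid coupling described above can be illustrated with a minimal NumPy sketch: LBS blends per-joint rigid transforms of the Gaussian centers, and a time-dependent spacetime offset is added on top. The function names, the per-Gaussian polynomial parameterization of the STG offset, and the toy shapes are assumptions for illustration, not the paper's actual implementation:

```python
import numpy as np

def lbs_transform(points, weights, rotations, translations):
    """Linear blend skinning: blend per-joint rigid transforms by skinning weights.
    points: (N, 3), weights: (N, J), rotations: (J, 3, 3), translations: (J, 3)."""
    # Transform every point by every joint: per_joint[j, n] = R_j @ p_n + t_j
    per_joint = np.einsum('jab,nb->jna', rotations, points) + translations[:, None, :]
    # Blend the per-joint results with the skinning weights -> (N, 3)
    return np.einsum('nj,jna->na', weights, per_joint)

def stg_offset(coeffs, t):
    """Hypothetical spacetime offset: per-Gaussian polynomial in time t.
    coeffs: (N, D, 3) polynomial coefficients; returns (N, 3)."""
    powers = t ** np.arange(coeffs.shape[1])   # [1, t, t^2, ...]
    return np.einsum('nda,d->na', coeffs, powers)

# Toy example: 2 Gaussians, 2 joints, identity rotations
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
w = np.array([[1.0, 0.0], [0.0, 1.0]])           # each point bound to one joint
R = np.stack([np.eye(3), np.eye(3)])
T = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # joint 1 translated in y
coeffs = np.zeros((2, 3, 3))                      # zero non-rigid offset for clarity
posed = lbs_transform(pts, w, R, T) + stg_offset(coeffs, t=0.5)
```

In the real pipeline both the skinning weights and the spacetime coefficients would be optimized end-to-end; here the STG term is zeroed so only the skeletal pose transform is visible.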

πŸ“ Abstract
Realistic animatable human avatars from monocular videos are crucial for advancing human-robot interaction and enhancing immersive virtual experiences. While recent research on 3DGS-based human avatars has made progress, it still struggles with accurately representing detailed features of non-rigid objects (e.g., clothing deformations) and dynamic regions (e.g., rapidly moving limbs). To address these challenges, we present STG-Avatar, a 3DGS-based framework for high-fidelity animatable human avatar reconstruction. Specifically, our framework introduces a rigid-nonrigid coupled deformation framework that synergistically integrates Spacetime Gaussians (STG) with linear blend skinning (LBS). In this hybrid design, LBS enables real-time skeletal control by driving global pose transformations, while STG complements it through spacetime adaptive optimization of 3D Gaussians. Furthermore, we employ optical flow to identify high-dynamic regions and guide the adaptive densification of 3D Gaussians in these regions. Experimental results demonstrate that our method consistently outperforms state-of-the-art baselines in both reconstruction quality and operational efficiency, achieving superior quantitative metrics while retaining real-time rendering capabilities. Our code is available at https://github.com/jiangguangan/STG-Avatar
Problem

Research questions and friction points this paper is trying to address.

Creating realistic animatable human avatars from monocular videos
Accurately representing detailed clothing deformations and dynamic regions
Achieving high-fidelity reconstruction with real-time rendering capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid rigid-nonrigid deformation with Spacetime Gaussians
Optical flow guides adaptive Gaussian densification
Real-time skeletal control via linear blend skinning
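The flow-guided densification idea can be sketched as follows: compute per-pixel flow magnitude, threshold it to obtain a high-motion mask, and flag Gaussians whose 2D projections land in that mask as candidates for cloning/splitting. The threshold value, the selection criterion, and the function name are assumptions for illustration only:

```python
import numpy as np

def flow_guided_densify_mask(flow, proj_xy, thresh=2.0):
    """Select Gaussians projecting into high-motion pixels (hypothetical criterion).
    flow: (H, W, 2) optical flow field; proj_xy: (N, 2) integer pixel coords
    (x, y) of projected Gaussian centers; returns (N,) boolean mask."""
    mag = np.linalg.norm(flow, axis=-1)          # (H, W) per-pixel flow magnitude
    dynamic = mag > thresh                       # boolean high-motion mask
    x = np.clip(proj_xy[:, 0], 0, flow.shape[1] - 1)
    y = np.clip(proj_xy[:, 1], 0, flow.shape[0] - 1)
    return dynamic[y, x]                         # True where densification is triggered

# Toy example: large flow only in the upper-left quadrant
flow = np.zeros((4, 4, 2))
flow[:2, :2] = 3.0
proj = np.array([[0, 0], [3, 3]])                # one Gaussian in each region
mask = flow_guided_densify_mask(flow, proj)      # only the first is flagged
```

Gaussians flagged by such a mask would then go through the usual 3DGS clone/split step, concentrating capacity on rapidly moving limbs and deforming clothing.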
Guangan Jiang
School of Information and Communication Engineering, Dalian University of Technology, Dalian, China
Tianzi Zhang
School of Information Science and Technology, Fudan University, Shanghai, China
Dong Li
Faculty of Science and Technology, University of Macau, and also with WAYTOUS Inc., Beijing, China
Zhenjun Zhao
Universidad de Zaragoza
Computer Vision · 3D Vision · Robotics
Haoang Li
Assistant Professor, Hong Kong University of Science and Technology (Guangzhou)
Robotics · 3D Computer Vision
Mingrui Li
Dalian University of Technology
SLAM · 3D Vision · Robotics
Hongyu Wang
School of Information and Communication Engineering, Dalian University of Technology, Dalian, China