RigGS: Rigging of 3D Gaussians for Modeling Articulated Objects in Videos

📅 2025-03-21
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses template-free 3D modeling of articulated objects from monocular video, targeting high-fidelity novel-view synthesis, editability, and controllable animation. The authors propose a skeleton-guided differentiable 3D Gaussian deformation framework: (i) a skeleton-aware node control mechanism that automatically extracts a sparse skeletal structure, coupling motion and semantics, directly from the Gaussian field; and (ii) learnable skinning weights combined with a pose-dependent neural deformation module that jointly optimize geometry and appearance. The method requires no manual initialization or category-specific priors. Evaluated on diverse articulated-object videos, it significantly outperforms existing template-free approaches and enables real-time pose retargeting, motion transfer, and interactive editing while preserving photorealistic novel-view rendering.
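The node-controlled deformation described above can be sketched as embedded-deformation-style blending: each canonical Gaussian center is moved by a distance-weighted combination of per-node rigid transforms. This is a minimal illustration, not the paper's exact parameterization; the function name, the RBF weighting, and the `sigma` bandwidth are assumptions for the sketch.

```python
import numpy as np

def node_controlled_deform(points, nodes, node_R, node_t, sigma=0.1):
    """Deform Gaussian centers by blending per-node rigid transforms.

    points : (P, 3) canonical Gaussian centers
    nodes  : (N, 3) control-node positions in canonical space
    node_R : (N, 3, 3) per-node rotations for the current frame
    node_t : (N, 3)    per-node translations for the current frame
    sigma  : RBF bandwidth controlling each node's region of influence
    """
    # Distance-based blend weights (Gaussian RBF), normalized per point.
    d2 = ((points[:, None, :] - nodes[None, :, :]) ** 2).sum(-1)   # (P, N)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum(-1, keepdims=True)

    # Each node transforms a point rigidly about the node's own position.
    local = points[:, None, :] - nodes[None, :, :]                  # (P, N, 3)
    moved = np.einsum('nij,pnj->pni', node_R, local) + nodes + node_t

    # Blend the per-node candidates into one deformed position per point.
    return (w[..., None] * moved).sum(1)                            # (P, 3)
```

With identity rotations and zero translations every per-node candidate coincides with the input point, so the blend returns the canonical centers unchanged, which is a quick sanity check for an implementation like this.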


๐Ÿ“ Abstract
This paper considers the problem of modeling articulated objects captured in 2D videos to enable novel view synthesis, while also being easily editable, drivable, and re-posable. To tackle this challenging problem, we propose RigGS, a new paradigm that leverages 3D Gaussian representation and skeleton-based motion representation to model dynamic objects without utilizing additional template priors. Specifically, we first propose skeleton-aware node-controlled deformation, which deforms a canonical 3D Gaussian representation over time to initialize the modeling process, producing candidate skeleton nodes that are further simplified into a sparse 3D skeleton according to their motion and semantic information. Subsequently, based on the resulting skeleton, we design learnable skin deformations and pose-dependent detailed deformations, thereby easily deforming the 3D Gaussian representation to generate new actions and render further high-quality images from novel views. Extensive experiments demonstrate that our method can generate realistic new actions easily for objects and achieve high-quality rendering.
Problem

Research questions and friction points this paper is trying to address.

Modeling articulated objects in 2D videos for novel view synthesis
Enabling editable, drivable, and re-posable 3D Gaussian representations
Generating realistic new actions without template priors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Skeleton-aware node-controlled deformation for initialization
Learnable skin deformations for new actions
Pose-dependent detailed deformations for rendering
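The learnable skin deformation listed above can be illustrated as linear blend skinning over the extracted skeleton, with per-point skinning weights produced by a softmax over learnable logits. This is a hedged sketch under assumed shapes; the names `skin_deform` and `weight_logits` are illustrative, and the paper's pose-dependent detail deformation (a separate learned residual) is omitted here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax used to normalize skinning weights.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def skin_deform(points, weight_logits, bone_R, bone_t):
    """Linear blend skinning with learnable per-point weights.

    points        : (P, 3) canonical Gaussian centers
    weight_logits : (P, B) learnable logits; softmax gives skinning weights
    bone_R        : (B, 3, 3) per-bone rotations for the target pose
    bone_t        : (B, 3)    per-bone translations for the target pose
    """
    w = softmax(weight_logits)                                      # (P, B)
    # Candidate position of every point under every bone's transform.
    per_bone = np.einsum('bij,pj->pbi', bone_R, points) + bone_t    # (P, B, 3)
    # Convex combination of per-bone candidates.
    return np.einsum('pb,pbi->pi', w, per_bone)
```

Because the weights are a softmax over logits, they stay positive and sum to one, so gradients from a rendering loss can update them directly without extra normalization constraints.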