🤖 AI Summary
Existing human-centric volumetric video methods are largely confined to replaying dynamic scenes or animating characters, and lack high-fidelity reenactment capability for general dynamic scenes. To address this, we propose the first human-centric volumetric video framework enabling unified "replay → reenactment" modeling. Our approach introduces a hierarchical, disentangled Gaussian representation for motion and appearance, augmented by a semantic-aware alignment module and a deformation-transfer-based motion retargeting mechanism. By integrating Gaussian splatting, Morton encoding, a 2D position-to-attribute mapping CNN, and canonical-space modeling, our method achieves efficient multi-view reconstruction and photorealistic novel-pose rendering. Extensive evaluations on standard benchmarks demonstrate comprehensive superiority over state-of-the-art methods in reconstruction accuracy, reenactment fidelity, and generalization.
Abstract
Human-centric volumetric videos offer immersive free-viewpoint experiences, yet existing methods focus either on replaying general dynamic scenes or on animating human avatars, limiting their ability to re-perform such dynamic scenes. In this paper, we present RePerformer, a novel Gaussian-based representation that unifies playback and re-performance for high-fidelity human-centric volumetric videos. Specifically, we hierarchically disentangle dynamic scenes into motion Gaussians and appearance Gaussians, which are associated in the canonical space. We further employ a Morton-based parameterization to efficiently encode the appearance Gaussians into 2D position and attribute maps. For enhanced generalization, we adopt 2D CNNs to map position maps to attribute maps, which can be assembled into appearance Gaussians for high-fidelity rendering of the dynamic scenes. For re-performance, we develop a semantic-aware alignment module and apply deformation transfer to the motion Gaussians, enabling photorealistic rendering under novel motions. Extensive experiments validate the robustness and effectiveness of RePerformer, setting a new benchmark for the playback-then-re-performance paradigm in human-centric volumetric videos.
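The Morton-based parameterization mentioned above can be illustrated with a short sketch. This is not the authors' implementation; it is a minimal, self-contained example of the underlying idea: quantize each 3D Gaussian position to a voxel grid (the grid resolution here is an assumption), interleave the coordinate bits into a Morton (Z-order) key, and sort by that key so that spatially nearby Gaussians land near each other in the 1D ordering, and hence in the unrolled 2D position/attribute maps.

```python
def part1by2(n: int) -> int:
    """Spread the bits of a 10-bit integer so they occupy every 3rd bit."""
    n &= 0x3FF
    n = (n | (n << 16)) & 0x030000FF
    n = (n | (n << 8)) & 0x0300F00F
    n = (n | (n << 4)) & 0x030C30C3
    n = (n | (n << 2)) & 0x09249249
    return n

def morton3d(x: int, y: int, z: int) -> int:
    """Interleave three 10-bit grid coordinates into one Z-order key."""
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)

def sort_points_by_morton(points, grid: int = 1024):
    """Quantize points in [0, 1]^3 to a grid and sort by Morton key.
    Nearby points in 3D end up nearby in the resulting 1D order."""
    def key(p):
        x, y, z = (min(int(c * grid), grid - 1) for c in p)
        return morton3d(x, y, z)
    return sorted(points, key=key)

# Example: the origin-adjacent point sorts before the far corner.
ordered = sort_points_by_morton([(0.9, 0.9, 0.9), (0.0, 0.0, 0.0)])
print(ordered[0])  # (0.0, 0.0, 0.0)
```

Sorting by such a space-filling curve preserves spatial locality when the 1D sequence is reshaped into a 2D map, which is what makes the maps amenable to processing with ordinary 2D CNNs.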