Rendering Multi-Human and Multi-Object with 3D Gaussian Splatting

📅 2026-04-03
🤖 AI Summary
Reconstructing dynamic scenes involving multiple people and objects from sparse viewpoints is difficult because of severe mutual occlusion and the combinatorial dependencies created by interactions. This work proposes MM-GS, a hierarchical framework built on 3D Gaussian Splatting for this setting. It combines a Per-Instance Multi-View Fusion module, which aggregates information across all available views to give each instance a consistent representation, with a Scene-Level Instance Interaction module, which reasons about relationships among all participants on a global scene graph and refines their attributes. On the challenging datasets evaluated, MM-GS is reported to significantly outperform strong baselines, with high-fidelity details and plausible inter-instance contacts.
📝 Abstract
Reconstructing dynamic scenes with multiple interacting humans and objects from sparse-view inputs is a critical yet challenging task, essential for creating high-fidelity digital twins for robotics and VR/AR. This problem, which we term Multi-Human Multi-Object (MHMO) rendering, presents two significant obstacles: achieving view-consistent representations for individual instances under severe mutual occlusion, and explicitly modeling the complex and combinatorial dependencies that arise from their interactions. To overcome these challenges, we propose MM-GS, a novel hierarchical framework built upon 3D Gaussian Splatting. Our method first employs a Per-Instance Multi-View Fusion module to establish a robust and consistent representation for each instance by aggregating visual information across all available views. Subsequently, a Scene-Level Instance Interaction module operates on a global scene graph to reason about relationships between all participants, refining their attributes to capture subtle interaction effects. Extensive experiments on challenging datasets demonstrate that our method significantly outperforms strong baselines, producing state-of-the-art results with high-fidelity details and plausible inter-instance contacts.
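The two-stage hierarchy described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how views are aggregated or how the scene graph is processed, so the visibility-weighted mean in `fuse_views`, the fully connected graph and mean-aggregation message passing in `interact`, the blend weight `alpha`, and all names and feature values below are assumptions chosen only to make the data flow concrete.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def fuse_views(view_feats: Sequence[Vector], visibility: Sequence[float]) -> Vector:
    """Stage 1 (assumed form): visibility-weighted mean of one instance's
    per-view features, so occluded views contribute less or nothing."""
    total = sum(visibility)
    if total <= 0:
        raise ValueError("instance is not visible in any view")
    dim = len(view_feats[0])
    return [
        sum(w * f[d] for f, w in zip(view_feats, visibility)) / total
        for d in range(dim)
    ]


def interact(instance_feats: Dict[str, Vector], alpha: float = 0.25) -> Dict[str, Vector]:
    """Stage 2 (assumed form): one round of message passing on a fully
    connected instance graph. Each instance blends its own feature with
    the mean feature of all other instances, weighted by alpha."""
    refined: Dict[str, Vector] = {}
    for name, feat in instance_feats.items():
        others = [f for n, f in instance_feats.items() if n != name]
        if not others:
            refined[name] = list(feat)  # a lone instance has nothing to interact with
            continue
        message = [sum(col) / len(others) for col in zip(*others)]
        refined[name] = [(1 - alpha) * x + alpha * m for x, m in zip(feat, message)]
    return refined


# Hypothetical toy input: two instances, two views, 2-D features.
# Each entry is (per-view features, per-view visibility weights).
per_view = {
    "human_0": ([[1.0, 0.0], [3.0, 0.0]], [1.0, 1.0]),
    "object_0": ([[0.0, 2.0], [0.0, 8.0]], [1.0, 0.0]),  # fully occluded in the second view
}

fused = {name: fuse_views(feats, vis) for name, (feats, vis) in per_view.items()}
refined = interact(fused)
```

In this toy setup the occluded view of `object_0` is ignored during fusion, and the interaction step then pulls each instance's feature toward the other's. A real system would operate on per-Gaussian attributes and learned modules rather than hand-set vectors.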
Problem

Research questions and friction points this paper is trying to address.

Multi-Human Multi-Object rendering
dynamic scene reconstruction
mutual occlusion
instance interaction
sparse-view inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D Gaussian Splatting
multi-human rendering
instance interaction modeling
sparse-view reconstruction
scene graph reasoning
Weiquan Wang
Chinese University of Hong Kong
Human-AI/Algorithm Interaction, Information Privacy, Recommendation Systems, Social Media
Jun Xiao
State Key Lab of CAD&CG, College of Computer Science, Zhejiang University, Hangzhou 310027, China
Feifei Shao
Zhejiang University
Machine learning, computer vision, weakly supervised learning, active learning
Yi Yang
Zhejiang University
multimedia, computer vision, machine learning
Yueting Zhuang
State Key Lab of CAD&CG, College of Computer Science, Zhejiang University, Hangzhou 310027, China
Long Chen
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong