Inst4DGS: Instance-Decomposed 4D Gaussian Splatting with Multi-Video Label Permutation Learning

📅 2026-03-18
🤖 AI Summary
This work addresses dynamic scene decomposition and long-term identity tracking in multi-view video, where independently segmented videos carry inconsistent instance labels. It proposes a cross-video instance matching framework that couples per-video latent label-permutation variables with a differentiable Sinkhorn layer, along with instance-decomposed motion scaffolds that provide low-dimensional motion bases for optimizing long-horizon 4D Gaussian trajectories. The approach yields instance-level 4D Gaussian Splatting reconstructions with temporally stable identities and no identity drift. On the Panoptic Studio dataset, the method attains a PSNR of 28.36 (+2.26 over the strongest baseline) and an instance mIoU of 0.9129 (+0.2819), and it is also evaluated on Neural3DV.

📝 Abstract
We present Inst4DGS, an instance-decomposed 4D Gaussian Splatting (4DGS) approach with long-horizon per-Gaussian trajectories. While dynamic 4DGS has advanced rapidly, instance-decomposed 4DGS remains underexplored, largely due to the difficulty of associating inconsistent instance labels across independently segmented multi-view videos. We address this challenge by introducing per-video label-permutation latents that learn cross-video instance matches through a differentiable Sinkhorn layer, enabling direct multi-view supervision with consistent identity preservation. This explicit label alignment yields sharp decision boundaries and temporally stable identities without identity drift. To further improve efficiency, we propose instance-decomposed motion scaffolds that provide low-dimensional motion bases per object for long-horizon trajectory optimization. Experiments on Panoptic Studio and Neural3DV show that Inst4DGS jointly supports tracking and instance decomposition while achieving state-of-the-art rendering and segmentation quality. On the Panoptic Studio dataset, Inst4DGS improves PSNR from 26.10 to 28.36, and instance mIoU from 0.6310 to 0.9129, over the strongest baseline.
Problem

Research questions and friction points this paper is trying to address.

instance decomposition
4D Gaussian Splatting
multi-view video
label inconsistency
identity association
Innovation

Methods, ideas, or system contributions that make the work stand out.

Instance-Decomposed 4D Gaussian Splatting
Label Permutation Learning
Differentiable Sinkhorn Layer
Motion Scaffolds
Multi-View Identity Alignment