OP2GS: Object-Aware 3D Gaussian Splatting with Dual-Opacity Primitives

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing 3D Gaussian Splatting methods lack explicit object-level identity, which limits their use in tasks such as open-vocabulary scene understanding. This work proposes a dual-opacity representation that gives each Gaussian primitive an explicit instance identity and a dedicated instance opacity, decoupling visual appearance from instance occupancy so that one opacity serves image reconstruction and the other serves object-mask rendering. A random object loss trains the instance occupancy, so that mislabeled Gaussians can become transparent in the mask branch, which mitigates label contamination. Semantic descriptors are attached per object through multi-view aggregation rather than stored per primitive. The method reports open-vocabulary performance competitive with feature-training approaches at significantly lower computational overhead and, compared with training-free pipelines, uses physically consistent occupancy learning to resolve visibility ambiguities.

📝 Abstract

3D Gaussian Splatting (3DGS) provides an explicit and efficient scene representation, but its primitives lack inherent object-level identity, hindering downstream tasks such as open-vocabulary scene understanding. Existing methods typically address this by either distilling high-dimensional feature embeddings into Gaussians or by lifting 2D mask labels into 3D via heuristic refinement. However, feature-based approaches incur heavy storage and decoding overhead, while lifting-based pipelines remain vulnerable to label contamination: Gaussians necessary for appearance reconstruction often receive incorrect object labels during 2D-to-3D projection. We propose OP2GS, an object-aware Gaussian representation that augments each primitive with an explicit instance identity and a dedicated instance opacity $σ^{*}$ for object-mask rendering. The original opacity $σ$ remains responsible for visual reconstruction, while $σ^{*}$ models whether a Gaussian should contribute to a particular object mask. This dual-opacity formulation decouples visual existence from instance occupancy: mislabeled Gaussians can remain available for image rendering while becoming transparent in the object-mask branch. To learn this representation, we introduce a random object loss that optimizes the 1D instance occupancy field using the standard transmittance-based visibility of 3DGS. Semantic descriptors are then attached at the object level through multi-view aggregation, eliminating per-Gaussian feature storage. Compared with feature-training approaches, OP2GS achieves competitive open-vocabulary performance while significantly reducing computational overhead. Compared with training-free pipelines, it leverages physically consistent occupancy learning to resolve visibility ambiguities.
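The dual-opacity idea described above can be illustrated with a minimal single-pixel sketch. This is not the authors' implementation: the function name `composite_pixel`, its arguments, and the choice to give the mask branch its own transmittance computed from the instance opacity are all assumptions made for illustration. The abstract only states that $σ$ drives visual reconstruction, that $σ^{*}$ drives object-mask rendering, and that the standard transmittance-based visibility of 3DGS is used.

```python
import numpy as np

def composite_pixel(alpha_vis, alpha_inst, colors, instance_ids, target_id):
    """Front-to-back compositing of depth-sorted Gaussians at one pixel.

    Hypothetical inputs (assumptions, not taken from the paper):
      alpha_vis    per-Gaussian alpha derived from the visual opacity (sigma)
      alpha_inst   per-Gaussian alpha derived from the instance opacity (sigma*)
      colors       per-Gaussian RGB, shape (N, 3)
      instance_ids per-Gaussian integer instance identity
      target_id    the object whose mask value is rendered
    """
    alpha_vis = np.asarray(alpha_vis, dtype=float)
    alpha_inst = np.asarray(alpha_inst, dtype=float)
    colors = np.asarray(colors, dtype=float)
    instance_ids = np.asarray(instance_ids)

    def transmittance(a):
        # T_i = prod_{j < i} (1 - a_j), with T_0 = 1
        return np.concatenate(([1.0], np.cumprod(1.0 - a)[:-1]))

    # Image branch: standard 3DGS blending with the visual opacity.
    w_vis = transmittance(alpha_vis) * alpha_vis
    color = (w_vis[:, None] * colors).sum(axis=0)

    # Mask branch (assumed form): same blending rule, but with the instance
    # opacity, summed only over Gaussians labeled as the target object.
    w_inst = transmittance(alpha_inst) * alpha_inst
    mask = float(w_inst[instance_ids == target_id].sum())
    return color, mask

# Three red Gaussians; the front one carries a wrong label (id 2).
# Its instance alpha is 0, so it still contributes color but neither
# adds to nor occludes the mask of object 1.
color, mask = composite_pixel(
    alpha_vis=[0.5, 0.5, 0.5],
    alpha_inst=[0.0, 0.5, 0.5],
    colors=[[1.0, 0.0, 0.0]] * 3,
    instance_ids=[2, 1, 1],
    target_id=1,
)
print(color, mask)  # color = [0.875, 0, 0], mask = 0.75
```

In this toy case the mislabeled front Gaussian still accounts for half of the rendered color, yet the mask of object 1 is unaffected by it, which is the decoupling of visual existence from instance occupancy that the abstract describes.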

Problem

Research questions and friction points this paper is trying to address.

3D Gaussian Splatting

object-aware representation

open-vocabulary scene understanding

label contamination

instance identity

Innovation

Methods, ideas, or system contributions that make the work stand out.

dual-opacity

object-aware representation

3D Gaussian Splatting