OP2GS: Object-Aware 3D Gaussian Splatting with Dual-Opacity Primitives

📅 2026-05-19

🤖 AI Summary
Existing 3D Gaussian splatting methods lack explicit object-level identity, which limits their use in tasks such as open-vocabulary scene understanding. This work proposes a dual-opacity mechanism that gives each Gaussian primitive an explicit instance identity and a dedicated instance opacity, decoupling visual appearance from instance occupancy so that image reconstruction and object-mask rendering each use their own opacity. Gaussians that receive incorrect labels (label contamination) can therefore stay visible in the image while becoming transparent in the object mask. A random object loss is introduced to learn the instance occupancy, and semantic descriptors are attached at the object level through multi-view aggregation rather than stored per primitive. The method achieves open-vocabulary performance competitive with feature-training approaches while significantly reducing computational overhead, and, compared with training-free pipelines, it resolves visibility ambiguities through physically consistent occupancy learning.
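To make the dual-opacity idea concrete, here is a minimal single-ray sketch in NumPy. It is not the authors' implementation. It assumes that the object-mask branch reuses the usual front-to-back compositing with the instance opacity substituted for the visual opacity, which is one plausible reading of the summary. The function names (`transmittance`, `render_color`, `render_object_mask`) are hypothetical, and the per-ray opacities are assumed to already include each Gaussian's 2D footprint weight.

```python
import numpy as np

def transmittance(alpha):
    """T_i = prod_{j<i} (1 - alpha_j) for depth-sorted per-ray opacities."""
    alpha = np.asarray(alpha, dtype=float)
    # Shift the cumulative product by one so the first Gaussian sees T = 1.
    return np.concatenate(([1.0], np.cumprod(1.0 - alpha)[:-1]))

def render_color(colors, sigma):
    """Image branch: standard alpha compositing with the visual opacity."""
    colors = np.asarray(colors, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    weights = transmittance(sigma) * sigma
    return weights @ colors

def render_object_mask(instance_ids, sigma_star, target_id):
    """Mask branch (assumed form): same compositing, but driven by the
    instance opacity, keeping only Gaussians labeled with target_id."""
    sigma_star = np.asarray(sigma_star, dtype=float)
    weights = transmittance(sigma_star) * sigma_star
    belongs = np.asarray(instance_ids) == target_id
    return float(np.sum(weights * belongs))
```

For example, with three depth-sorted Gaussians whose labels are `[2, 1, 1]`, where the front one carries a contaminated label, setting instance opacities to `[0.0, 0.8, 0.9]` leaves the front Gaussian fully available to `render_color` while it contributes nothing to, and does not occlude, the mask of object 1.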
📝 Abstract
3D Gaussian Splatting (3DGS) provides an explicit and efficient scene representation, but its primitives lack inherent object-level identity, hindering downstream tasks such as open-vocabulary scene understanding. Existing methods typically address this by either distilling high-dimensional feature embeddings into Gaussians or by lifting 2D mask labels into 3D via heuristic refinement. However, feature-based approaches incur heavy storage and decoding overhead, while lifting-based pipelines remain vulnerable to label contamination: Gaussians necessary for appearance reconstruction often receive incorrect object labels during 2D-to-3D projection. We propose OP2GS, an object-aware Gaussian representation that augments each primitive with an explicit instance identity and a dedicated instance opacity $σ^{*}$ for object-mask rendering. The original opacity $σ$ remains responsible for visual reconstruction, while $σ^{*}$ models whether a Gaussian should contribute to a particular object mask. This dual-opacity formulation decouples visual existence from instance occupancy: mislabeled Gaussians can remain available for image rendering while becoming transparent in the object-mask branch. To learn this representation, we introduce a random object loss that optimizes the 1D instance occupancy field using the standard transmittance-based visibility of 3DGS. Semantic descriptors are then attached at the object level through multi-view aggregation, eliminating per-Gaussian feature storage. Compared with feature-training approaches, OP2GS achieves competitive open-vocabulary performance while significantly reducing computational overhead. Compared with training-free pipelines, it leverages physically consistent occupancy learning to resolve visibility ambiguities.
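The object-level descriptor step can be illustrated with a short NumPy sketch. The abstract does not specify the aggregation rule, so this example assumes a weighted mean of per-view descriptors (weighted, for instance, by visible mask area) followed by L2 normalization, and a cosine-similarity text query. The function names and the choice of weights are hypothetical, and the descriptors are assumed to come from some external vision-language encoder that is not shown here.

```python
import numpy as np

def aggregate_object_descriptors(view_features, view_object_ids,
                                 view_weights, num_objects):
    """Fuse per-view descriptors into one unit-norm descriptor per object.

    view_features:   (M, D) one descriptor per (view, object) observation
    view_object_ids: (M,)   integer instance id of each observation
    view_weights:    (M,)   assumed reliability, e.g. visible mask area
    """
    feats = np.asarray(view_features, dtype=float)
    ids = np.asarray(view_object_ids)
    w = np.asarray(view_weights, dtype=float)
    out = np.zeros((num_objects, feats.shape[1]))
    # Unbuffered scatter-add so repeated ids accumulate correctly.
    np.add.at(out, ids, feats * w[:, None])
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-12)

def query_objects(object_descriptors, text_embedding):
    """Cosine similarity between each object descriptor and a text query."""
    q = np.asarray(text_embedding, dtype=float)
    q = q / max(np.linalg.norm(q), 1e-12)
    return object_descriptors @ q
```

Storage in this sketch scales with the number of objects rather than the number of Gaussians, which is the property the abstract highlights; each Gaussian only needs its instance identity to look up the shared descriptor.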
Problem

Research questions and friction points this paper is trying to address.

3D Gaussian Splatting
object-aware representation
open-vocabulary scene understanding
label contamination
instance identity
Innovation

Methods, ideas, or system contributions that make the work stand out.

dual-opacity
object-aware representation
3D Gaussian Splatting
instance occupancy
open-vocabulary scene understanding