🤖 AI Summary
This work addresses the challenge of real-time, queryable open-vocabulary semantic 3D reconstruction. Methodologically, it introduces the first feed-forward semantic Gaussian splatting framework, pioneering the integration of open-set semantic segmentation into the 3D Gaussian splatting pipeline. Multi-view vision-language features extracted by 2D foundation models are distilled into a compact semantic memory bank, enabling geometry, appearance, and an open-vocabulary semantic index for each Gaussian ellipsoid to be predicted jointly in a single forward pass, without scene-level optimization. Compared with existing approaches, the method achieves state-of-the-art geometric fidelity while delivering robust pixel-level open-vocabulary semantic labeling. This significantly enhances semantic queryability and generalization to unseen categories in 3D scenes, establishing an efficient and scalable semantic 3D foundation for applications such as robotic interaction and augmented reality.
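The memory-bank mechanism described above can be sketched in a minimal form. The snippet below is an illustrative assumption, not the paper's implementation: it builds a compact bank by greedily deduplicating L2-normalized vision-language features by cosine similarity, assigns each feature (standing in for a Gaussian's semantic attribute) a discrete index into the bank, and answers an open-vocabulary query by matching a text embedding against the bank. The function names and the similarity threshold are hypothetical.

```python
import numpy as np

def build_memory_bank(features, sim_thresh=0.9):
    """Greedily construct a compact semantic memory bank.

    features: (N, D) array of L2-normalized per-view semantic features.
    Returns (bank, indices): bank is (K, D) with K << N; indices maps
    each input feature to its discrete bank entry.
    """
    bank = []
    indices = np.empty(len(features), dtype=np.int64)
    for i, f in enumerate(features):
        if bank:
            sims = np.stack(bank) @ f  # cosine similarities (unit vectors)
            j = int(np.argmax(sims))
            if sims[j] >= sim_thresh:  # close enough: reuse existing entry
                indices[i] = j
                continue
        bank.append(f)                 # novel semantics: grow the bank
        indices[i] = len(bank) - 1
    return np.stack(bank), indices

def query_bank(bank, text_embedding):
    """Return the bank index best matching an open-vocabulary text query
    (text_embedding assumed L2-normalized, e.g. from a CLIP-style encoder)."""
    return int(np.argmax(bank @ text_embedding))
```

Because each Gaussian stores only a small integer index rather than a full feature vector, the per-scene semantic payload stays compact while remaining queryable against arbitrary text embeddings.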
📝 Abstract
We introduce SegSplat, a novel framework designed to bridge the gap between rapid, feed-forward 3D reconstruction and rich, open-vocabulary semantic understanding. By constructing a compact semantic memory bank from multi-view 2D foundation model features and predicting discrete semantic indices alongside geometric and appearance attributes for each 3D Gaussian in a single pass, SegSplat efficiently imbues scenes with queryable semantics. Our experiments demonstrate that SegSplat achieves geometric fidelity comparable to state-of-the-art feed-forward 3D Gaussian Splatting methods while simultaneously enabling robust open-set semantic segmentation, crucially *without* requiring any per-scene optimization for semantic feature integration. This work represents a significant step towards practical, on-the-fly generation of semantically aware 3D environments, vital for advancing robotic interaction, augmented reality, and other intelligent systems.