🤖 AI Summary
This paper addresses open-vocabulary semantic 3D mapping, tackling two core challenges: (i) dynamic determination of scene semantic granularity—i.e., “what constitutes an object”—and (ii) robust fusion of multi-view 2D semantic observations into compact, high-fidelity 3D reconstructions. To this end, the authors propose a task-driven mechanism that adapts semantic granularity to the task at hand, and introduce a Bayesian semantic fusion framework grounded in the properties of the underlying visual-language model (VLM)—replacing conventional embedding averaging. The method integrates a semantic Gaussian splatting representation, 3D Gaussian clustering, and object-centric extraction to jointly enable open-vocabulary semantic recognition and dense scene reconstruction. Experiments demonstrate substantial improvements in semantic consistency and geometric fidelity, with a 42% reduction in memory footprint compared to SemGS. The implementation is publicly available.
📝 Abstract
Open-set semantic mapping requires (i) determining the correct granularity to represent the scene (e.g., how objects should be defined), and (ii) fusing semantic knowledge across multiple 2D observations into an overall 3D reconstruction, ideally with high fidelity yet a low memory footprint. While most related works bypass the first issue by grouping together primitives with similar semantics (according to some manually tuned threshold), we recognize that the object granularity is task-dependent, and develop a task-driven semantic mapping approach. To address the second issue, current practice is to average visual embedding vectors over multiple views. Instead, we show the benefits of using a probabilistic approach based on the properties of the underlying visual-language foundation model, and of leveraging Bayesian updating to aggregate multiple observations of the scene. The result is Bayesian Fields, a task-driven and probabilistic approach for open-set semantic mapping. To enable high-fidelity objects and a dense scene representation, Bayesian Fields uses 3D Gaussians, which we cluster into task-relevant objects, allowing for both easy 3D object extraction and reduced memory usage. We release Bayesian Fields open-source at https://github.com/MIT-SPARK/Bayesian-Fields.
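To make the contrast between embedding averaging and Bayesian updating concrete, here is a minimal, hypothetical sketch of fusing per-view semantic evidence for a single 3D primitive. The softmax-likelihood model, the function names, and the toy logits are illustrative assumptions, not the paper's actual formulation: the Bayesian variant multiplies per-view class likelihoods (sums log-probabilities) and renormalizes, so a confident view is not diluted by an ambiguous one the way plain averaging dilutes it.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over class scores.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def bayesian_fuse(view_logits, prior=None):
    """Fuse per-view class evidence by Bayesian updating.

    view_logits: (num_views, num_classes) array of per-view scores
    (e.g., VLM embedding similarities against text queries).
    Multiplies per-view likelihoods (sums log-probs), then renormalizes.
    """
    view_logits = np.asarray(view_logits, dtype=float)
    log_post = np.log([softmax(l) for l in view_logits]).sum(axis=0)
    if prior is not None:
        log_post += np.log(prior)
    return softmax(log_post)  # posterior distribution over classes

# Two toy views of the same primitive: one confident, one ambiguous.
views = [[4.0, 1.0, 0.0],   # strongly favors class 0
         [1.2, 1.0, 1.1]]   # nearly uninformative

posterior = bayesian_fuse(views)
averaged = softmax(np.mean(views, axis=0))  # averaging-based analogue

# The Bayesian posterior keeps more of the confident view's certainty
# than averaging does.
print(posterior, averaged)
```

In this toy setup the ambiguous second view barely changes the Bayesian posterior, whereas averaging its logits with the confident view's pulls the fused distribution noticeably toward uniform.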