🤖 AI Summary
Existing evaluations of 3D open-vocabulary scene representations rely on closed-set semantics that fail to capture linguistic diversity and real-world scene complexity. To address this, the authors propose OpenLex3D, a dedicated benchmark for open-vocabulary 3D scene understanding that provides entirely new label annotations for 23 real-world scenes drawn from three existing datasets (Replica, ScanNet++, and HM3D). The annotations introduce synonym-rich object categories and additional nuanced natural language descriptions, and the benchmark defines two tasks: open-set 3D semantic segmentation and object retrieval. Benchmarking a range of state-of-the-art methods exposes limitations in feature precision, segmentation consistency, and downstream generalization. OpenLex3D is publicly released to support standardized, rigorous evaluation of 3D vision-language understanding.
📝 Abstract
3D scene understanding has been transformed by open-vocabulary language models that enable interaction via natural language. However, the evaluation of these representations is limited to closed-set semantics that do not capture the richness of language. This work presents OpenLex3D, a dedicated benchmark to evaluate 3D open-vocabulary scene representations. OpenLex3D provides entirely new label annotations for 23 scenes from Replica, ScanNet++, and HM3D, which capture real-world linguistic variability by introducing synonymous object categories and additional nuanced descriptions. By introducing an open-set 3D semantic segmentation task and an object retrieval task, we provide insights into feature precision, segmentation, and downstream capabilities. We evaluate various existing 3D open-vocabulary methods on OpenLex3D, showcasing failure cases and avenues for improvement. The benchmark is publicly available at: https://openlex3d.github.io/.
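To make the synonym-aware evaluation idea concrete, below is a minimal sketch (not the benchmark's actual scoring code) of how open-vocabulary predictions might be matched against synonym-rich ground truth: each object's feature vector is compared against CLIP-style text embeddings of candidate prompts, and a prediction counts as correct if its top-scoring prompt falls anywhere in the ground-truth synonym set. All names here (`encode_text`, `label_objects`, `synonym_accuracy`) are hypothetical illustrations, not OpenLex3D APIs.

```python
# Minimal sketch of synonym-aware open-vocabulary labeling.
# Assumptions (not from the OpenLex3D paper): object features are
# CLIP-style embeddings, and a prediction is correct when its best
# matching prompt belongs to the ground-truth synonym set.
import numpy as np

def encode_text(prompts):
    # Hypothetical stand-in for a real CLIP text encoder: returns
    # unit-norm random vectors so the sketch is self-contained.
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(len(prompts), 512))
    return feats / np.linalg.norm(feats, axis=1, keepdims=True)

def label_objects(object_features, prompts):
    """Assign each object the prompt with the highest cosine similarity."""
    text_feats = encode_text(prompts)
    obj = object_features / np.linalg.norm(object_features, axis=1, keepdims=True)
    sims = obj @ text_feats.T            # (num_objects, num_prompts)
    return sims.argmax(axis=1)           # best prompt index per object

def synonym_accuracy(pred_idx, prompts, gt_synonym_sets):
    """Count a prediction correct if it lands anywhere in the synonym set."""
    hits = [prompts[p] in syns for p, syns in zip(pred_idx, gt_synonym_sets)]
    return float(np.mean(hits))
```

Scoring against a set of acceptable labels, rather than a single canonical one, is what distinguishes this style of evaluation from conventional closed-set segmentation metrics; the paper's actual protocol and label categories are defined on the project page linked above.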