CoPa-SG: Dense Scene Graphs with Parametric and Proto-Relations

📅 2025-06-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing 2D scene graph research is hindered by sparse, coarse-grained (binary, discrete) relationship annotations in real-world data and insufficient modeling of latent relationships. To address this, we propose CoPa-SG—the first high-fidelity synthetic scene graph dataset that comprehensively covers all object pairs, incorporates fine-grained parametric relations (e.g., relative angle, distance), and introduces prototype relations encoding hypothetical associations triggered by novel objects. Methodologically, we integrate geometric priors with differentiable rendering to generate dense, pixel-accurate annotations, and design a structured, differentiable relational representation paradigm. This framework transcends conventional discrete relation modeling, substantially enhancing expressive power and prospective reasoning capability. Evaluations across multiple vision-language models demonstrate that our novel relation types improve mean Recall@100 for relationship prediction in visual-language navigation and robotic planning tasks by 12.3%. Moreover, CoPa-SG enables downstream integration with causal reasoning and task planning frameworks.

Technology Category

Application Category

📝 Abstract
2D scene graphs provide a structural and explainable framework for scene understanding. However, current work still struggles with the lack of accurate scene graph data. To overcome this data bottleneck, we present CoPa-SG, a synthetic scene graph dataset with highly precise ground truth and exhaustive relation annotations between all objects. Moreover, we introduce parametric and proto-relations, two new fundamental concepts for scene graphs. The former provides a much more fine-grained representation than its traditional counterpart by enriching relations with additional parameters such as angles or distances. The latter encodes hypothetical relations in a scene graph and describes how relations would form if new objects are placed in the scene. Using CoPa-SG, we compare the performance of various scene graph generation models. We demonstrate how our new relation types can be integrated in downstream applications to enhance planning and reasoning capabilities.
Problem

Research questions and friction points this paper is trying to address.

Lack of accurate scene graph data for understanding
Need for fine-grained and hypothetical relation representations
Improving scene graph models for planning and reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic dataset with precise ground truth
Parametric relations for fine-grained representation
Proto-relations for hypothetical relation encoding
🔎 Similar Papers
No similar papers found.