🤖 AI Summary
Problem: Existing 3D scene graph prediction methods predominantly adopt object-centric graph neural networks (GNNs), which struggle to capture high-order relational dependencies among entities.
Method: This paper proposes a relation-centric inference paradigm. It transforms the original object-centric graph into a line graph, in which relations become nodes and higher-order associations between relations become edges, and applies an edge-centric line graph neural network to reason over it. A link-guided mechanism suppresses noisy relations, and the resulting relation-level context is progressively fused into object-level understanding. The framework is modular and can be combined with existing object-centric baselines. Key components include line graph construction, object-aware feature fusion, link prediction, and multi-granularity message passing.
Results: On the 3DSSG benchmark, the method consistently improves on two strong baselines across all metrics, supporting both the effectiveness and the generality of the relation-to-object reasoning paradigm.
📝 Abstract
3D scene graph prediction aims to abstract complex 3D environments into structured graphs consisting of objects and their pairwise relationships. Existing approaches typically adopt object-centric graph neural networks, where relation edge features are iteratively updated by aggregating messages from connected object nodes. However, this design inherently restricts relation representations to pairwise object context, making it difficult to capture high-order relational dependencies that are essential for accurate relation prediction. To address this limitation, we propose a Link-guided Edge-centric relational reasoning framework with Object-aware fusion, namely LEO, which enables progressive reasoning from relation-level context to object-level understanding. Specifically, LEO first predicts potential links between object pairs to suppress irrelevant edges, and then transforms the original scene graph into a line graph where each relation is treated as a node. A line graph neural network is applied to perform edge-centric relational reasoning to capture inter-relation context. The enriched relation features are subsequently integrated into the original object-centric graph to enhance object-level reasoning and improve relation prediction. Our framework is model-agnostic and can be integrated with any existing object-centric method. Experiments on the 3DSSG dataset with two competitive baselines show consistent improvements, highlighting the effectiveness of our edge-to-object reasoning paradigm.
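The link-guided pruning and line graph transformation described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the rule that two relations are adjacent when they share an object, and the mean-aggregation update (a stand-in for a learned line graph neural network layer) are all assumptions made for illustration. The object-aware fusion step back into the object-centric graph is omitted.

```python
from collections import defaultdict


def prune_edges(edges, link_scores, threshold=0.5):
    """Keep only object pairs whose predicted link score reaches the threshold.

    The threshold value is an illustrative assumption.
    """
    return [e for e in edges if link_scores.get(e, 0.0) >= threshold]


def build_line_graph(edges):
    """Turn each relation (subject, object) into a node of the line graph.

    Two relation nodes are connected when they share an object
    (an assumed adjacency rule, the standard line graph definition).
    """
    incident = defaultdict(list)  # object id -> indices of relations touching it
    for idx, (subj, obj) in enumerate(edges):
        incident[subj].append(idx)
        incident[obj].append(idx)

    neighbors = {idx: set() for idx in range(len(edges))}
    for rel_ids in incident.values():
        for i in rel_ids:
            for j in rel_ids:
                if i != j:
                    neighbors[i].add(j)
    return neighbors


def propagate(features, neighbors):
    """One round of mean aggregation over line graph neighbors.

    Each relation feature is averaged with the mean of its neighbors;
    a learned GNN layer would replace this fixed rule.
    """
    updated = {}
    for i, feat in features.items():
        nbrs = neighbors[i]
        if not nbrs:
            updated[i] = list(feat)
            continue
        mean = [sum(features[j][d] for j in nbrs) / len(nbrs)
                for d in range(len(feat))]
        updated[i] = [0.5 * a + 0.5 * b for a, b in zip(feat, mean)]
    return updated


# Hypothetical scene: objects 0..3 with four candidate relations.
candidates = [(0, 1), (1, 2), (3, 2), (0, 3)]
scores = {(0, 1): 0.9, (1, 2): 0.8, (3, 2): 0.7, (0, 3): 0.1}

kept = prune_edges(candidates, scores)   # drops the low-scoring pair (0, 3)
line_graph = build_line_graph(kept)      # relations become nodes
rel_feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
updated = propagate(rel_feats, line_graph)
```

In this toy scene the relation (1, 2) shares an object with both other surviving relations, so its updated feature mixes context from both of them, which is the kind of inter-relation context that purely pairwise object-centric updates do not provide directly.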