🤖 AI Summary
Existing Transformer-based point cloud instance segmentation methods primarily model external relationships between scenes and queries, neglecting both the internal structural characteristics of scene features and intrinsic correlations among queries. To address this, we propose a unified framework that jointly models internal and external relationships. First, we design an adaptive superpoint aggregation module coupled with contrastive learning to optimize scene feature representation. Second, we introduce a geometry-aware self-attention mechanism incorporating geometric position encoding to explicitly capture fine-grained inter-query dependencies. The entire method is seamlessly integrated into a Transformer architecture without requiring post-processing. Our approach achieves state-of-the-art performance on ScanNetV2, ScanNet++, ScanNet200, and S3DIS, significantly improving instance discrimination accuracy and cross-dataset generalization. Extensive experiments validate the effectiveness of jointly modeling internal feature structure and external query relationships.
📝 Abstract
3D instance segmentation aims to predict a set of object instances in a scene, representing them as binary foreground masks with corresponding semantic labels. Transformer-based methods are currently gaining increasing attention due to their elegant pipelines and superior predictions. However, these methods primarily focus on modeling the external relationships between scene features and query features through mask attention; they lack effective modeling of the internal relationships among scene features and among query features. In light of these disadvantages, we propose **Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation**. Specifically, we introduce an adaptive superpoint aggregation module and a contrastive-learning-guided superpoint refinement module to better represent superpoint features (scene features), leveraging contrastive learning to guide the updates of these features. Furthermore, our relation-aware self-attention mechanism enhances the modeling of relationships between queries by incorporating positional and geometric relationships into the self-attention mechanism. Extensive experiments on the ScanNetV2, ScanNet++, ScanNet200, and S3DIS datasets demonstrate the superior performance of Relation3D.
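To make the idea of relation-aware self-attention concrete, below is a minimal pure-Python sketch. It assumes one plausible form of the geometric relationship: a pairwise distance between per-query 3D centers subtracted from the attention logits, so spatially nearby queries attend to each other more strongly. The function name, the `centers` input, and the `alpha` weighting are all hypothetical illustrations, not the paper's actual implementation.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def geometry_aware_self_attention(queries, centers, alpha=1.0):
    """Toy relation-aware self-attention over instance queries.

    queries : list of feature vectors, one per instance query
    centers : list of 3D points, one per query, used to derive a
              pairwise geometric bias (an assumption for illustration)
    alpha   : hypothetical weight on the geometric term
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for i, qi in enumerate(queries):
        logits = []
        for j, qj in enumerate(queries):
            sim = dot(qi, qj) * scale                # standard scaled dot-product logit
            dist = math.dist(centers[i], centers[j])  # geometric relationship
            logits.append(sim - alpha * dist)         # nearby queries attend more
        w = softmax(logits)
        # Output is a convex combination of all query features.
        out.append([sum(w[j] * queries[j][k] for j in range(len(queries)))
                    for k in range(d)])
    return out
```

In a real Transformer decoder the bias would be added inside batched multi-head attention with learned projections; this sketch keeps only the core idea that attention weights depend on both feature similarity and geometric proximity.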