3DResT: A Strong Baseline for Semi-Supervised 3D Referring Expression Segmentation

📅 2025-04-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the inefficient use of high-quality pseudo-labels and the discarding of low-confidence yet informative pseudo-labels in semi-supervised 3D referring expression segmentation (3D-RES), this paper proposes 3DResT, the first dedicated semi-supervised framework for the task. The method follows a teacher-student paradigm and contributes two designs: (1) Teacher-Student Consistency-Based Sampling (TSCS), which selects high-quality pseudo-labels and adds them to the labeled set to strengthen the supervision signal; and (2) Quality-Driven Dynamic Weighting (QDW), which retains low-quality pseudo-labels but assigns them lower weights, so their useful information is kept rather than discarded. With only 1% labeled data, 3DResT improves mIoU by 8.34 points over the fully supervised method, which indicates that exploiting both high- and low-quality pseudo-labels is effective.

📝 Abstract

3D Referring Expression Segmentation (3D-RES) typically requires extensive instance-level annotations, which are time-consuming and costly to produce. Semi-supervised learning (SSL) mitigates this by combining limited labeled data with abundant unlabeled data, improving performance while reducing annotation costs. SSL commonly uses a teacher-student paradigm in which the teacher generates pseudo-labels, filtered by a high-confidence threshold, to guide the student. However, in 3D-RES, where each label corresponds to a single mask and labeled data is scarce, existing SSL methods treat high-quality pseudo-labels merely as auxiliary supervision, which limits the model's learning potential. In addition, filtering by a high-confidence threshold often discards potentially valuable pseudo-labels, restricting the model's ability to exploit the abundant unlabeled data. We therefore identify two critical challenges in semi-supervised 3D-RES: inefficient utilization of high-quality pseudo-labels and wasted useful information in low-quality pseudo-labels. In this paper, we introduce the first semi-supervised learning framework for 3D-RES and present a robust baseline method named 3DResT. To address these challenges, we propose two novel designs: Teacher-Student Consistency-Based Sampling (TSCS) and Quality-Driven Dynamic Weighting (QDW). TSCS selects high-quality pseudo-labels and integrates them into the labeled dataset to strengthen the labeled supervision signal. QDW preserves low-quality pseudo-labels by dynamically assigning them lower weights, so that their useful information is extracted rather than discarded. Extensive experiments on a widely used benchmark demonstrate the effectiveness of our method. Notably, with only 1% labeled data, 3DResT achieves an mIoU improvement of 8.34 points over the fully supervised method.
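The contrast between threshold filtering and quality-driven weighting can be illustrated with a minimal sketch. This is not the paper's formulation: the function names, the per-sample quality score in [0, 1], the power-law weight `q ** gamma`, and the default values of `threshold` and `gamma` are all assumptions chosen for illustration. The only property taken from the abstract is that low-quality pseudo-labels receive lower weights instead of being dropped.

```python
def hard_threshold_weights(qualities, threshold=0.9):
    """Conventional filtering: keep a pseudo-label only if its quality
    score reaches the threshold (weight 1), otherwise drop it (weight 0)."""
    return [1.0 if q >= threshold else 0.0 for q in qualities]


def quality_driven_weights(qualities, gamma=2.0):
    """Hypothetical soft weighting: every pseudo-label is kept, but a
    lower quality score yields a lower weight. The power-law form is an
    assumption, not the weighting used by 3DResT."""
    return [q ** gamma for q in qualities]


def weighted_mean_loss(losses, weights):
    """Weighted average of per-sample losses; returns 0.0 when every
    weight is zero so that an empty batch does not divide by zero."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(l * w for l, w in zip(losses, weights)) / total


if __name__ == "__main__":
    qualities = [0.95, 0.5, 0.25]   # illustrative quality scores
    losses = [0.2, 0.8, 1.0]        # illustrative per-sample losses
    print(hard_threshold_weights(qualities))   # two samples are discarded
    print(quality_driven_weights(qualities))   # all samples contribute
    print(weighted_mean_loss(losses, quality_driven_weights(qualities)))
```

Under the hard threshold, the second and third samples contribute nothing to the unsupervised loss. Under the soft weighting they still contribute, only with less influence than the high-quality sample.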

Problem

Research questions and friction points this paper is trying to address.

Reducing annotation costs in 3D-RES with semi-supervised learning

Improving pseudo-label utilization in teacher-student SSL frameworks

Enhancing model performance with limited labeled 3D-RES data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Teacher-Student Consistency-Based Sampling for pseudo-labels

Quality-Driven Dynamic Weighting for low-quality labels

Semi-supervised 3D-RES framework with limited labeled data
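A rough sketch of consistency-based sampling follows. The abstract states only that TSCS selects high-quality pseudo-labels and adds them to the labeled set. Measuring teacher-student agreement with mask IoU, the `min_iou` cutoff, and all function names here are assumptions made for illustration, not the paper's actual criterion. Masks are represented as flat lists of 0/1 values, one per point.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary point masks of equal length.
    Two empty masks return 0.0, since they provide no target to learn from."""
    inter = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    return inter / union if union else 0.0


def select_consistent(teacher_masks, student_masks, min_iou=0.8):
    """Return (index, teacher_mask) pairs for unlabeled samples on which
    teacher and student predictions agree closely (assumed criterion)."""
    selected = []
    for idx, (t_mask, s_mask) in enumerate(zip(teacher_masks, student_masks)):
        if mask_iou(t_mask, s_mask) >= min_iou:
            selected.append((idx, t_mask))
    return selected


def extend_labeled_set(labeled, unlabeled_inputs, selected):
    """Append selected samples to a copy of the labeled set, using the
    teacher mask as the pseudo ground truth."""
    extra = [(unlabeled_inputs[idx], mask) for idx, mask in selected]
    return list(labeled) + extra


if __name__ == "__main__":
    teacher = [[1, 1, 0, 0], [1, 0, 0, 0]]
    student = [[1, 1, 0, 0], [0, 1, 0, 0]]
    picked = select_consistent(teacher, student)
    print(picked)  # only the first sample has matching predictions
    print(extend_labeled_set([("scene_a", [0, 1, 1, 0])],
                             ["scene_b", "scene_c"], picked))
```

Samples that pass the check are then treated like labeled data in subsequent training, which is one way to read "strengthening the labeled supervision signal".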
