ItTakesTwo: Leveraging Peer Representations for Semi-supervised LiDAR Semantic Segmentation

📅 2024-07-09
🏛️ European Conference on Computer Vision
📈 Citations: 2
Influential: 0
🤖 AI Summary
This paper targets two bottlenecks in semi-supervised LiDAR semantic segmentation: consistency learning that is applied to individual LiDAR representations, which yields only limited perturbations, and contrastive learning that samples from a limited set of positive and negative embeddings. It proposes ItTakesTwo (IT2), a semi-supervised framework with two components: (1) it enforces consistent predictions across peer LiDAR representations, which improves the effectiveness of the perturbations used in consistency learning; (2) it draws informative positive and negative samples for contrastive learning from a distribution of embeddings learned over the entire training set, rather than from a limited set of samples. On public benchmarks, the authors report substantial improvements over previous state-of-the-art semi-supervised methods.
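
To make the first component concrete, the following is a minimal sketch of a consistency term between two peer representations. It is an illustration under stated assumptions, not the authors' loss: it assumes each peer outputs per-point class logits for the same points, and it uses a symmetric KL divergence as the agreement measure. The function names (`softmax`, `kl_divergence`, `peer_consistency_loss`) are hypothetical.

```python
import math


def softmax(logits):
    """Convert one point's class logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def peer_consistency_loss(logits_a, logits_b):
    """Mean symmetric KL between two peers' per-point predictions.

    Assumption: logits_a[i] and logits_b[i] describe the same 3D point,
    seen through two different LiDAR representations. In actual training
    the target side of each term would typically be detached from the
    gradient; plain Python has no gradients, so that step is omitted here.
    """
    total = 0.0
    for la, lb in zip(logits_a, logits_b):
        pa, pb = softmax(la), softmax(lb)
        total += 0.5 * (kl_divergence(pa, pb) + kl_divergence(pb, pa))
    return total / len(logits_a)
```

The loss is zero when both peers predict the same distribution for every point and grows as their predictions diverge, which is the behaviour a consistency objective needs on unlabeled scans.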

📝 Abstract
The costly and time-consuming annotation process to produce large training sets for modelling semantic LiDAR segmentation methods has motivated the development of semi-supervised learning (SSL) methods. However, such SSL approaches often concentrate on employing consistency learning only for individual LiDAR representations. This narrow focus results in limited perturbations that generally fail to enable effective consistency learning. Additionally, these SSL approaches employ contrastive learning based on the sampling from a limited set of positive and negative embedding samples. This paper introduces a novel semi-supervised LiDAR semantic segmentation framework called ItTakesTwo (IT2). IT2 is designed to ensure consistent predictions from peer LiDAR representations, thereby improving the perturbation effectiveness in consistency learning. Furthermore, our contrastive learning employs informative samples drawn from a distribution of positive and negative embeddings learned from the entire training set. Results on public benchmarks show that our approach achieves remarkable improvements over the previous state-of-the-art (SOTA) methods in the field. The code is available at: https://github.com/yyliu01/IT2.
Problem

Research questions and friction points this paper is trying to address.

Reducing costly LiDAR annotation via semi-supervised learning
Improving consistency learning with peer LiDAR representations
Enhancing contrastive learning with full training set embeddings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensures consistent predictions from peer LiDAR representations
Improves perturbation effectiveness in consistency learning
Uses informative samples from learned embedding distribution
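
The idea of drawing contrastive samples from a learned embedding distribution can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes one diagonal Gaussian per class fitted to all available embeddings, one sampled positive and one sampled negative per other class, and a standard InfoNCE loss with cosine similarity. All function names and the `temperature` default are hypothetical.

```python
import math
import random


def fit_class_gaussians(embeddings, labels):
    """Fit a per-class mean and per-dimension std over all given embeddings."""
    by_class = {}
    for emb, lab in zip(embeddings, labels):
        by_class.setdefault(lab, []).append(emb)
    stats = {}
    for lab, embs in by_class.items():
        n, dim = len(embs), len(embs[0])
        mean = [sum(e[d] for e in embs) / n for d in range(dim)]
        std = [math.sqrt(sum((e[d] - mean[d]) ** 2 for e in embs) / n)
               for d in range(dim)]
        stats[lab] = (mean, std)
    return stats


def sample_embedding(stats, label, rng):
    """Draw one embedding from the (assumed) diagonal Gaussian of a class."""
    mean, std = stats[label]
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def contrastive_loss(anchor, anchor_label, stats, rng, temperature=0.1):
    """InfoNCE with the positive and negatives sampled from class distributions."""
    positive = sample_embedding(stats, anchor_label, rng)
    negatives = [sample_embedding(stats, lab, rng)
                 for lab in stats if lab != anchor_label]
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

Because the positives and negatives come from class-level statistics rather than from the current batch, an anchor is compared against samples that reflect the whole training set, which is the property the bullet above describes.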