🤖 AI Summary
To address the scarcity of labeled data for real-time, online semantic mapping in unknown environments using only onboard sensors in autonomous driving, this paper systematically explores semi-supervised learning (SSL) for online mapping. It proposes a multi-view temporal pseudo-label aggregation framework grounded in consistency regularization, in which a multi-sample confidence-weighted fusion mechanism improves the reliability of the pseudo-labels generated by the teacher model. With only 10% of the data labeled, the method yields a 3.5x performance boost over training on the labeled data alone and narrows the gap to the fully supervised model to 3.5 mIoU. In cross-city adaptation within Argoverse 2, with Pittsburgh as the target city, incorporating unlabeled target-domain data reduces the performance gap from 5.0 to 0.5 mIoU. These results support the approach's advantages in both generalization and labeling efficiency.
📝 Abstract
The ability to generate online maps using only onboard sensory information is crucial for enabling autonomous driving beyond well-mapped areas. Training models for this task (predicting lane markers, road edges, and pedestrian crossings) traditionally requires extensive labelled data, which is expensive and labour-intensive to obtain. While semi-supervised learning (SSL) has shown promise in other domains, its potential for online mapping remains largely underexplored. In this work, we bridge this gap by demonstrating the effectiveness of SSL methods for online mapping. Furthermore, we introduce a simple yet effective method that leverages the inherent properties of online mapping by fusing the teacher's pseudo-labels from multiple samples, enhancing the reliability of self-supervised training. When 10% of the data is labelled, our method uses the remaining unlabelled data to achieve a 3.5x performance boost compared to training on the labelled data alone. This narrows the gap to a fully supervised model trained on all labels to just 3.5 mIoU. We also show strong generalization to unseen cities. Specifically, in Argoverse 2, when adapting to Pittsburgh, incorporating purely unlabelled target-domain data reduces the performance gap from 5 to 0.5 mIoU. These results highlight the potential of SSL as a powerful tool for solving the online mapping problem, significantly reducing reliance on labelled data.
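To make the idea of fusing the teacher's pseudo-labels from multiple samples concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each sample's teacher output has already been aligned to a shared bird's-eye-view grid, that unobserved cells are marked `None`, and that fusion is a plain per-cell average followed by thresholding. The function names, the averaging rule, and the `low`/`high` thresholds are all hypothetical choices for illustration.

```python
from typing import List, Optional, Sequence

# One grid per sample; None marks a cell that the sample did not observe.
Grid = List[List[Optional[float]]]


def fuse_pseudo_labels(teacher_probs: Sequence[Grid]) -> Grid:
    """Average per-cell teacher probabilities over the samples that saw each cell.

    Assumption: all grids share the same shape and coordinate frame.
    """
    rows, cols = len(teacher_probs[0]), len(teacher_probs[0][0])
    fused: Grid = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            observed = [g[r][c] for g in teacher_probs if g[r][c] is not None]
            if observed:
                fused[r][c] = sum(observed) / len(observed)
    return fused


def to_hard_labels(
    fused: Grid, low: float = 0.3, high: float = 0.7
) -> List[List[Optional[int]]]:
    """Keep confident cells only: 1 if p >= high, 0 if p <= low, else None.

    Cells labelled None would be ignored by the student's loss.
    """
    labels: List[List[Optional[int]]] = []
    for row in fused:
        out_row: List[Optional[int]] = []
        for p in row:
            if p is None:
                out_row.append(None)
            elif p >= high:
                out_row.append(1)
            elif p <= low:
                out_row.append(0)
            else:
                out_row.append(None)
        labels.append(out_row)
    return labels


# Three overlapping samples on a 1 x 3 grid (hypothetical teacher outputs).
samples = [
    [[0.9, 0.6, None]],
    [[0.8, 0.2, None]],
    [[None, 0.4, 0.1]],
]
labels = to_hard_labels(fuse_pseudo_labels(samples))
print(labels)  # [[1, None, 0]]
```

In this toy case the middle cell receives conflicting predictions across samples, so its fused probability falls between the thresholds and the cell is dropped from the pseudo-label, while the two cells with consistent predictions are kept.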