🤖 AI Summary
Existing online mapping models rely heavily on costly, geographically limited high-definition (HD) map annotations as ground truth, which constrains generalization and large-scale deployment. To address this, we propose the first online vectorized mapping framework that requires no HD map supervision. The method first fuses road geometry reconstructed via Gaussian splatting with the outputs of a pre-trained 2D semantic segmentation network to generate high-quality, multi-view pseudo-labels. It then introduces a mask-aware matching strategy and a corresponding loss function to explicitly handle partially occluded regions. The framework enables end-to-end training on raw, unlabeled sensor data and supports semi-supervised pre-training on large-scale crowdsourced data. Experiments show clear improvements in cross-scene generalization, and the code is publicly released to support further ground-truth-free mapping research.
📝 Abstract
Online mapping models show remarkable results in predicting vectorized maps from multi-view camera images only. However, all existing approaches still rely on ground-truth high-definition maps during training, which are expensive to obtain and often not geographically diverse enough for reliable generalization. In this work, we propose PseudoMapTrainer, a novel approach to online mapping that uses pseudo-labels generated from unlabeled sensor data. We derive those pseudo-labels by reconstructing the road surface from multi-camera imagery using Gaussian splatting together with semantics from a pre-trained 2D segmentation network. In addition, we introduce a mask-aware assignment algorithm and loss function to handle partially masked pseudo-labels, allowing, for the first time, the training of online mapping models without any ground-truth maps. Furthermore, our pseudo-labels can be effectively used to pre-train an online model in a semi-supervised manner to leverage large-scale unlabeled crowdsourced data. The code is available at github.com/boschresearch/PseudoMapTrainer.
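The abstract does not spell out how the mask-aware assignment works, but the core idea of matching predictions to partially masked pseudo-labels can be sketched as follows: compute a per-point cost between predicted and pseudo-label polylines, average it only over unmasked points, and run Hungarian matching on the result. All names (`mask_aware_cost`, `valid`) and the L1 cost choice are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch of mask-aware polyline matching (not the paper's code).
# Assumption: maps are represented as fixed-length point sequences, as in
# MapTR-style models; masked pseudo-label points are excluded from the cost.
import numpy as np
from scipy.optimize import linear_sum_assignment

def mask_aware_cost(pred, gt, valid):
    """L1 matching cost averaged over valid (unmasked) pseudo-label points.

    pred:  (P, N, 2) predicted polylines
    gt:    (G, N, 2) pseudo-label polylines
    valid: (G, N)    bool, False where the pseudo-label is masked/occluded
    """
    # (P, G, N): per-point L1 distance summed over x/y coordinates
    d = np.abs(pred[:, None, :, :] - gt[None, :, :, :]).sum(-1)
    # Normalize by the number of valid points (guard against all-masked labels)
    n_valid = np.maximum(valid.sum(-1), 1)            # (G,)
    return (d * valid[None]).sum(-1) / n_valid        # (P, G)

# Hungarian assignment on the masked cost matrix
pred = np.random.rand(4, 20, 2)       # 4 predicted polylines, 20 points each
gt = np.random.rand(3, 20, 2)         # 3 pseudo-label polylines
valid = np.random.rand(3, 20) > 0.3   # roughly 30% of points treated as masked
cost = mask_aware_cost(pred, gt, valid)
rows, cols = linear_sum_assignment(cost)
```

The same masking logic would apply to the training loss: matched prediction points at masked pseudo-label positions simply contribute nothing, so the model is not penalized where the reconstruction was occluded.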