CleanMAP: Distilling Multimodal LLMs for Confidence-Driven Crowdsourced HD Map Updates

📅 2025-04-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the low reliability of HD map updates caused by motion blur, illumination variations, adverse weather, and lane marking degradation in crowdsourced data, this paper proposes CleanMAP, a confidence-driven distillation framework built on a multimodal large language model (MLLM). The method introduces an MLLM-driven lane visibility scoring model and a dynamic piecewise confidence-scoring function, and designs a top-k local map fusion strategy restricted to a confidence range (best score minus 10%) to balance data quality against data quantity. Evaluated on a real-world autonomous vehicle dataset, fusing the top three local maps reduces the mean map update error to 0.28 m, compared with 0.37 m for the baseline, and meets the ≤0.32 m accuracy threshold. Further validation on real-vehicle data shows 84.88% agreement with human evaluators.

📝 Abstract
The rapid growth of intelligent connected vehicles (ICVs) and integrated vehicle-road-cloud systems has increased the demand for accurate, real-time HD map updates. However, ensuring map reliability remains challenging due to inconsistencies in crowdsourced data, which suffer from motion blur, lighting variations, adverse weather, and lane marking degradation. This paper introduces CleanMAP, a Multimodal Large Language Model (MLLM)-based distillation framework designed to filter and refine crowdsourced data for high-confidence HD map updates. CleanMAP leverages an MLLM-driven lane visibility scoring model that systematically quantifies key visual parameters, assigning confidence scores (0-10) based on their impact on lane detection. A novel dynamic piecewise confidence-scoring function adapts scores based on lane visibility, ensuring strong alignment with human evaluations while effectively filtering unreliable data. To further optimize map accuracy, a confidence-driven local map fusion strategy ranks and selects the top-k highest-scoring local maps within an optimal confidence range (best score minus 10%), striking a balance between data quality and quantity. Experimental evaluations on a real-world autonomous vehicle dataset validate CleanMAP's effectiveness, demonstrating that fusing the top three local maps achieves the lowest mean map update error of 0.28m, outperforming the baseline (0.37m) and meeting stringent accuracy thresholds (<= 0.32m). Further validation with real-vehicle data confirms 84.88% alignment with human evaluators, reinforcing the model's robustness and reliability. This work establishes CleanMAP as a scalable and deployable solution for crowdsourced HD map updates, ensuring more precise and reliable autonomous navigation. The code will be available at https://Ankit-Zefan.github.io/CleanMap/
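The selection step described in the abstract (rank local maps by confidence, keep those within the best score minus 10%, then take the top k) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `LocalMap` container, the `select_top_k` function, and the reading of "best score minus 10%" as `best * (1 - 0.10)` are all assumptions, and the fusion of the selected maps is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalMap:
    """Hypothetical container: one crowdsourced local map and its score."""
    map_id: str
    confidence: float  # 0-10 scale, as described in the abstract


def select_top_k(maps, k=3, margin=0.10):
    """Return at most k maps whose score lies within `margin` of the best.

    Assumption: "best score minus 10%" is interpreted as a relative
    window, i.e. scores >= best * (1 - margin).
    """
    if not maps:
        return []
    best = max(m.confidence for m in maps)
    floor = best * (1.0 - margin)
    eligible = [m for m in maps if m.confidence >= floor]
    # Highest confidence first; sorted() is stable, so ties keep input order.
    return sorted(eligible, key=lambda m: m.confidence, reverse=True)[:k]


# Made-up scores for illustration only.
candidates = [
    LocalMap("a", 9.0),
    LocalMap("b", 8.5),
    LocalMap("c", 6.0),
    LocalMap("d", 8.2),
    LocalMap("e", 8.8),
]
selected = select_top_k(candidates)
print([m.map_id for m in selected])
```

With k=3, the window drops map "c" outright, and the cap then keeps only the three strongest of the remaining candidates. If the paper instead means an absolute margin on the 0-10 scale, only the `floor` computation would change.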
Problem

Research questions and friction points this paper is trying to address.

Filtering unreliable crowdsourced HD map data
Scoring lane visibility for confidence-driven updates
Optimizing map accuracy via top-k local fusion
Innovation

Methods, ideas, or system contributions that make the work stand out.

MLLM-based distillation framework for HD map updates
Dynamic piecewise confidence-scoring function for lane visibility
Confidence-driven local map fusion strategy for accuracy
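To make the idea of a piecewise confidence-scoring function concrete, here is a small sketch that maps a lane-visibility fraction to a 0-10 score using three linear segments. The breakpoints (0.25 and 0.75), the slopes, and the function name `piecewise_confidence` are hypothetical placeholders chosen for illustration. This page does not give the paper's actual function, and the "dynamic" adaptation it mentions is not modeled here.

```python
def piecewise_confidence(visibility: float) -> float:
    """Map a lane-visibility fraction in [0, 1] to a 0-10 confidence score.

    All breakpoints and slopes are hypothetical, not values from the paper.
    """
    v = min(max(visibility, 0.0), 1.0)  # clamp input to [0, 1]
    if v < 0.25:
        return 0.0  # lanes barely visible: treat the data as unreliable
    if v < 0.75:
        return 5.0 * (v - 0.25) / 0.5  # linear ramp from 0 to 5
    return 5.0 + 5.0 * (v - 0.75) / 0.25  # steeper ramp from 5 to 10


for v in (0.1, 0.5, 0.75, 1.0):
    print(v, piecewise_confidence(v))
```

A piecewise form lets low-visibility frames be rejected outright while still separating good frames from excellent ones, which is one plausible way to align scores with human judgments.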
Authors

Ankit Kumar Shaw
Tsinghua University
Kun Jiang
Tsinghua University
autonomous driving
Tuopu Wen
Tsinghua University
Chandan Kumar Sah
MTech. (Research) student at Indian Institute of Science
Data-driven control, Koopman Operator Theory, Multi-agent systems, Reinforcement learning
Yining Shi
Tsinghua University
Mengmeng Yang
Tsinghua University
Diange Yang
Tsinghua University
Xiaoli Lian
Beihang University