🤖 AI Summary
Traditional high-definition (HD) map construction methods produce only deterministic point estimates, failing to model the environmental uncertainty arising from occlusions, sensor dropouts, or partial observability. To address this, the authors propose MapDiffusion, a generative framework that applies the diffusion paradigm to online vectorized HD map construction. MapDiffusion uses a conditional diffusion architecture built on a bird's-eye-view (BEV) latent grid: randomly initialized queries are iteratively refined from noise, conditioned on the BEV latent, to generate multiple plausible map samples while quantifying spatial uncertainty. Evaluated on nuScenes, MapDiffusion surpasses the baseline by 5% in single-sample performance, and aggregating multiple samples consistently improves performance along the ROC curve. Crucially, its uncertainty estimates are significantly higher in occluded regions, supporting their interpretability and their value for identifying areas with ambiguous sensor input.
📝 Abstract
Autonomous driving requires an understanding of the static environment from sensor data. Learned Bird's-Eye View (BEV) encoders are commonly used to fuse multiple inputs, and a vector decoder predicts a vectorized map representation from the latent BEV grid. However, traditional map construction models provide deterministic point estimates, failing to capture uncertainty and the inherent ambiguities of real-world environments, such as occlusions and missing lane markings. We propose MapDiffusion, a novel generative approach that leverages the diffusion paradigm to learn the full distribution of possible vectorized maps. Instead of predicting a single deterministic output from learned queries, MapDiffusion iteratively refines randomly initialized queries, conditioned on a BEV latent grid, to generate multiple plausible map samples. This allows aggregating samples to improve prediction accuracy and deriving uncertainty estimates that directly correlate with scene ambiguity. Extensive experiments on the nuScenes dataset demonstrate that MapDiffusion achieves state-of-the-art performance in online map construction, surpassing the baseline by 5% in single-sample performance. We further show that aggregating multiple samples consistently improves performance along the ROC curve, validating the benefit of distribution modeling. Additionally, our uncertainty estimates are significantly higher in occluded areas, reinforcing their value in identifying regions with ambiguous sensor input. By modeling the full map distribution, MapDiffusion enhances the robustness and reliability of online vectorized HD map construction, enabling uncertainty-aware decision-making for autonomous vehicles in complex environments.
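The sampling-and-aggregation idea from the abstract can be sketched in a few lines: draw several map samples by iteratively refining noise-initialized queries under a conditioning signal, then take the per-location mean as the prediction and the per-location spread as the uncertainty estimate. This is a minimal toy sketch, not the paper's implementation: the real model denoises with a learned network conditioned on the BEV latent grid, whereas the `denoise_step` below is a hypothetical stand-in that simply pulls noisy queries toward the conditioning array.

```python
import numpy as np

def denoise_step(queries, bev_latent, step_size=0.5):
    """Hypothetical stand-in for one learned denoising step: move the
    noisy queries a fraction of the way toward the conditioning signal.
    The actual model would instead run a trained denoiser network."""
    return queries + step_size * (bev_latent - queries)

def sample_map(bev_latent, num_steps=10, rng=None):
    """Generate one map sample: start from pure Gaussian noise and
    iteratively refine it, conditioned on the BEV latent."""
    rng = rng or np.random.default_rng()
    queries = rng.normal(size=bev_latent.shape)  # randomly initialized queries
    for _ in range(num_steps):
        queries = denoise_step(queries, bev_latent)
    return queries

def aggregate_samples(bev_latent, num_samples=8, seed=0):
    """Draw several samples from the (toy) map distribution and derive
    a mean prediction plus a per-location uncertainty estimate."""
    rng = np.random.default_rng(seed)
    samples = np.stack(
        [sample_map(bev_latent, rng=rng) for _ in range(num_samples)]
    )
    return samples.mean(axis=0), samples.std(axis=0)

bev = np.zeros((4, 4))  # toy conditioning grid standing in for the BEV latent
mean_map, uncertainty = aggregate_samples(bev)
```

In this toy setting every sample converges toward the conditioning grid, so the spread is small; in the paper, spread is driven by genuine scene ambiguity (e.g. occlusions), which is why the uncertainty maps are higher in occluded regions.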