MambaMap: Online Vectorized HD Map Construction using State Space Model

📅 2025-07-27
🤖 AI Summary
Online high-definition (HD) map construction is held back by inefficient temporal modeling over long sequences and by poor robustness to occlusions and sensor noise. To address these, we propose a vectorized online mapping framework based on State Space Models (SSMs). The method pairs a gated SSM module with multi-directional and spatial-temporal scanning strategies to fuse long-sequence bird's-eye-view (BEV) features efficiently. It also keeps a memory bank of historical frames and dynamically updates BEV features and instance queries, which improves temporal consistency and robustness to noise and occlusions when generating vectorized map elements. Experiments on nuScenes and Argoverse2 show that the approach outperforms state-of-the-art methods across various splits and perception ranges while keeping the computational cost of temporal fusion low.

📝 Abstract
High-definition (HD) maps are essential for autonomous driving, as they provide precise road information for downstream tasks. Recent advances highlight the potential of temporal modeling in addressing challenges like occlusions and extended perception range. However, existing methods either fail to fully exploit temporal information or incur substantial computational overhead in handling extended sequences. To tackle these challenges, we propose MambaMap, a novel framework that efficiently fuses long-range temporal features in the state space to construct online vectorized HD maps. Specifically, MambaMap incorporates a memory bank to store and utilize information from historical frames, dynamically updating BEV features and instance queries to improve robustness against noise and occlusions. Moreover, we introduce a gating mechanism in the state space that selectively integrates dependencies of map elements with high computational efficiency. In addition, we design multi-directional and spatial-temporal scanning strategies to enhance feature extraction at both the BEV and instance levels. These strategies significantly boost the prediction accuracy of our approach while ensuring robust temporal consistency. Extensive experiments on the nuScenes and Argoverse2 datasets demonstrate that MambaMap outperforms state-of-the-art methods across various splits and perception ranges. Source code will be available at https://github.com/ZiziAmy/MambaMap.
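
The abstract mentions multi-directional scanning of BEV features but does not say which directions are used. As a rough illustration only, the sketch below flattens a small 2D grid into four 1D sequences (row-major and column-major, each forward and backward), which is a common pattern in vision state space models, and then maps the sequences back onto the grid. The function names `multi_directional_scan` and `merge_scans`, the four-direction choice, and the averaging step are all assumptions made for this example, not details taken from MambaMap.

```python
from typing import Dict, List

Grid = List[List[float]]


def multi_directional_scan(grid: Grid) -> Dict[str, List[float]]:
    """Flatten an H x W grid into four 1D sequences.

    The direction set is an assumption for illustration; the paper's
    actual scan orders are not given in the abstract.
    """
    height, width = len(grid), len(grid[0])
    row_major = [grid[r][c] for r in range(height) for c in range(width)]
    col_major = [grid[r][c] for c in range(width) for r in range(height)]
    return {
        "row_fwd": row_major,
        "row_bwd": row_major[::-1],
        "col_fwd": col_major,
        "col_bwd": col_major[::-1],
    }


def merge_scans(scans: Dict[str, List[float]], height: int, width: int) -> Grid:
    """Map each sequence back to grid positions and average the four values."""
    total = height * width
    out = [[0.0] * width for _ in range(height)]
    for i in range(total):
        # Row-major index i corresponds to cell (r, c).
        r, c = divmod(i, width)
        out[r][c] += scans["row_fwd"][i] + scans["row_bwd"][total - 1 - i]
    for i in range(total):
        # Column-major index i corresponds to cell (r, c) with c varying slowest.
        c, r = divmod(i, height)
        out[r][c] += scans["col_fwd"][i] + scans["col_bwd"][total - 1 - i]
    return [[value / 4.0 for value in row] for row in out]


if __name__ == "__main__":
    bev = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    sequences = multi_directional_scan(bev)
    print(sequences["col_fwd"])
    print(merge_scans(sequences, 2, 3))
```

In a real model each sequence would pass through a sequence layer before merging; here the sequences are merged untouched, so the round trip simply reproduces the input grid and shows that the index bookkeeping is consistent.
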
Problem

Research questions and friction points this paper is trying to address.

Efficiently fusing long-range temporal features for HD maps
Reducing computational overhead in temporal sequence handling
Improving robustness against noise and occlusions in maps
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fuses long-range temporal features efficiently
Uses memory bank for historical frame information
Introduces gating mechanism for selective integration
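
To make the ideas in this list more concrete, here is a minimal sketch of a fixed-size memory bank of past frame features combined with a gated, linear-time recurrence over those frames. It is a simplified stand-in and not the paper's formulation: the `MemoryBank` class, the `gated_scan` function, the per-channel sigmoid gate, and the parameters `gate_weight` and `gate_bias` are hypothetical names and choices introduced for this example.

```python
import math
from collections import deque
from typing import Deque, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class MemoryBank:
    """Keeps the most recent `capacity` frame features (hypothetical helper)."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) drops the oldest entry automatically when full.
        self.frames: Deque[List[float]] = deque(maxlen=capacity)

    def push(self, feature: Sequence[float]) -> None:
        self.frames.append(list(feature))


def gated_scan(
    frames: Sequence[Sequence[float]],
    gate_weight: float = 1.0,
    gate_bias: float = 0.0,
) -> List[float]:
    """Fuse frames from oldest to newest with an input-dependent gate.

    Per channel: state = (1 - g) * state + g * x, with g = sigmoid(w * x + b).
    This is an assumed, simplified gate, not MambaMap's actual state update.
    The cost grows linearly with the number of stored frames.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    state = [0.0] * len(frames[0])
    for frame in frames:
        for ch, x in enumerate(frame):
            g = sigmoid(gate_weight * x + gate_bias)
            state[ch] = (1.0 - g) * state[ch] + g * x
    return state


if __name__ == "__main__":
    bank = MemoryBank(capacity=3)
    for t in range(5):
        bank.push([float(t), float(-t)])
    print(list(bank.frames))
    print(gated_scan(bank.frames))
```

A gate near 1 lets the newest frame overwrite the state, while a gate near 0 preserves what was accumulated from history, which is the intuition behind selectively integrating temporal information.
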
Authors

Ruizi Yang
College of Software Technology, Zhejiang University, Hangzhou 310027, China
Xiaolu Liu
Zhejiang University
Computer Vision, Autonomous Driving
Junbo Chen
Udeer.ai, Hangzhou 310000, China
Jianke Zhu
Professor of Computer Science, Zhejiang University
Computer Vision, Robotics