🤖 AI Summary
To address the poor generalizability of autonomous driving systems caused by heavy reliance on high-definition (HD) maps, this paper proposes an online framework for consistent lane topology modeling that operates without HD maps. The method combines standard-definition (SD) map priors with onboard sensor observations in a unified neural architecture that incorporates hybrid lane-segment encodings, denoising-based supervision, and explicit inter-frame temporal association to coherently predict lane segments, road boundaries, and topological relationships. The core contribution is the integration of prior-guided learning, noise-robust optimization, and temporal consistency enforcement, which addresses the instability of conventional single-frame prediction. Evaluated on mainstream benchmarks, the approach achieves state-of-the-art performance, improving lane topology prediction accuracy by 12.7% over prior methods, and demonstrates the feasibility of robust, temporally consistent environmental understanding without HD maps.
📝 Abstract
Most autonomous cars rely on the availability of high-definition (HD) maps. Current research aims to address this constraint by directly predicting HD map elements from onboard sensors and reasoning about the relationships between the predicted map and traffic elements. Despite recent advancements, the coherent online construction of HD maps remains challenging, as it requires modeling the high complexity of road topologies in a unified and consistent manner. To address this challenge, we propose a coherent approach to predict lane segments and their corresponding topology, as well as road boundaries, all by leveraging prior map information represented by commonly available standard-definition (SD) maps. We propose a network architecture that leverages hybrid lane segment encodings comprising prior information, together with denoising techniques, to enhance training stability and performance. Furthermore, we leverage past frames for temporal consistency. Our experimental evaluation demonstrates that our approach outperforms previous methods by a large margin, highlighting the benefits of our modeling scheme.