🤖 AI Summary
Existing road topology inference methods typically model lane detection and topological relationships separately, neglecting critical lane-to-traffic-element (L2T) interactions and exhibiting limited relational modeling capacity. To address this, we propose a relation-aware joint framework that pairs a novel lane detector with a dual-path topology head: (i) geometry-enhanced lane-to-lane (L2L) reasoning and (ii) cross-view lane-to-traffic-element (L2T) modeling. Our design integrates geometric-bias self-attention, curve-aware cross-attention, and multi-scale relational encoding. Additionally, we employ contrastive learning (InfoNCE) to regularize relation embeddings. Evaluated on OpenLane-V2, our method achieves state-of-the-art performance, with gains of +3.1 in lane detection (DETₗ), +5.3 in L2L topology accuracy (TOPₗₗ), +4.9 in L2T topology accuracy (TOPₗₜ), and +4.4 in the overall OpenLane-V2 score (OLS), improving both detection fidelity and topological reasoning.
📝 Abstract
Accurate road topology reasoning is critical for autonomous driving, enabling effective navigation and adherence to traffic regulations. Central to this task are lane perception and topology reasoning. However, existing methods typically focus on either lane detection or Lane-to-Lane (L2L) topology reasoning, often neglecting Lane-to-Traffic-element (L2T) relationships or failing to optimize these tasks jointly. Furthermore, most approaches either overlook relational modeling or apply it in a limited scope, despite the inherent spatial relationships among road elements. We argue that relational modeling benefits both perception and reasoning, as humans naturally leverage contextual relationships to recognize road elements and infer their connectivity. To this end, we introduce relational modeling into both perception and reasoning, jointly enhancing structural understanding. Specifically, we propose: 1) a relation-aware lane detector, where our geometry-biased self-attention and curve cross-attention refine lane representations by capturing relational dependencies; 2) relation-enhanced topology heads, including a geometry-enhanced L2L head and a cross-view L2T head, boosting reasoning with relational cues; and 3) a contrastive learning strategy with InfoNCE loss to regularize relationship embeddings. Extensive experiments on OpenLane-V2 demonstrate that our approach significantly improves both detection and topology reasoning metrics, achieving gains of +3.1 in DET$_l$, +5.3 in TOP$_{ll}$, +4.9 in TOP$_{lt}$, and +4.4 in the overall OLS, setting a new state-of-the-art. Code will be released.
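For readers unfamiliar with the InfoNCE loss named above, the following is a minimal, generic sketch of the loss for a single anchor embedding. It is not the authors' implementation: the function names `cosine` and `info_nce`, the use of cosine similarity, the temperature value, and the way positives and negatives are chosen are all assumptions made for illustration, since the abstract does not specify how relationship embeddings are paired.

```python
import math


def cosine(u, v):
    """Cosine similarity between two nonzero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (hypothetical helper, not from the paper).

    Computes -log( exp(s_pos / t) / sum_k exp(s_k / t) ), where the sum runs
    over the positive and all negatives and s is the cosine similarity.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[0]


# Toy 2-D embeddings (illustrative only): the loss is small when the positive
# is aligned with the anchor and large when a negative is aligned instead.
anchor = [1.0, 0.0]
loss_aligned = info_nce(anchor, [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
print(loss_aligned < loss_misaligned)
```

In this sketch the loss pulls the anchor toward its positive and pushes it away from the negatives. One plausible reading is that connected element pairs would act as positives and unconnected pairs as negatives, but the abstract does not confirm this.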