Behavior-Grounded Lane Representation Learning for Multi-Task Traffic Digital Twins

📅 2026-05-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Most existing traffic digital twin systems rely on static geometric representations, which do not capture the dynamic functional semantics of lanes, such as how a lane operates under complex traffic conditions. This limits behavior-aware reasoning. To address the gap, this work proposes GeoLaneRep, a framework that grounds lane representation learning in observed behavior by jointly encoding lane geometry, vehicle trajectories, and operational descriptors into a shared cross-camera semantic embedding. The embedding serves as an interface between roadside observations and downstream tasks, supporting zero-shot cross-camera matching, window-level anomaly detection, and goal-directed lane geometry synthesis with a diffusion-based generator. Evaluated on 16 roadside cameras and 132 lanes, the method reports a lateral-rank error of 0.004 and an edge-role F1 of 1.000 in zero-shot cross-camera matching, an AUROC of 0.991 for window-level anomaly detection, and 87.9% overall specification accuracy across 38 lane groups for synthesized lane geometries.
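For readers unfamiliar with the AUROC metric quoted in the summary, the following minimal sketch shows how it can be computed from per-window anomaly scores. It is not taken from the GeoLaneRep code base: the function name `auroc` and the toy scores are illustrative assumptions, and the pairwise formulation is chosen for clarity rather than speed.

```python
import numpy as np

def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen anomalous window
    receives a higher score than a randomly chosen normal window,
    with ties counted as half. Hypothetical helper for illustration.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]    # scores of anomalous windows
    neg = scores[~labels]   # scores of normal windows
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Compare every positive score against every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy data (made up): two normal windows, two anomalous windows.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auroc = auroc(toy_scores, toy_labels)
```

In this toy case three of the four anomalous/normal pairs are ordered correctly, so the value is 0.75. An AUROC of 0.991 would mean nearly every such pair is ranked correctly.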

📝 Abstract

Traffic digital twins are powerful tools for advanced traffic management, and most systems are built on static geometric representations. However, these representations fail to capture the dynamic functional semantics required for behavior-aware reasoning, such as how a lane operates under complex traffic conditions. To address this gap, we introduce GeoLaneRep, a behavior-grounded lane representation learning framework for traffic digital twins. GeoLaneRep jointly encodes static lane geometry, observed vehicle trajectories, and operational descriptors into a shared, cross-camera semantic embedding. The encoder is trained with a joint objective combining contrastive cross-camera alignment, auxiliary role supervision, and temporal anomaly detection. Across 16 roadside cameras and 132 lanes, the learned embeddings achieve a $0.004$ lateral-rank error and an edge-role F1 of $1.000$ in zero-shot cross-camera matching, and an AUROC of $0.991$ for window-level anomaly detection. We further show that the same behavioral embeddings can condition a diffusion-based generator to synthesize lane geometries that satisfy targeted operational specifications, with $87.9\%$ overall specification accuracy across 38 lane groups. GeoLaneRep thus provides a semantic interface between roadside observations and downstream digital twin tasks, supporting cross-camera transfer, behavior-aware monitoring, and goal-directed lane synthesis. The framework is openly available at https://github.com/raynbowy23/GeoLaneRep.
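The joint objective described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss forms, so symmetric InfoNCE for the contrastive term, softmax cross-entropy for role supervision, binary cross-entropy for the anomaly term, and the weights `w_role` and `w_anom` are all assumptions, as are every function and parameter name below.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits):
    # Numerically stable row-wise log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def info_nce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE (assumed form of the contrastive term).

    Row i of z_a and row i of z_b are treated as the same lane seen
    from two cameras; all other rows act as negatives.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature
    idx = np.arange(len(z_a))
    loss_ab = -log_softmax(logits)[idx, idx].mean()
    loss_ba = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)

def cross_entropy(logits, labels):
    # Softmax cross-entropy over lane-role classes (assumed form).
    idx = np.arange(len(labels))
    return -log_softmax(logits)[idx, labels].mean()

def binary_cross_entropy(scores, targets, eps=1e-7):
    # Sigmoid + BCE on per-window anomaly scores (assumed form).
    p = np.clip(1.0 / (1.0 + np.exp(-scores)), eps, 1.0 - eps)
    return -(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p)).mean()

def joint_loss(z_a, z_b, role_logits, role_labels,
               anomaly_scores, anomaly_targets, w_role=1.0, w_anom=1.0):
    # Weighted sum of the three terms; the weights are hypothetical.
    return (info_nce(z_a, z_b)
            + w_role * cross_entropy(role_logits, role_labels)
            + w_anom * binary_cross_entropy(anomaly_scores, anomaly_targets))

# Toy usage with made-up data: 4 lanes, 4-dim embeddings, 3 role classes.
z = np.eye(4)
aligned = info_nce(z, z)           # matching lanes paired correctly
misaligned = info_nce(z, z[::-1])  # every lane paired with the wrong one
total = joint_loss(z, z, np.zeros((4, 3)), np.array([0, 1, 2, 0]),
                   np.zeros(4), np.array([0.0, 0.0, 1.0, 0.0]))
```

The contrastive term is what would push embeddings of the same lane from different cameras together, which is the property zero-shot cross-camera matching depends on. In the toy usage, the correctly paired embeddings give a much lower contrastive loss than the mispaired ones.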

Problem

Research questions and friction points this paper is trying to address.

traffic digital twins

lane representation

behavior-aware reasoning

dynamic semantics

functional semantics

Innovation

Methods, ideas, or system contributions that make the work stand out.

behavior-grounded representation

traffic digital twins

cross-camera embedding