🤖 AI Summary
Existing GNN- and contrastive learning–based approaches for road network representation struggle to jointly capture spatial heterogeneity and temporal dynamics, primarily because their neighborhood smoothing assumption is oversimplified. To address this, we propose DST, a dual-branch spatiotemporal self-supervised framework. First, DST constructs three types of hyperedges to form a hypergraph, enabling hypergraph contrastive learning to model complex, heterogeneous spatial relationships. Second, it designs a mix-hop transition matrix to encode trajectory dynamics and performs next-token prediction with a causal Transformer, regularized by distinguishing weekday from weekend traffic modes, thereby achieving traffic-pattern-aware zero-shot generalization. Evaluated on multiple benchmarks, DST significantly outperforms state-of-the-art methods, especially under zero-shot settings, demonstrating superior robustness. These results validate both the effectiveness and the necessity of joint spatiotemporal modeling for road representation learning.
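The mix-hop transition matrix mentioned above can be illustrated with a minimal sketch: blend powers of a trajectory-derived, row-stochastic transition matrix so that multi-hop road transitions feed into graph convolution. The function name, hop count, and decay weights below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def mix_hop_transition(P, hops=3, alpha=0.5):
    """Hypothetical mix-hop matrix: geometrically weighted sum of P^1..P^hops.

    P is a row-stochastic transition matrix estimated from trajectories;
    alpha (assumed) decays the contribution of longer hops.
    """
    M = np.zeros_like(P)
    Pk = np.eye(P.shape[0])
    norm = 0.0
    for k in range(1, hops + 1):
        Pk = Pk @ P          # k-hop transition probabilities
        w = alpha ** k
        M += w * Pk
        norm += w
    return M / norm          # renormalize so M stays row-stochastic

# Toy 3-road network: transition counts observed in trajectories.
counts = np.array([[0, 4, 1], [2, 0, 3], [1, 1, 0]], dtype=float)
P = counts / counts.sum(axis=1, keepdims=True)
M = mix_hop_transition(P)
print(np.allclose(M.sum(axis=1), 1.0))  # a convex mix of stochastic matrices is stochastic
```

Because each `P^k` is row-stochastic and the weights are renormalized, the mixed matrix remains a valid transition operator for convolution.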
📝 Abstract
Road network representation learning (RNRL) has attracted increasing attention from both researchers and practitioners as various spatiotemporal tasks emerge. Recent advanced methods leverage Graph Neural Networks (GNNs) and contrastive learning to characterize the spatial structure of road segments in a self-supervised paradigm. However, the spatial heterogeneity and temporal dynamics of road networks pose severe challenges to the neighborhood smoothing mechanism of self-supervised GNNs. To address these issues, we propose a $\textbf{D}$ual-branch $\textbf{S}$patial-$\textbf{T}$emporal self-supervised representation framework for enhanced road representations, termed DST. On one hand, DST designs a mix-hop transition matrix for graph convolution to incorporate dynamic relations of roads from trajectories. In addition, DST contrasts road representations of the vanilla road network against those of the hypergraph in a spatial self-supervised way. The hypergraph is newly built from three types of hyperedges to capture long-range relations. On the other hand, DST performs next token prediction as the temporal self-supervised task on sequences of traffic dynamics with a causal Transformer, which is further regularized by differentiating traffic modes of weekdays from those of weekends. Extensive experiments against state-of-the-art methods verify the superiority of our proposed framework. Moreover, the comprehensive spatiotemporal modeling enables DST to excel in zero-shot learning scenarios.
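The causal Transformer branch can be sketched at its core: a self-attention step with a causal mask, so the representation at step t only attends to steps ≤ t, which is what makes next-token prediction on the traffic sequence well-posed. This is a bare, assumed illustration (single head, no learned projections), not the paper's architecture.

```python
import numpy as np

def causal_attention(X):
    """Single-head self-attention over a (T, d) traffic sequence with a causal mask."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                      # (T, T) pairwise scores
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf                             # block attention to future steps
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ X                                 # context used to predict the next step

T, d = 6, 4
rng = np.random.default_rng(0)
X = rng.normal(size=(T, d))                            # toy traffic-dynamics sequence
out = causal_attention(X)
print(out.shape)
```

Note that the first output position can only attend to itself, so `out[0]` equals `X[0]`; the weekday/weekend regularization described in the abstract would sit on top of this backbone as an additional objective.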