🤖 AI Summary
This work addresses a critical limitation in existing traffic forecasting benchmarks, which assume a static sensor network and thus fail to capture the reality of incrementally expanding deployments over time. To bridge this gap, the authors introduce XXLTraffic, a dataset spanning 27 years of traffic data from California and New South Wales, and propose EvoXXLTraffic—the first benchmark that explicitly supports dynamic sensor evolution. The benchmark defines an evaluation protocol based on annual streaming tasks. Through systematic evaluation of representative approaches—including static spatiotemporal graph neural networks, online learning, evolutionary graph continual learning, and test-time adaptation—the study reveals a significant performance degradation of current state-of-the-art models under evolving conditions. These findings underscore the greater realism and challenge posed by EvoXXLTraffic, encouraging the community to advance toward more practical long-term traffic forecasting scenarios.
📝 Abstract
Existing traffic forecasting benchmarks assume a fixed sensor set, but real road-sensor networks grow continuously as the road network changes year by year. We introduce the XXLTraffic dataset family, which spans up to 27 years of California PeMS and Transport for NSW data. The fixed-sensor subsets of XXLTraffic support extremely long forecasting with multi-year gaps and standard hourly / daily long-horizon forecasting. We extend it to EvoXXLTraffic, a sensor-evolving reorganization that exposes per-year active sensors, yearly traffic-flow matrices, and yearly graph snapshots across nine PeMS districts, with growth ratios ranging from +305% to over +10,000%. We define a yearly streaming forecasting protocol on EvoXXLTraffic in which each calendar year is a continual task, and benchmark a wide range of representative baselines drawn from static spatio-temporal GNNs, naïve online schemes, evolving-graph continual methods, and retrieval / test-time methods. We find that our ultra-large evolutionary dataset better reflects the real world, and many state-of-the-art (SOTA) results no longer work. Our dataset complements existing benchmarks by enabling more realistic forecasting under ultra-long evolutionary road networks.