🤖 AI Summary
Existing reinforcement learning-based traffic signal control (RL-TSC) methods lack systematic robustness evaluation under real-world disturbances such as traffic accidents.
Method: We propose T-REX, an open-source simulation framework that combines SUMO-based microscopic traffic simulation with probabilistic route reassignment, speed adaptation, and contextual lane-changing. It introduces a suite of robustness metrics designed for traffic incidents that extends beyond conventional traffic efficiency measures.
Contribution/Results: By benchmarking three mainstream RL-TSC paradigms (independent value-function, pressure-driven, and hierarchical cooperative), we uncover fundamental performance trade-offs under distributional shift. Hierarchical cooperative methods are the most robust in large-scale, irregular networks (performance degradation <15% during incidents) but converge slowly; independent and pressure-based methods excel in steady-state conditions yet degrade severely (>40%) during incidents. T-REX enables reproducible benchmarking and provides a standardized experimental platform for advancing robust RL-TSC research.
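The degradation percentages quoted above can be read as a relative change in a traffic metric between a nominal run and an incident run. The following is a minimal sketch of that reading, not T-REX's actual metric definition: the function name `relative_degradation`, the choice of mean travel time as the metric, and the numbers are all hypothetical.

```python
def relative_degradation(nominal: float, incident: float) -> float:
    """Relative worsening of a lower-is-better metric (e.g. mean travel time).

    Assumption: degradation = (incident - nominal) / nominal.
    This is an illustrative definition, not taken from T-REX.
    """
    if nominal <= 0:
        raise ValueError("nominal metric must be positive")
    return (incident - nominal) / nominal


# Hypothetical mean travel times in seconds, for illustration only.
nominal_travel_time = 120.0
incident_travel_time = 135.0

deg = relative_degradation(nominal_travel_time, incident_travel_time)
print(f"{deg:.1%}")  # 12.5%
```

Under this reading, a controller whose mean travel time rises from 120 s to 135 s during an incident would fall under the 15% threshold.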
📝 Abstract
Reinforcement learning-based traffic signal control (RL-TSC) has emerged as a promising approach for improving urban mobility. However, its robustness under real-world disruptions such as traffic incidents remains largely underexplored. In this study, we introduce T-REX, an open-source, SUMO-based simulation framework for training and evaluating RL-TSC methods under dynamic incident scenarios. T-REX models realistic network-level performance by accounting for drivers' probabilistic rerouting, speed adaptation, and contextual lane-changing, which enables the simulation of congestion propagation under incidents. To assess robustness, we propose a suite of metrics that extend beyond conventional traffic efficiency measures. Through extensive experiments across synthetic and real-world networks, we use T-REX to evaluate several state-of-the-art RL-TSC methods under multiple real-world deployment paradigms. Our findings show that while independent value-based and decentralized pressure-based methods offer fast convergence and generalization in stable traffic conditions and homogeneous networks, their performance degrades sharply under incident-driven distribution shifts. In contrast, hierarchical coordination methods tend to offer more stable and adaptable performance in large-scale, irregular networks, benefiting from their structured decision-making architecture. However, this comes at the cost of slower convergence and higher training complexity. These findings highlight the need for robustness-aware design and evaluation in RL-TSC research. T-REX contributes to this effort by providing an open, standardized, and reproducible platform for benchmarking RL methods under dynamic and disruptive traffic scenarios.
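The abstract does not specify how drivers' probabilistic rerouting is modeled. One common way to express such behavior is a multinomial logit choice over candidate routes, sketched below purely as an illustration. The function names, the sensitivity parameter `theta`, and the example costs are assumptions and are not taken from T-REX.

```python
import math
import random


def route_probabilities(costs, theta=0.1):
    """Multinomial logit over routes: P(i) is proportional to exp(-theta * cost_i).

    `theta` (hypothetical) controls how strongly drivers prefer cheaper routes.
    """
    min_cost = min(costs)  # shift by the minimum for numerical stability
    weights = [math.exp(-theta * (c - min_cost)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def sample_route(costs, theta=0.1, rng=random):
    """Draw one route index according to the logit probabilities."""
    probs = route_probabilities(costs, theta)
    return rng.choices(range(len(costs)), weights=probs, k=1)[0]


# Hypothetical travel-time estimates (seconds) for three candidate routes,
# where an incident has inflated the cost of route 0.
costs = [300.0, 180.0, 200.0]
probs = route_probabilities(costs)
chosen = sample_route(costs, rng=random.Random(0))
```

In this sketch, a route blocked by an incident becomes unlikely but not impossible to choose, so some drivers still enter the congested link while others divert and spread congestion to neighboring intersections.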