Towards Autonomous Railway Operations: A Semi-Hierarchical Deep Reinforcement Learning Approach to the Vehicle Rescheduling Problem

📅 2026-05-11

🤖 AI Summary

This study addresses the vehicle rescheduling problem under traffic disruptions in dense railway networks, where traditional operations research methods and existing reinforcement learning approaches are limited in real-time responsiveness, scalability, and multi-agent coordination. The authors propose a semi-hierarchical deep reinforcement learning architecture, tailored to railway operational constraints, that decouples dispatching (scheduling) decisions from routing (path planning). Dedicated action and observation spaces for each level let the framework handle rare high-level dispatch decisions separately from frequent low-level routing updates. In experiments on the Flatland-RL simulator covering 7 to 80 trains, five difficulty levels, and 50 random seeds, the method nearly doubles the number of trains reaching their destinations relative to heuristic baselines and monolithic RL, keeps deadlock rates below 5%, and adaptively sequences, delays, or cancels trains under heavy congestion.

📝 Abstract

Managing disruptions in railway traffic management is a major challenge. Rising traffic density and infrastructure limits increase complexity, making the Vehicle Routing and Scheduling Problem (VRSP) difficult to solve reliably and in real time. While Operational Research (OR) methods are widely used, most dispatching still relies on human expertise due to the problem's exponential combinatorial complexity. Reinforcement Learning (RL) has gained attention for its potential in multi-agent coordination, but existing RL approaches often underperform OR methods and struggle to scale in dense rail networks. This paper addresses this gap from a machine learning perspective by introducing a semi-hierarchical RL formulation tailored to operational railway constraints. The method separates dispatching from routing through dedicated action and observation spaces, enabling policies to specialise in distinct decision scopes and addressing the imbalance between rare dispatch decisions and frequent routing updates. The approach is evaluated on the Flatland-RL simulator across five difficulty levels and 50 random seeds, with 7 to 80 trains. Results show substantially improved coordination, resource utilisation, and robustness compared with heuristic baselines and monolithic RL, nearly doubling the number of trains reaching their destinations, while keeping deadlock rates below 5% and adaptively sequencing, delaying, or cancelling trains under heavy congestion.
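
The separation of dispatching from routing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the Flatland-RL API: the action enums, the `Train` record, the observation keys (`occupancy`, `blocked_ahead`), and both rule-based placeholder policies are hypothetical names introduced here for illustration only. The sketch assumes that a train waiting off the map is handled by a dispatch policy with its own action and observation space, while a train already on the network is handled by a routing policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Union


class DispatchAction(Enum):
    """Hypothetical high-level actions, chosen rarely (once per waiting train)."""
    HOLD = 0     # keep the train waiting off the map
    DEPART = 1   # release the train onto the network
    CANCEL = 2   # drop the train from the schedule


class RoutingAction(Enum):
    """Hypothetical low-level actions, chosen at every step for moving trains."""
    STOP = 0
    FORWARD = 1
    LEFT = 2
    RIGHT = 3


@dataclass
class Train:
    train_id: int
    on_map: bool = False
    cancelled: bool = False


Action = Union[DispatchAction, RoutingAction]
Obs = Dict[str, float]


def dispatch_policy(obs: Obs) -> DispatchAction:
    # Placeholder rule standing in for a learned policy:
    # release a train only while the network is lightly occupied.
    return DispatchAction.DEPART if obs["occupancy"] < 0.5 else DispatchAction.HOLD


def routing_policy(obs: Obs) -> RoutingAction:
    # Placeholder rule standing in for a learned policy:
    # keep moving unless the next cell is blocked.
    return RoutingAction.STOP if obs["blocked_ahead"] else RoutingAction.FORWARD


def select_actions(
    trains: List[Train],
    dispatch: Callable[[Obs], DispatchAction],
    routing: Callable[[Obs], RoutingAction],
    dispatch_obs: Dict[int, Obs],
    routing_obs: Dict[int, Obs],
) -> Dict[int, Action]:
    """Route each train to the policy that owns its current decision scope."""
    actions: Dict[int, Action] = {}
    for train in trains:
        if train.cancelled:
            continue  # cancelled trains need no further decisions
        if train.on_map:
            actions[train.train_id] = routing(routing_obs[train.train_id])
        else:
            actions[train.train_id] = dispatch(dispatch_obs[train.train_id])
    return actions


trains = [Train(0, on_map=True), Train(1), Train(2, cancelled=True)]
actions = select_actions(
    trains,
    dispatch_policy,
    routing_policy,
    dispatch_obs={1: {"occupancy": 0.2}},
    routing_obs={0: {"blocked_ahead": 0.0}},
)
```

In this sketch the two policies never see each other's observations, which is one simple way to let each specialise in its own decision scope; how the paper actually couples the two levels is not specified in the abstract.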

Problem

Research questions and friction points this paper is trying to address.

Vehicle Rescheduling Problem

Railway Traffic Management

Real-time Decision Making

Combinatorial Complexity

Autonomous Dispatching

Innovation

Methods, ideas, or system contributions that make the work stand out.

Semi-Hierarchical Reinforcement Learning

Vehicle Rescheduling Problem

Railway Traffic Management