Learning to Design City-scale Transit Routes

📅 2025-12-21
🤖 AI Summary
Designing urban bus networks is NP-hard, and manual planning scales poorly to the resulting solution space. This paper proposes an end-to-end reinforcement learning (RL) framework for automated bus network design that uses a Graph Attention Network (GAT) to construct routes sequentially. To address long-horizon credit assignment, it introduces a two-level reward that combines incremental topological feedback with simulation-based terminal rewards. The framework combines Proximal Policy Optimization (PPO) with the Multi-Agent Transport Simulation (MATSim) platform and uses census-driven demand modeling. It is evaluated at real-city scale on Bloomington, Indiana. Under high transit adoption with transit center initialization, the RL-designed network achieves a 25.6% higher service rate, 30.9% shorter wait times, and 21.0% better bus utilization than the existing real-world network. Under mixed-mode conditions with random initialization, it delivers 68.8% higher route efficiency than demand coverage heuristics.

📝 Abstract
Designing efficient transit route networks is an NP-hard problem with exponentially large solution spaces that traditionally relies on manual planning processes. We present an end-to-end reinforcement learning (RL) framework based on graph attention networks for sequential transit network construction. To address the long-horizon credit assignment challenge, we introduce a two-level reward structure combining incremental topological feedback with simulation-based terminal rewards. We evaluate our approach on a new real-world dataset from Bloomington, Indiana with topologically accurate road networks, census-derived demand, and existing transit routes. Our learned policies substantially outperform existing designs and traditional heuristics across two initialization schemes and two modal-split scenarios. Under high transit adoption with transit center initialization, our approach achieves 25.6% higher service rates, 30.9% shorter wait times, and 21.0% better bus utilization compared to the real-world network. Under mixed-mode conditions with random initialization, it delivers 68.8% higher route efficiency than demand coverage heuristics and 5.9% lower travel times than shortest path construction. These results demonstrate that end-to-end RL can design transit networks that substantially outperform both human-designed systems and hand-crafted heuristics on realistic city-scale benchmarks.
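
The two-level reward described in the abstract can be illustrated with a small sketch. The abstract does not define the topological feedback or the simulation score, so everything specific here is an assumption: the hypothetical `episode_rewards` function uses newly covered demand per added stop as a stand-in for incremental topological feedback, `step_weight` is an arbitrary scaling factor, and `terminal_score` is a caller-supplied number standing in for a simulation-based evaluation of the finished network.

```python
from typing import Dict, List, Set


def episode_rewards(
    stops: List[int],
    demand: Dict[int, float],
    terminal_score: float,
    step_weight: float = 0.1,
) -> List[float]:
    """Return one reward per construction step (illustrative only).

    Level 1 (assumed form): each step earns the demand at the added stop,
    scaled by `step_weight`, but only if that stop was not already covered.
    Level 2 (assumed form): the final step additionally receives
    `terminal_score`, a placeholder for a simulation-based evaluation.
    """
    covered: Set[int] = set()
    rewards: List[float] = []
    for stop in stops:
        # Revisiting an already covered stop adds no new coverage.
        gain = 0.0 if stop in covered else demand.get(stop, 0.0)
        rewards.append(step_weight * gain)
        covered.add(stop)
    if rewards:
        # The terminal reward is paid once, when construction ends.
        rewards[-1] += terminal_score
    return rewards
```

Under these assumptions, every step that adds new coverage receives an immediate signal, while the overall quality of the network is only scored once at the end of the episode.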

Problem

Research questions and friction points this paper is trying to address.

Designing efficient transit route networks using reinforcement learning
Addressing long-horizon credit assignment with two-level reward structure
Outperforming human-designed systems and heuristics on city-scale benchmarks
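
To make the long-horizon credit assignment issue listed above concrete, the following sketch computes standard discounted returns for two hypothetical reward sequences. The discount factor and the reward values are assumptions chosen for illustration; the summary does not state which values or which advantage estimator the paper uses.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for every step, back to front."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode with a terminal reward only.
sparse = [0.0, 0.0, 0.0, 1.0]
# Same episode with assumed incremental feedback on steps 0 and 2.
dense = [0.2, 0.0, 0.3, 1.0]

sparse_returns = discounted_returns(sparse)
dense_returns = discounted_returns(dense)
```

With the terminal-only sequence, every step's return is just a discounted copy of the same final value, so the returns carry no information about which individual step helped. With incremental feedback, steps that earned reward stand out in the returns, which is the motivation for a two-level structure.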

Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end reinforcement learning framework using graph attention networks
Two-level reward structure with incremental and terminal feedback
Outperforms human designs and heuristics on realistic city-scale benchmarks
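
One way graph attention can drive sequential route construction is to score each unvisited neighbor of the current stop and normalize the scores into a probability distribution over the next stop. The sketch below uses the standard single-head graph attention coefficient (a LeakyReLU over a learned linear score, followed by a softmax over neighbors). It is not the paper's architecture, which the summary does not describe: the function `next_stop_probs`, the feature vectors, and the attention vectors `a_src` and `a_dst` are all hypothetical, and node features are assumed to be already projected.

```python
import math
from typing import Dict, List, Sequence, Set


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def next_stop_probs(
    current: int,
    neighbors: Dict[int, List[int]],
    visited: Set[int],
    feats: Dict[int, Sequence[float]],
    a_src: Sequence[float],
    a_dst: Sequence[float],
) -> Dict[int, float]:
    """Attention-style distribution over the unvisited neighbors of `current`."""
    # Mask out stops that are already on the route.
    candidates = [j for j in neighbors[current] if j not in visited]
    if not candidates:
        return {}
    src_term = dot(a_src, feats[current])
    scores = [leaky_relu(src_term + dot(a_dst, feats[j])) for j in candidates]
    # Numerically stable softmax over the remaining candidates.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {j: e / total for j, e in zip(candidates, exps)}
```

A policy built this way would sample or pick the highest-probability stop, append it to the route, mark it visited, and repeat until a stopping condition is met.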