Network-Constrained Policy Optimization for Adaptive Multi-agent Vehicle Routing

📅 2025-10-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address path-homogenization-induced congestion in dynamic multi-vehicle routing for urban environments—where shortest-path-first (SPF) algorithms often fail—the paper proposes a cooperative navigation framework based on multi-agent reinforcement learning. The method integrates graph attention networks (GATs) to jointly model local and neighborhood traffic states, introduces an adaptive navigation policy coupled with a hierarchical hub control mechanism, and employs a centralized-training-with-decentralized-execution (CTDE) architecture enhanced by Attentive Q-Mixing (A-QMIX) for traffic-aware global coordination. Evaluated on both synthetic and real-world road networks (Toronto and Manhattan), the approach reduces average travel time by up to 15.9% compared to SPF and state-of-the-art learning baselines while maintaining a 100% route success rate, and demonstrates superior scalability and congestion mitigation in large-scale urban transportation systems.
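The GAT component described above lets each intersection agent weight its neighbors' traffic states by learned attention rather than averaging them uniformly. As a rough illustration of that mechanism (not the paper's implementation—`W`, `a`, and the feature layout here are illustrative assumptions), a single-head graph-attention aggregation can be sketched in plain Python:

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    """Apply the linear projection W (out_dim x in_dim) to feature vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_aggregate(h_self, h_neighbors, W, a):
    """One graph-attention head over an intersection's neighborhood.

    h_self      -- this agent's local traffic-state features
    h_neighbors -- list of neighboring intersections' feature vectors
    W           -- shared projection matrix (assumed, out_dim x in_dim)
    a           -- attention vector over concatenated pairs (len 2*out_dim)
    """
    z_self = matvec(W, h_self)
    nodes = [h_self] + h_neighbors          # self-loop keeps the local state
    zs = [matvec(W, h) for h in nodes]
    # attention logits: e_ij = LeakyReLU(a . [z_i || z_j])
    logits = [leaky_relu(sum(ai * zi for ai, zi in zip(a, z_self + z)))
              for z in zs]
    m = max(logits)                         # numerically stable softmax
    exps = [math.exp(e - m) for e in logits]
    s = sum(exps)
    alphas = [e / s for e in exps]
    # attention-weighted sum of projected neighborhood states
    out = [sum(alpha * z[k] for alpha, z in zip(alphas, zs))
           for k in range(len(z_self))]
    return out, alphas
```

A congested neighbor whose features produce a large logit would dominate the aggregated state, which is the intuition behind letting attention, rather than distance alone, shape routing guidance.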

📝 Abstract
Traffic congestion in urban road networks leads to longer trip times and higher emissions, especially during peak periods. While the Shortest Path First (SPF) algorithm is optimal for a single vehicle in a static network, it performs poorly in dynamic, multi-vehicle settings, often worsening congestion by routing all vehicles along identical paths. We address dynamic vehicle routing through a multi-agent reinforcement learning (MARL) framework for coordinated, network-aware fleet navigation. We first propose Adaptive Navigation (AN), a decentralized MARL model where each intersection agent provides routing guidance based on (i) local traffic and (ii) neighborhood state modeled using Graph Attention Networks (GAT). To improve scalability in large networks, we further propose Hierarchical Hub-based Adaptive Navigation (HHAN), an extension of AN that assigns agents only to key intersections (hubs). Vehicles are routed hub-to-hub under agent control, while SPF handles micro-routing within each hub region. For hub coordination, HHAN adopts centralized training with decentralized execution (CTDE) under the Attentive Q-Mixing (A-QMIX) framework, which aggregates asynchronous vehicle decisions via attention. Hub agents use flow-aware state features that combine local congestion and predictive dynamics for proactive routing. Experiments on synthetic grids and real urban maps (Toronto, Manhattan) show that AN reduces average travel time versus SPF and learning baselines, maintaining 100% routing success. HHAN scales to networks with hundreds of intersections, achieving up to 15.9% improvement under heavy traffic. These findings highlight the potential of network-constrained MARL for scalable, coordinated, and congestion-aware routing in intelligent transportation systems.
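The abstract's two-level scheme—learned hub-to-hub decisions on top, SPF micro-routing between hubs—can be sketched as follows. This is a minimal sketch under stated assumptions: `hub_policy` is a hypothetical stand-in for HHAN's learned hub agent, and the graph encoding is illustrative, not the paper's.

```python
import heapq

def spf_path(graph, src, dst):
    """SPF micro-routing: Dijkstra shortest path by edge travel time.
    graph: {node: [(neighbor, travel_time), ...]}"""
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue                      # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if dst not in dist:
        return None
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1]

def route_hub_to_hub(graph, hub_policy, start, start_hub, goal_hub):
    """Hierarchical loop: at each hub, the agent chooses the next hub
    (learned decision, stubbed here); SPF fills in the leg between hubs."""
    route, node, hub = [start], start, start_hub
    while hub != goal_hub:
        nxt = hub_policy(hub, goal_hub)   # hypothetical learned policy
        leg = spf_path(graph, node, nxt)  # micro-routing within the region
        route.extend(leg[1:])
        node, hub = nxt, nxt
    return route
```

Confining the learned decisions to hubs keeps the agent count proportional to the number of key intersections rather than the full network, which is the scalability argument the abstract makes.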
Problem

Research questions and friction points this paper is trying to address.

Optimizing multi-vehicle routing to reduce urban traffic congestion
Addressing scalability in large networks via hierarchical reinforcement learning
Coordinating fleet navigation using network-aware multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decentralized MARL with Graph Attention Networks
Hierarchical hub routing with centralized training
Flow-aware state features for proactive congestion avoidance
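The centralized-training side of these contributions, A-QMIX, aggregates asynchronous per-vehicle Q-values via attention before mixing them into a joint value. The shape of that aggregation (a sketch only—the query/key construction here is an assumption, not the paper's architecture) is:

```python
import math

def softmax(xs):
    m = max(xs)                            # numerically stable softmax
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attentive_mix(agent_qs, keys, query):
    """A-QMIX-style aggregation sketch: a global-state query is scored
    against per-agent keys, and the resulting softmax weights combine each
    agent's chosen Q-value. Softmax weights are nonnegative, so the QMIX
    monotonicity condition dQ_tot/dQ_i >= 0 is preserved.

    agent_qs -- chosen Q-value per (possibly asynchronous) vehicle decision
    keys     -- per-agent key vectors (assumed learned embeddings)
    query    -- global-state query vector (assumed learned embedding)
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    q_tot = sum(w * q for w, q in zip(weights, agent_qs))
    return q_tot, weights
```

Because only the decisions that actually occurred in a step contribute keys and Q-values, attention handles the varying number of simultaneously deciding vehicles without padding to a fixed agent count.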