Scalable and Reliable Multi-agent Reinforcement Learning for Traffic Assignment

📅 2025-06-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the poor scalability and unstable convergence of multi-agent reinforcement learning (MARL) in large-scale urban traffic assignment, this paper proposes MARL-OD-DA, a MARL framework in which the agents are origin-destination (OD) pair routers rather than individual travelers. The framework combines a Dirichlet-based action space with action pruning and a reward function based on the local relative gap, both intended to improve solution reliability and speed up convergence. On the SiouxFalls network, MARL-OD-DA reaches better assignment solutions in 10 steps, with a relative gap 94.99% lower than that of conventional methods, and it surpasses existing MARL methods on medium-sized networks with city-level OD demand. Modeling at the OD-pair level rather than the individual-traveler level removes the scalability bottleneck of fine-grained agent representations.

📝 Abstract
The evolution of metropolitan cities and the increase in travel demand impose stringent requirements on traffic assignment methods. Multi-agent reinforcement learning (MARL) approaches outperform traditional methods in modeling adaptive routing behavior without requiring explicit system dynamics, which is beneficial for real-world deployment. However, MARL frameworks face challenges in scalability and reliability when managing extensive networks with substantial travel demand, which limits their practical applicability to large-scale traffic assignment problems. To address these challenges, this study introduces MARL-OD-DA, a new MARL framework for the traffic assignment problem that redefines agents as origin-destination (OD) pair routers rather than individual travelers, significantly enhancing scalability. Additionally, a Dirichlet-based action space with action pruning and a reward function based on the local relative gap are designed to enhance solution reliability and improve convergence efficiency. Experiments demonstrate that the proposed MARL framework effectively handles medium-sized networks with extensive and varied city-level OD demand, surpassing existing MARL methods. When implemented on the SiouxFalls network, MARL-OD-DA achieves better assignment solutions in 10 steps, with a relative gap that is 94.99% lower than that of conventional methods.
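The abstract does not spell out how the local relative gap is computed. As a rough illustration only, the sketch below uses the common textbook relative gap restricted to a single OD pair: the excess of the travel cost actually incurred over the cost if all demand used the cheapest route, divided by the cost incurred. The function names `local_relative_gap` and `reward`, and the choice of the negative gap as the reward, are assumptions and not taken from the paper.

```python
import numpy as np

def local_relative_gap(route_flows, route_costs):
    """Relative gap of one OD pair (assumed definition, not the paper's)."""
    flows = np.asarray(route_flows, dtype=float)
    costs = np.asarray(route_costs, dtype=float)
    total_cost = float(np.dot(flows, costs))      # cost actually incurred
    if total_cost <= 0.0:
        return 0.0                                # no flow, nothing to improve
    best_cost = float(flows.sum() * costs.min())  # all demand on cheapest route
    return (total_cost - best_cost) / total_cost

def reward(route_flows, route_costs):
    # Hypothetical reward: the negative gap, so 0 is the best possible value.
    return -local_relative_gap(route_flows, route_costs)

# Example: 100 trips split over three routes with unequal costs.
gap = local_relative_gap([50.0, 30.0, 20.0], [10.0, 12.0, 15.0])
# When all used routes cost the same, the gap is zero (user equilibrium).
gap_at_equilibrium = local_relative_gap([60.0, 40.0], [11.0, 11.0])
```

Under this definition the gap is zero exactly when every route carrying flow has the minimum cost, which is the user-equilibrium condition an OD-level agent would be rewarded for approaching.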
Problem

Research questions and friction points this paper is trying to address.

Scalability issues in MARL for large traffic networks
Reliability challenges in MARL-based traffic assignment solutions
Efficient convergence in MARL frameworks for city-level demand
Innovation

Methods, ideas, or system contributions that make the work stand out.

OD pair routers redefine agents for scalability
Dirichlet-based action space enhances reliability
Local relative gap reward improves convergence
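
To make the first two ideas concrete, here is a minimal sketch of how an OD-pair router might turn a Dirichlet sample into route flows. It assumes each agent outputs positive concentration parameters over a fixed set of candidate routes. The pruning rule shown (drop routes whose sampled share falls below a threshold, then renormalise) is a placeholder, since the summary does not describe the paper's actual pruning scheme. The names `sample_route_flows` and `prune_below` are hypothetical.

```python
import numpy as np

def sample_route_flows(concentration, demand, rng, prune_below=0.05):
    """Sample a split of one OD pair's demand over its candidate routes."""
    alpha = np.asarray(concentration, dtype=float)  # must be strictly positive
    shares = rng.dirichlet(alpha)                   # non-negative, sums to 1
    keep = shares >= prune_below                    # assumed pruning rule
    if not keep.any():
        keep[np.argmax(shares)] = True              # always keep one route
    shares = np.where(keep, shares, 0.0)
    shares = shares / shares.sum()                  # renormalise kept routes
    return demand * shares

rng = np.random.default_rng(0)
flows = sample_route_flows([2.0, 1.0, 0.1], demand=100.0, rng=rng)
```

A Dirichlet sample always lies on the probability simplex, so the resulting flows conserve the OD demand by construction. One agent per OD pair also means the number of agents depends on the number of OD pairs, not on the number of travelers.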