AI Summary
In large-scale heterogeneous traffic networks, multi-agent reinforcement learning (MARL) suffers from high communication overhead due to centralized data sharing and inefficient coordination caused by strong intersection heterogeneity. To address these challenges, this paper proposes a hierarchical federated reinforcement learning framework. It employs structure-aware dynamic clustering to adaptively group intersections, followed by lightweight federated averaging (FedAvg) within each group, balancing model personalization and collaborative optimization. The method integrates traffic flow modeling, dynamic clustering, and MARL to significantly reduce both communication and computational costs. Extensive experiments on synthetic and real-world road networks demonstrate that our approach outperforms decentralized MARL and standard federated RL baselines in terms of average delay, throughput, and robustness. Moreover, it automatically adapts to varying network topologies and traffic distributions, enhancing system scalability and generalization capability.
Abstract
Multi-agent reinforcement learning (MARL) has shown promise for adaptive traffic signal control (ATSC), enabling multiple intersections to coordinate signal timings in real time. However, in large-scale settings, MARL faces constraints due to extensive data sharing and communication requirements. Federated learning (FL) mitigates these challenges by training shared models without directly exchanging raw data, yet traditional FL methods such as FedAvg struggle with highly heterogeneous intersections. Different intersections exhibit varying traffic patterns, demands, and road structures, so performing FedAvg across all agents is inefficient. To address this gap, we propose Hierarchical Federated Reinforcement Learning (HFRL) for ATSC. HFRL employs clustering-based or optimization-based techniques to dynamically group intersections and perform FedAvg independently within groups of intersections with similar characteristics, enabling more effective coordination and scalability than standard FedAvg. Our experiments on synthetic and real-world traffic networks demonstrate that HFRL not only outperforms both decentralized and standard federated RL approaches but also identifies suitable grouping patterns based on network structure or traffic demand, resulting in a more robust framework for distributed, heterogeneous systems.
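The core idea of averaging only within groups of similar intersections can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy parameter vectors, and the fixed group labels are all hypothetical, and the averaging is unweighted for simplicity. How HFRL actually derives the groups (clustering-based or optimization-based) is not specified here, so the sketch simply takes the grouping as given.

```python
from collections import defaultdict


def fedavg(weight_vectors):
    """Element-wise mean of equally sized parameter vectors (unweighted average)."""
    n = len(weight_vectors)
    return [sum(vals) / n for vals in zip(*weight_vectors)]


def groupwise_fedavg(local_weights, group_of):
    """Average parameters only among agents that share a group label.

    local_weights: dict mapping agent id -> list of floats (local model parameters)
    group_of:      dict mapping agent id -> group label (assumed to be given)
    Returns a dict mapping agent id -> the averaged parameters of its group.
    """
    members = defaultdict(list)
    for agent, group in group_of.items():
        members[group].append(agent)

    updated = {}
    for agents in members.values():
        avg = fedavg([local_weights[a] for a in agents])
        for a in agents:
            updated[a] = list(avg)  # each agent receives its own copy
    return updated


# Hypothetical toy data: three intersections, two groups.
local_weights = {
    "A": [1.0, 2.0],
    "B": [3.0, 4.0],
    "C": [10.0, 20.0],
}
group_of = {"A": 0, "B": 0, "C": 1}

updated = groupwise_fedavg(local_weights, group_of)
print(updated)
```

In this toy case, intersections A and B share one averaged model, while C, which sits alone in its group, keeps its own parameters. Applying `fedavg` across all three agents would instead pull C's parameters toward those of A and B, which is the kind of mixing across dissimilar intersections that the abstract describes as inefficient.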