🤖 AI Summary
Existing RL-based traffic signal control (TSC) methods suffer from high training costs and poor generalization, primarily due to strong environment dependency and slow convergence. This paper proposes FitLight, a federated imitation learning framework for multi-intersection TSC that enables plug-and-play deployment of agents in unseen environments. Its core contributions are: (1) the first dynamic switching mechanism between real-time imitation learning and reinforcement learning; (2) a lightweight agent design driven by hybrid pressure modeling; and (3) heterogeneous model aggregation with structured pruning, optimized for edge devices. Experiments on both real-world and synthetic datasets demonstrate that FitLight outperforms state-of-the-art methods in final performance and achieves significantly faster convergence. Moreover, it operates efficiently on ultra-constrained microcontrollers with only 16 KB RAM and 32 KB ROM, enabling practical edge deployment.
📝 Abstract
Although Reinforcement Learning (RL)-based Traffic Signal Control (TSC) methods have been extensively studied, their practical application still faces serious issues such as high learning cost and poor generalizability. This is because the "trial-and-error" training style makes RL agents heavily dependent on the specific traffic environment and also requires a long convergence time. To address these issues, we propose a novel Federated Imitation Learning (FIL)-based framework for multi-intersection TSC, named FitLight, which allows RL agents to be deployed plug-and-play in any traffic environment without additional pre-training cost. Unlike existing imitation learning approaches that rely on pre-training RL agents with demonstrations, FitLight supports real-time imitation learning and a seamless transition to reinforcement learning. Thanks to our proposed knowledge-sharing mechanism and novel hybrid pressure-based agent design, RL agents can find an optimal control policy within only a few episodes. Moreover, for resource-constrained TSC scenarios, FitLight supports model pruning and heterogeneous model aggregation, so that RL agents can run on a microcontroller with merely 16 KB RAM and 32 KB ROM. Extensive experiments demonstrate that, compared to state-of-the-art methods, FitLight not only provides a superior starting point but also converges to a better final solution on both real-world and synthetic datasets, even under extreme resource limitations.
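The abstract does not say how heterogeneous model aggregation is carried out. As a rough illustration of the general idea only, the sketch below averages each parameter over just those agents that still retain it after pruning. The `None`-as-pruned encoding, the function name `masked_average`, and the example weights are all assumptions made for illustration; this is not FitLight's actual aggregation rule.

```python
from typing import List, Optional


def masked_average(client_weights: List[List[Optional[float]]]) -> List[float]:
    """Average each parameter over the clients that still hold it.

    Assumption for this sketch: a value of None marks a parameter that a
    client has pruned away. A parameter pruned by every client becomes 0.0.
    """
    num_params = len(client_weights[0])
    aggregated = []
    for i in range(num_params):
        # Collect only the clients that kept parameter i.
        kept = [w[i] for w in client_weights if w[i] is not None]
        aggregated.append(sum(kept) / len(kept) if kept else 0.0)
    return aggregated


# Three hypothetical agents: one full model and two pruned to smaller sizes.
full = [1.0, 2.0, 3.0, 4.0]
medium = [3.0, 4.0, 5.0, None]
small = [5.0, None, None, None]

global_model = masked_average([full, medium, small])
print(global_model)  # [3.0, 3.0, 4.0, 4.0]
```

In this toy setup, the smallest agent contributes only to the parameters it keeps, so larger agents are not diluted by weights that a pruned model no longer has.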