AI Summary
This work addresses the limited representational capacity of existing traffic signal control methods, which often leads to suboptimal performance and poor generalization in complex, dynamic environments. To overcome this limitation, the authors propose LATS, a novel framework that, for the first time, integrates semantic priors from large language models (LLMs) into multi-agent reinforcement learning via a plug-in teacher-student knowledge distillation mechanism. This approach generates semantically rich representations of intersection topology and traffic dynamics, thereby enhancing the decision-making capabilities of lightweight agents. Extensive experiments demonstrate that LATS significantly outperforms both conventional reinforcement learning baselines and pure LLM-based methods across multiple traffic datasets, achieving a favorable balance between strong representational power and efficient inference while substantially improving both the performance and generalization of traffic signal control policies.
Abstract
Adaptive Traffic Signal Control (ATSC) aims to optimize traffic flow and minimize delays by adjusting traffic lights in real time. Recent advances in Multi-agent Reinforcement Learning (MARL) have shown promise for ATSC, yet existing approaches still suffer from limited representational capacity, often leading to suboptimal performance and poor generalization in complex and dynamic traffic environments. Large Language Models (LLMs), in contrast, excel at semantic representation, reasoning, and analysis, yet their propensity for hallucination and slow inference often hinder their direct application to decision-making tasks. To address these challenges, we propose a novel learning paradigm named LATS that integrates LLMs and MARL, leveraging the former's strong prior knowledge and inductive abilities to enhance the latter's decision-making process. Specifically, we introduce a plug-and-play teacher-student learning module, where a trained embedding LLM serves as a teacher that generates rich semantic features capturing each intersection's topological structure and traffic dynamics. A much simpler student neural network then learns to emulate these features through knowledge distillation in the latent space, enabling the final model to operate independently of the LLM for downstream use in the RL decision-making process. This integration significantly enhances the overall model's representational capacity across diverse traffic scenarios, leading to more efficient and generalizable control strategies. Extensive experiments across diverse traffic datasets empirically demonstrate that our method enhances the representation learning capability of RL models, thereby improving overall performance and generalization over both traditional RL and LLM-only approaches. [...]
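The core mechanism described above (a frozen teacher producing semantic features that a lightweight student learns to reproduce in the latent space) can be sketched as follows. This is a minimal illustrative toy, not the paper's actual architecture: the "teacher" is a stand-in fixed linear map rather than a real embedding LLM, the student is a single linear layer, and all names (`teacher_embed`, `student_embed`, `distill_step`) and dimensions are hypothetical.

```python
import random

random.seed(0)

# Hypothetical stand-in for the frozen embedding LLM ("teacher"): maps a raw
# intersection state (e.g. queue lengths per lane) to a semantic feature
# vector. In the real framework this would be a pretrained embedding model.
DIM_IN, DIM_FEAT = 4, 3
W_teacher = [[random.uniform(-1, 1) for _ in range(DIM_IN)]
             for _ in range(DIM_FEAT)]

def teacher_embed(state):
    return [sum(w * s for w, s in zip(row, state)) for row in W_teacher]

# Lightweight "student": a single linear layer trained to emulate the
# teacher's features via an MSE loss in the latent (feature) space.
W_student = [[0.0] * DIM_IN for _ in range(DIM_FEAT)]

def student_embed(state):
    return [sum(w * s for w, s in zip(row, state)) for row in W_student]

def distill_step(state, lr=0.01):
    """One SGD step on the feature-matching (distillation) loss."""
    t, s = teacher_embed(state), student_embed(state)
    loss = sum((si - ti) ** 2 for si, ti in zip(s, t)) / DIM_FEAT
    for i in range(DIM_FEAT):
        grad = 2.0 * (s[i] - t[i]) / DIM_FEAT  # dLoss/ds_i
        for j in range(DIM_IN):
            W_student[i][j] -= lr * grad * state[j]
    return loss

# Train on synthetic intersection states; the distillation loss shrinks as
# the student matches the teacher, after which the teacher can be discarded.
states = [[random.uniform(0, 5) for _ in range(DIM_IN)] for _ in range(200)]
first = distill_step(states[0])
for _ in range(50):
    for st in states:
        last = distill_step(st)
```

At deployment time only `student_embed` would be called, which is what lets the RL policy run without invoking the LLM, matching the "operate independently from the LLM" property the abstract claims.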