SignalClaw: LLM-Guided Evolutionary Synthesis of Interpretable Traffic Signal Control Skills

📅 2026-04-07
🤖 AI Summary
This work addresses the dual challenge of performance and interpretability in traffic signal control: existing reinforcement learning approaches lack transparency, while program synthesis methods are constrained by domain-specific languages. The authors propose a skill-generation framework that integrates large language models with evolutionary algorithms, translating simulation metrics into natural-language feedback that guides the evolution of control policies. The resulting skills are human-readable, each comprising a rationale, selection guidance, and executable code. An event-driven mechanism composes skills at runtime without retraining. Built on TraCI-based event detection, a priority dispatcher, and the SUMO simulation platform, the system achieves average delays of 7.8–9.2 seconds on routine scenarios (within 3–10% of the best method) and outperforms the baselines in emergency and transit-priority scenarios, with emergency-vehicle delays of 11.2–18.5 seconds and transit person delays of 9.8–11.5 seconds.
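The runtime composition described above can be pictured as a priority chain over independently written skills. The following is a minimal sketch, not the authors' implementation: the event names, their ordering, the observation keys, and every function name are assumptions made for illustration, and the events are passed in directly rather than detected through TraCI.

```python
from typing import Callable, Dict, Set

# Hypothetical observation: a flat dict of intersection features.
Observation = Dict[str, float]
# A skill maps an observation to the index of the phase to serve next.
Skill = Callable[[Observation], int]

# Assumed priority order, highest first (illustrative, not from the paper).
PRIORITY_CHAIN = ["emergency", "transit", "incident", "congestion"]


def routine_skill(obs: Observation) -> int:
    """Fallback skill: serve the approach with the longer queue."""
    return 0 if obs["queue_ns"] >= obs["queue_ew"] else 1


def emergency_skill(obs: Observation) -> int:
    """Serve the phase that clears the emergency vehicle's approach."""
    return int(obs["emergency_phase"])


def transit_skill(obs: Observation) -> int:
    """Serve the phase that clears the bus's approach."""
    return int(obs["bus_phase"])


# Each specialized skill is registered independently of the others.
SKILLS: Dict[str, Skill] = {
    "emergency": emergency_skill,
    "transit": transit_skill,
}


def dispatch(active_events: Set[str], obs: Observation) -> int:
    """Run the highest-priority skill whose event is active.

    Falls back to the routine skill when no active event has a
    registered skill.
    """
    for event in PRIORITY_CHAIN:
        if event in active_events and event in SKILLS:
            return SKILLS[event](obs)
    return routine_skill(obs)


obs = {"queue_ns": 3, "queue_ew": 8, "emergency_phase": 0, "bus_phase": 2}
phase_routine = dispatch(set(), obs)
phase_transit = dispatch({"transit"}, obs)
phase_mixed = dispatch({"transit", "emergency"}, obs)
```

Because the dispatcher only looks skills up by event name, adding or replacing one skill in this sketch does not require touching the others, which is the property that lets composition happen without retraining.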
📝 Abstract
Traffic signal control (TSC) requires strategies that are both effective and interpretable for deployment, yet reinforcement learning produces opaque neural policies while program synthesis depends on restrictive domain-specific languages. We present SignalClaw, a framework that uses large language models (LLMs) as evolutionary skill generators to synthesize and refine interpretable control skills for adaptive TSC. Each skill includes a rationale, selection guidance, and executable code, making policies human-inspectable and self-documenting. At each generation, evolution signals derived from simulation metrics, such as queue percentiles, delay trends, and stagnation, are translated into natural language feedback to guide improvement. SignalClaw also introduces event-driven compositional evolution: an event detector identifies emergency vehicles, transit priority, incidents, and congestion via TraCI, and a priority dispatcher selects specialized skills. Each skill is evolved independently, and a priority chain enables runtime composition without retraining. We evaluate SignalClaw on routine and event-injected SUMO scenarios against four baselines. On routine scenarios, it achieves an average delay of 7.8 to 9.2 seconds, within 3 to 10 percent of the best method, with low variance across random seeds. Under event scenarios, it yields the lowest emergency delay (11.2 to 18.5 seconds, versus 42.3 to 72.3 for MaxPressure and 78.5 to 95.3 for DQN) and the lowest transit person delay (9.8 to 11.5 seconds, versus 38.7 to 45.2 for MaxPressure). In mixed events, the dispatcher composes skills effectively while maintaining stable overall delay. The evolved skills progress from simple linear rules to conditional strategies with multi-feature interactions, while remaining fully interpretable and directly modifiable by traffic engineers.
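The step in which simulation metrics become natural language feedback can be sketched as follows. This is an illustrative assumption about how such a translation might look, not the paper's prompt format: the function names, the nearest-rank percentile, the stagnation test, its window and tolerance, and the wording of the messages are all hypothetical.

```python
import math
from typing import List


def percentile(values: List[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def has_stagnated(delays: List[float], window: int = 2, tol: float = 0.05) -> bool:
    """True if the best delay in the last `window` generations is not
    better (by more than `tol`) than the best delay seen before them."""
    if len(delays) <= window:
        return False
    return min(delays[-window:]) >= min(delays[:-window]) - tol


def build_feedback(queues: List[float], delays: List[float], window: int = 2) -> str:
    """Turn per-generation metrics into a short text for the LLM prompt.

    `queues` holds queue lengths sampled in the latest simulation run;
    `delays` holds the average delay of each generation so far.
    """
    parts = [f"The 95th-percentile queue was {percentile(queues, 95):.1f} vehicles."]
    if len(delays) >= 2:
        change = delays[-1] - delays[-2]
        if change > 0:
            parts.append(f"Average delay rose by {change:.1f} s since the last generation.")
        elif change < 0:
            parts.append(f"Average delay fell by {abs(change):.1f} s since the last generation.")
        else:
            parts.append("Average delay was unchanged since the last generation.")
    if has_stagnated(delays, window):
        parts.append(
            f"No improvement in the last {window} generations; "
            "try a structurally different rule."
        )
    return " ".join(parts)


feedback_stalled = build_feedback([1, 2, 3, 4, 20], [10.0, 9.0, 9.0, 9.0])
feedback_improving = build_feedback([1, 2, 3, 4, 20], [10.0, 9.0, 8.0, 7.0])
```

In this sketch the returned string would be appended to the prompt that asks the model to revise a skill, so the model sees a tail-queue statistic, the direction of the delay trend, and an explicit nudge when progress stalls.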
Problem

Research questions and friction points this paper is trying to address.

Traffic Signal Control
Interpretability
Reinforcement Learning
Program Synthesis
Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided evolution
interpretable traffic signal control
event-driven composition
program synthesis
adaptive TSC