🤖 AI Summary
Problem: Safety-critical systems operating in dynamic environments require continuous, adaptive decision-making without compromising reliability, interpretability, or generalizability.
Method: We propose TAPA (Training-free Adaptation of Programmatic Agents), a framework that uses large language models (LLMs) as coordinators of the symbolic action space. The LLM synthesizes and refines a separate modular program for each high-level action, called a logical primitive, at runtime. This decouples high-level policy intent from low-level execution and requires no fine-tuning or retraining. TAPA supports adaptation along two dimensions, policy and action, and lets agents evolve at runtime through symbolic logic, a programmable agent architecture, and real-time feedback.
Results: In autonomous DDoS defense, TAPA achieves 77.7% network uptime with near-perfect detection accuracy under unknown dynamics. In swarm formation control, it preserves runtime consensus under environmental and adversarial disturbances where baseline methods fail. It enhances system reliability, interpretability, and cross-domain generalization while preserving formal guarantees inherent to symbolic reasoning.
📝 Abstract
Autonomous agents in safety-critical applications must continuously adapt to dynamic conditions without compromising performance and reliability. This work introduces TAPA (Training-free Adaptation of Programmatic Agents), a novel framework that positions large language models (LLMs) as intelligent moderators of the symbolic action space. Unlike prior programmatic agents that typically generate a monolithic policy program or rely on fixed symbolic action sets, TAPA synthesizes and adapts modular programs for individual high-level actions, referred to as logical primitives. By decoupling strategic intent from execution, TAPA enables meta-agents to operate over an abstract, interpretable action space while the LLM dynamically generates, composes, and refines symbolic programs tailored to each primitive. Extensive experiments across cybersecurity and swarm intelligence domains validate TAPA's effectiveness. In autonomous DDoS defense scenarios, TAPA achieves 77.7% network uptime while maintaining near-perfect detection accuracy in unknown dynamic environments. In swarm intelligence formation control under environmental and adversarial disturbances, TAPA consistently preserves consensus at runtime where baseline methods fail completely. This work promotes a paradigm shift for autonomous system design in evolving environments, from policy adaptation to dynamic action adaptation.
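The separation between a meta-agent that chooses among named logical primitives and a synthesizer that rewrites the program behind one primitive can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`PrimitiveLibrary`, `meta_policy`, `stub_synthesizer`, `run_step`), the traffic model, and all numbers are hypothetical, and the LLM call is replaced by a fixed lookup so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

State = Dict[str, float]
Program = Callable[[State], State]


def observe(state: State) -> State:
    # Primitive program that leaves traffic untouched.
    return dict(state)


def rate_limit_v1(state: State) -> State:
    # Blunt first program: halves all traffic, legitimate included.
    return {**state, "attack": state["attack"] * 0.5, "legit": state["legit"] * 0.5}


def rate_limit_v2(state: State) -> State:
    # Refined program: throttles attack traffic only.
    return {**state, "attack": state["attack"] * 0.1}


def stub_synthesizer(primitive: str, feedback: str) -> Program:
    # Hypothetical stand-in for an LLM call that would generate a new
    # program from the primitive name and textual runtime feedback.
    catalogue: Dict[str, Program] = {"rate_limit": rate_limit_v2}
    return catalogue[primitive]


@dataclass
class PrimitiveLibrary:
    programs: Dict[str, Program]
    synthesizer: Callable[[str, str], Program]
    revisions: Dict[str, int] = field(default_factory=dict)

    def execute(self, primitive: str, state: State) -> State:
        return self.programs[primitive](state)

    def adapt(self, primitive: str, feedback: str) -> None:
        # Swap the program behind one primitive; the action space seen
        # by the meta-agent (the set of primitive names) is unchanged.
        self.programs[primitive] = self.synthesizer(primitive, feedback)
        self.revisions[primitive] = self.revisions.get(primitive, 0) + 1


def meta_policy(state: State) -> str:
    # High-level intent only: pick a primitive by name.
    return "rate_limit" if state["attack"] > 10 else "observe"


def run_step(library: PrimitiveLibrary, state: State, min_legit: float) -> Tuple[str, State]:
    primitive = meta_policy(state)
    result = library.execute(primitive, state)
    if result["legit"] < min_legit:
        # Runtime feedback triggers action-level adaptation, not a policy change.
        library.adapt(primitive, "legitimate traffic dropped below threshold")
        result = library.execute(primitive, state)
    return primitive, result


if __name__ == "__main__":
    lib = PrimitiveLibrary({"observe": observe, "rate_limit": rate_limit_v1}, stub_synthesizer)
    print(run_step(lib, {"attack": 100.0, "legit": 50.0}, min_legit=40.0))
    print(lib.revisions)
```

In this sketch the meta-policy never changes; only the program bound to `rate_limit` is replaced after feedback, which is the kind of action-level adaptation the abstract contrasts with policy adaptation.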