🤖 AI Summary
Black-box neural network policies for dynamical system control suffer from poor interpretability and verifiability. Method: this paper proposes a large language model (LLM)-guided symbolic program evolution framework that automatically synthesizes interpretable, verifiable control policies written in Python. The approach combines a pre-trained LLM, evolutionary search, and physics-based simulation evaluation to generate transparent controllers as standard, executable code. Contribution/Results: to the authors' knowledge, this is the first work to use LLMs to guide the evolution of control policies in symbolic program space, balancing expressive power with human readability. Evaluated on the pendulum swing-up and ball-in-cup tasks, the method achieves strong control performance while remaining transparent, verifiable, and easy to adapt. The open-sourced implementation demonstrates advantages over conventional neural policies in explainability, runtime verifiability, and ease of adaptation across control tasks.
📝 Abstract
The combination of Large Language Models (LLMs), systematic evaluation, and evolutionary algorithms has enabled breakthroughs in combinatorial optimization and scientific discovery. We propose to extend this powerful combination to the control of dynamical systems, generating interpretable control policies capable of complex behaviors. With our novel method, we represent control policies as programs in standard languages like Python. We evaluate candidate controllers in simulation and evolve them using a pre-trained LLM. Unlike conventional learning-based control techniques, which rely on black-box neural networks to encode control policies, our approach enhances transparency and interpretability. We still take advantage of the power of large AI models, but we leverage it during the policy-design phase, ensuring that all system components remain interpretable and easily verifiable at runtime. Additionally, the use of standard programming languages makes it straightforward for humans to fine-tune or adapt the controllers based on their expertise and intuition. We illustrate our method through its application to the synthesis of interpretable control policies for the pendulum swing-up and ball-in-cup tasks. We make the code available at https://github.com/muellerlab/synthesizing_interpretable_control_policies.git
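The loop the abstract describes (candidate policies as Python source, scored in simulation, evolved by an LLM) can be sketched minimally as follows. This is an illustrative toy, not the released implementation: `evaluate`, `propose_variant`, and the gain-perturbation "mutation" (a stand-in for the pre-trained LLM's code edits) are all hypothetical names, and the pendulum model is deliberately simplified.

```python
# Minimal sketch of LLM-guided policy evolution: policies are Python source
# strings, scored on a toy pendulum swing-up, and mutated by a stub that
# stands in for the pre-trained LLM. All names here are hypothetical.
import math
import random

def evaluate(policy_src, steps=200, dt=0.05):
    """Score a candidate policy program on a toy pendulum (theta = 0 is upright)."""
    namespace = {"math": math}
    exec(policy_src, namespace)            # compile the candidate program
    policy = namespace["policy"]
    theta, omega = math.pi - 0.5, 0.0      # start near the hanging position
    reward = 0.0
    for _ in range(steps):
        u = max(-2.0, min(2.0, policy(theta, omega)))   # clipped torque
        omega += (9.8 * math.sin(theta) + u) * dt       # simplified dynamics
        theta += omega * dt
        reward += math.cos(theta)          # +1 per step when upright
    return reward / steps

def propose_variant(parent_src, rng):
    """Placeholder for the LLM step: here it just perturbs a gain constant.
    The real method would prompt the LLM to rewrite the parent program."""
    gain = rng.uniform(0.5, 5.0)
    return parent_src.replace("GAIN", f"{gain:.3f}")

TEMPLATE = "def policy(theta, omega):\n    return GAIN * omega\n"

def evolve(generations=20, seed=0):
    """Simple (1+1) evolutionary search over policy programs."""
    rng = random.Random(seed)
    best_src = propose_variant(TEMPLATE, rng)
    best_score = evaluate(best_src)
    for _ in range(generations):
        child = propose_variant(TEMPLATE, rng)   # an LLM would edit best_src here
        score = evaluate(child)
        if score > best_score:
            best_src, best_score = child, score
    return best_src, best_score
```

Because each surviving candidate is plain Python source, the final controller can be read, edited, or verified directly, which is the interpretability argument the abstract makes.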