AI Summary
Existing large language model (LLM)-driven program evolution methods lack principled design guidance and convergence guarantees. This work reframes program search as sampling from a reward-guided target distribution and introduces a sequential Monte Carlo (SMC)-based automated evolution framework with three key mechanisms: adaptive parent resampling, a mutation mixture strategy with an acceptance criterion, and automatic convergence control. The approach comes with a finite-sample analysis that bounds the LLM-call budget needed to reach a target approximation error, uses fewer LLM calls, and outperforms existing evolutionary systems across diverse tasks, including mathematical discovery, algorithm optimization, symbolic regression, and end-to-end machine learning.
Abstract
LLM-driven program evolution has emerged as a powerful tool for automated scientific discovery, yet existing frameworks offer no principled guide for designing their individual components and provide no guarantee that the search converges. We introduce SMCEvolve, which recasts program search as sampling from a reward-tilted target distribution and approximates it with a Sequential Monte Carlo (SMC) sampler. From this view, three core mechanisms emerge as principled components: adaptive parent resampling, mixture of mutation with acceptance, and automatic convergence control. We further provide a finite-sample complexity analysis that bounds the LLM-call budget required to reach a target approximation error. Across math, algorithm efficiency, symbolic regression, and end-to-end ML research benchmarks, SMCEvolve surpasses state-of-the-art evolving systems while using fewer LLM calls under self-determined termination. The code is available at https://github.com/kongwanbianjinyu/SMCEvolve.
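To make the sampling view concrete, the following is a minimal toy sketch of a tempered SMC sampler, not the SMCEvolve implementation. Everything in it is an assumption made for illustration: a "program" is a single float, `reward` is a hypothetical score peaked at 3, `mutate` stands in for an LLM edit, the target at each step is taken to be proportional to a uniform prior times `exp(beta * reward)`, resampling is triggered by an effective-sample-size threshold, and the stopping rule is a simple plateau test on the weighted mean reward.

```python
import math
import random


def reward(x):
    """Hypothetical toy reward peaked at x = 3 (stand-in for a program score)."""
    return -(x - 3.0) ** 2


def mutate(x, rng):
    """Mixture of two symmetric proposals: mostly small edits, sometimes a large jump.
    Stand-in for an LLM mutation call."""
    scale = 0.2 if rng.random() < 0.8 else 2.0
    return x + rng.gauss(0.0, scale)


def smc_search(n_particles=64, betas=(0.0, 0.25, 0.5, 1.0, 2.0, 4.0),
               tol=1e-3, seed=0):
    rng = random.Random(seed)
    lo, hi = -10.0, 10.0  # support of the assumed uniform prior
    particles = [rng.uniform(lo, hi) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    prev_mean, calls = None, 0

    for beta_prev, beta in zip(betas, betas[1:]):
        # Reweight toward the next tempered target (log-space for stability).
        log_inc = [(beta - beta_prev) * reward(x) for x in particles]
        top = max(log_inc)
        unnorm = [w * math.exp(l - top) for w, l in zip(weights, log_inc)]
        total = sum(unnorm)
        weights = [u / total for u in unnorm]

        # Adaptive parent resampling: only when the effective sample size is low.
        ess = 1.0 / sum(w * w for w in weights)
        if ess < n_particles / 2:
            particles = rng.choices(particles, weights=weights, k=n_particles)
            weights = [1.0 / n_particles] * n_particles

        # Mutation with a Metropolis acceptance test against the current target.
        for i, x in enumerate(particles):
            proposal = mutate(x, rng)
            calls += 1  # one mutation corresponds to one (simulated) LLM call
            if not lo <= proposal <= hi:
                continue  # zero prior mass outside the support: reject
            log_ratio = beta * (reward(proposal) - reward(x))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                particles[i] = proposal

        # Illustrative stopping rule: halt once the weighted mean reward plateaus.
        mean = sum(w * reward(x) for w, x in zip(weights, particles))
        if prev_mean is not None and abs(mean - prev_mean) < tol:
            break
        prev_mean = mean

    return max(particles, key=reward), calls
```

Calling `best, calls = smc_search()` returns the highest-reward particle and the number of mutation calls spent. In an actual program-evolution setting the float would be replaced by source code, `reward` by a task evaluator, and `mutate` by an LLM prompt; the acceptance ratio would also need a correction term if the proposal were not symmetric, which an LLM-based mutation generally is not.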