SMCEvolve: Principled Scientific Discovery via Sequential Monte Carlo Evolution

📅 2026-05-14

🤖 AI Summary

Existing large language model (LLM)-driven program evolution methods offer no principled guide for designing their components and no guarantee that the search converges. This work reframes program search as sampling from a reward-tilted target distribution and approximates that distribution with a sequential Monte Carlo (SMC) sampler. Three mechanisms follow from this view: adaptive parent resampling, a mixture of mutations with an acceptance step, and automatic convergence control. A finite-sample complexity analysis bounds the LLM-call budget needed to reach a target approximation error. Across mathematical discovery, algorithm efficiency, symbolic regression, and end-to-end machine learning research benchmarks, the method surpasses state-of-the-art evolutionary systems while using fewer LLM calls under self-determined termination.
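One common way to write a reward-tilted target, given here only as an illustrative assumption since the summary does not state the paper's notation: take a base proposal distribution p_0 over programs x (for instance, what the LLM generates unprompted by reward), a reward r(x), and an inverse temperature β. The symbols p_0, r, β, w_i, and N are hypothetical names chosen for this sketch. The effective sample size (ESS) is the standard SMC diagnostic often used to decide when to resample.

```latex
% Assumed form of a reward-tilted target over programs x
\pi_\beta(x) \;\propto\; p_0(x)\,\exp\bigl(\beta\, r(x)\bigr)

% Incremental importance weight when moving from \beta_{t-1} to \beta_t
w_i^{(t)} \;=\; w_i^{(t-1)} \exp\bigl((\beta_t - \beta_{t-1})\, r(x_i)\bigr)

% Effective sample size of N weighted particles; 1 \le \mathrm{ESS} \le N
\mathrm{ESS} \;=\; \frac{\bigl(\sum_{i=1}^{N} w_i\bigr)^{2}}{\sum_{i=1}^{N} w_i^{2}}
```

Under this form, β = 0 recovers the base distribution, and larger β concentrates mass on high-reward programs.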

📝 Abstract

LLM-driven program evolution has emerged as a powerful tool for automated scientific discovery, yet existing frameworks offer no principled guide for designing their individual components and provide no guarantee that the search converges. We introduce SMCEvolve, which recasts program search as sampling from a reward-tilted target distribution and approximates it with a Sequential Monte Carlo (SMC) sampler. From this view, three core mechanisms emerge as principled components: adaptive parent resampling, mixture of mutation with acceptance, and automatic convergence control. We further provide a finite-sample complexity analysis that bounds the LLM-call budget required to reach a target approximation error. Across math, algorithm efficiency, symbolic regression, and end-to-end ML research benchmarks, SMCEvolve surpasses state-of-the-art evolving systems while using fewer LLM calls under self-determined termination. The code is available at https://github.com/kongwanbianjinyu/SMCEvolve.
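The three mechanisms named in the abstract can be illustrated with a toy SMC sampler. This is a minimal sketch under stated assumptions, not the SMCEvolve implementation: a real number stands in for a program, a Gaussian perturbation stands in for an LLM mutation call, and the reward, the doubling temperature schedule, the ESS threshold, and the stopping rule are all hypothetical choices made for illustration.

```python
import math
import random


def reward(x):
    """Toy reward with a single peak at x = 3 (stand-in for a program score)."""
    return -(x - 3.0) ** 2


def effective_sample_size(weights):
    """Standard ESS diagnostic: (sum w)^2 / sum w^2."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def smc_evolve(n=64, beta0=0.25, max_stages=20, ess_frac=0.5, tol=1e-2, seed=0):
    """Sample from a target proportional to uniform(lo, hi) * exp(beta * reward)."""
    rng = random.Random(seed)
    lo, hi = -10.0, 10.0
    xs = [rng.uniform(lo, hi) for _ in range(n)]  # initial population
    log_w = [0.0] * n
    beta_prev, beta = 0.0, beta0
    best = max(xs, key=reward)
    prev_mean = None
    calls = 0  # counts mutation proposals (stand-in for LLM calls)
    stage = 0
    for stage in range(1, max_stages + 1):
        # Reweight particles for the move from beta_prev to beta.
        log_w = [lw + (beta - beta_prev) * reward(x) for lw, x in zip(log_w, xs)]
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        # Adaptive parent resampling: only when the weights have degenerated.
        if effective_sample_size(w) < ess_frac * n:
            xs = rng.choices(xs, weights=w, k=n)
            log_w = [0.0] * n
            w = [1.0] * n
        # Mutation mixture: mostly small edits, occasionally large ones.
        # Both proposals are symmetric, so Metropolis-Hastings acceptance
        # reduces to the ratio of target densities.
        for i, x in enumerate(xs):
            scale = 0.3 if rng.random() < 0.8 else 3.0
            proposal = x + rng.gauss(0.0, scale)
            calls += 1
            if lo <= proposal <= hi:
                delta = beta * (reward(proposal) - reward(x))
                if rng.random() < math.exp(min(0.0, delta)):
                    xs[i] = proposal
        best = max(xs + [best], key=reward)
        # Hypothetical convergence control: stop once the weighted mean
        # reward changes by less than tol between stages.
        mean = sum(wi * reward(x) for wi, x in zip(w, xs)) / sum(w)
        if prev_mean is not None and abs(mean - prev_mean) < tol:
            break
        prev_mean = mean
        beta_prev, beta = beta, 2.0 * beta
    return best, stage, calls


if __name__ == "__main__":
    best, stages, calls = smc_evolve()
    print(f"best={best:.3f} stages={stages} proposals={calls}")
```

In this sketch the sampler decides for itself when to stop, so the number of proposals is an output rather than a fixed budget, which mirrors the "self-determined termination" described above only in spirit.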

Problem

Research questions and friction points this paper is trying to address.

scientific discovery

program evolution

convergence guarantee

LLM-driven automation

principled design

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sequential Monte Carlo

Program Evolution

LLM-driven Discovery