AI Summary
This work addresses the generalization bottleneck that static agents face in complex codebases because of phase mismatch. To overcome this limitation, the authors propose EvE, a decentralized evolutionary ensemble framework that co-evolves two populations: functional code solvers and the guidance states that steer the agents. In a synchronous race, agents are re-ranked by Elo ratings computed from the marginal gains they contribute to the current solver state, which lets the ensemble adapt as the search space shifts. The framework relies on stage-dependent agent adaptation, and controlled ablations show that a self-revising agent ensemble can break through the performance ceiling of static strategies and avoid phase mismatch. Applied to In-Context Operator Networks (ICON), EvE autonomously discovers a "rescale-then-interpolate" mechanism that enables reliable generalization across varying numbers of examples, and it outperforms variants driven by a fixed initial agent or a frozen "best-evolved" agent.
Abstract
We introduce Evolutionary Ensemble (EvE), a decentralized framework that organizes existing, highly capable coding agents into a live, co-evolving system for algorithmic discovery. Rather than reinventing the wheel within the "LLMs as optimizers" paradigm, EvE fixes the base agent substrate and focuses entirely on evolving the cumulative guidance and skills that dictate agent behaviors. By maintaining two co-evolving populations, namely functional code solvers and agent guidance states, the system evaluates agents through a synchronous race, updating their empirical Elo ratings based on the marginal gains they contribute to the current solver state. When applied to a research bottleneck in In-Context Operator Networks (ICON), EvE autonomously discovered a robust rescale-then-interpolate mechanism that enables reliable example-count generalization. Crucially, controlled ablations reveal the absolute necessity of stage-dependent agent adaptation to navigate the shifting search landscapes of complex codebases. Compared to variants driven by a fixed initial agent or even a frozen "best-evolved" agent, EvE uniquely avoids phase mismatch, demonstrating that organizing agents into a self-revising ensemble is the fundamental driver for breaking through static performance ceilings.
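The abstract does not spell out the rating rule, so the following is only a minimal sketch of how Elo ratings might be updated after one synchronous race. It assumes the standard Elo expected-score formula applied to every pair of agents, with the agent that contributed the larger marginal gain treated as the winner of that pairing. The function names, the K-factor of 32, the agent names, and the gain values are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def expected_score(r_a, r_b):
    """Standard Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(ratings, gains, k=32.0):
    """Return new ratings after one race (assumed pairwise Elo, not the paper's rule).

    ratings: dict mapping agent name -> current Elo rating
    gains:   dict mapping agent name -> marginal gain over the current solver
             state (hypothetical scalar metric, larger is better)
    """
    deltas = {name: 0.0 for name in ratings}
    for a, b in combinations(ratings, 2):
        # Score the pairing: 1 for a win, 0 for a loss, 0.5 for a tie.
        if gains[a] > gains[b]:
            s_a = 1.0
        elif gains[a] < gains[b]:
            s_a = 0.0
        else:
            s_a = 0.5
        e_a = expected_score(ratings[a], ratings[b])
        # Accumulate deltas so every pairing uses the pre-race ratings.
        deltas[a] += k * (s_a - e_a)
        deltas[b] += k * ((1.0 - s_a) - (1.0 - e_a))
    return {name: ratings[name] + deltas[name] for name in ratings}


# Hypothetical race: three agents start level and propose edits to the same solver.
ratings = {"agent_a": 1000.0, "agent_b": 1000.0, "agent_c": 1000.0}
gains = {"agent_a": 0.12, "agent_b": 0.03, "agent_c": 0.0}
new_ratings = update_elo(ratings, gains)
print(new_ratings)
```

Because all deltas are computed from the pre-race ratings, the update is order-independent and the total rating mass is conserved. Whether EvE uses a pairwise decomposition like this or some other scoring of marginal gains is not stated in the abstract.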