FlashEvolve: Accelerating Agent Self-Evolution with Asynchronous Stage Orchestration

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the high wall-clock latency of self-evolution in large language model (LLM) agents, which stems from synchronized stage execution and load imbalance within each LLM-heavy stage. The authors propose asynchronous stage orchestration: workers and queues decouple the evolutionary stages so that different stages and steps can overlap. To handle the data staleness that asynchrony introduces, the framework tracks artifact versions and updates, discards, or patches stale artifacts, relying on the fact that language-space staleness is inspectable and repairable. Speculative stage completion and adaptive workflow control further improve throughput and token efficiency. On GEPA workloads, FlashEvolve improves proposal throughput over synchronous GEPA by 3.5× on local vLLM and 4.9× on API serving. The same design also applies to ACE and Meta-Harness.
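The worker-and-queue decoupling described above can be illustrated with a minimal sketch. This is not FlashEvolve's implementation: the stage names (`propose`, `evaluate`), the `stage_worker` helper, and the `None` shutdown sentinel are all assumptions made for illustration, and the `asyncio.sleep` calls stand in for LLM requests.

```python
import asyncio


async def stage_worker(fn, inbox, outbox):
    """Hypothetical stage worker: pull from inbox, apply fn, push to outbox.

    A None item is treated as a shutdown sentinel and forwarded downstream.
    """
    while True:
        item = await inbox.get()
        if item is None:
            await outbox.put(None)
            break
        await outbox.put(await fn(item))


async def propose(seed):
    # Stand-in for an LLM call that drafts a new artifact (assumption).
    await asyncio.sleep(0.01)
    return f"proposal({seed})"


async def evaluate(proposal):
    # Stand-in for an LLM-heavy evaluation stage (assumption).
    await asyncio.sleep(0.01)
    return f"score[{proposal}]"


async def run_pipeline(seeds):
    q_in, q_mid, q_out = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    # Each stage runs as its own task, so evaluate can start on the first
    # proposal while propose is still working on the second.
    workers = [
        asyncio.create_task(stage_worker(propose, q_in, q_mid)),
        asyncio.create_task(stage_worker(evaluate, q_mid, q_out)),
    ]
    for seed in seeds:
        await q_in.put(seed)
    await q_in.put(None)

    results = []
    while True:
        result = await q_out.get()
        if result is None:
            break
        results.append(result)
    await asyncio.gather(*workers)
    return results


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(["a", "b"])))
```

With one worker per stage, results come out in submission order. Running several workers per stage would increase overlap but would no longer guarantee ordering, which is one reason an asynchronous design needs explicit version tracking.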

📝 Abstract

LLM-based evolution has emerged as a promising way to improve agents by refining non-parametric artifacts, but its wall-clock cost remains a major bottleneck. We identify that this cost comes from synchronized stage execution and imbalance inside each LLM-heavy stage. We present FlashEvolve, an efficient framework that replaces synchronized execution with asynchronous workers and queues, allowing different stages and steps to overlap. To handle data staleness introduced by asynchrony, FlashEvolve tracks artifact versions and applies different policies to update, discard, or patch stale artifacts. Unlike weight-space staleness in asynchronous RL, language-space staleness is inspectable and repairable: a stale artifact is not just delayed work, but readable evidence that the LLM can reflect on, revise, and turn into useful evolution signal. FlashEvolve further improves throughput and token efficiency with speculative stage completion and adaptive workflow control. On GEPA workloads, FlashEvolve improves proposal throughput by $3.5\times$ on local vLLM and $4.9\times$ on API serving over synchronous GEPA. The same design also applies to ACE and Meta-Harness.
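The update, discard, or patch choice for stale artifacts can be sketched as a small version-gap policy. The abstract does not specify how FlashEvolve selects among the three actions, so the rule below (a version gap compared against a hypothetical `max_gap` threshold), the `Proposal` type, and the `stub_repair` function are assumptions for illustration only. In a real system the repair step would be an LLM call that reads the stale proposal and revises it against the current artifact.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Proposal:
    text: str
    base_version: int  # artifact version the proposal was generated against


def resolve(
    proposal: Proposal,
    current_version: int,
    current_text: str,
    repair: Callable[[str, str], str],
    max_gap: int = 2,  # hypothetical threshold, not taken from the paper
) -> Tuple[str, Optional[str]]:
    """Decide what to do with a proposal that may be stale (assumed policy)."""
    gap = current_version - proposal.base_version
    if gap < 0:
        raise ValueError("proposal is based on a version newer than current")
    if gap == 0:
        # Fresh: apply the proposal as the next artifact version.
        return ("update", proposal.text)
    if gap <= max_gap:
        # Mildly stale: repair it in language space instead of dropping it.
        return ("patch", repair(proposal.text, current_text))
    # Too stale to be worth repairing.
    return ("discard", None)


def stub_repair(stale_text: str, current_text: str) -> str:
    # Placeholder for an LLM reflection step (assumption).
    return f"{current_text} + revised({stale_text})"


if __name__ == "__main__":
    p = Proposal(text="add step-by-step hint", base_version=3)
    print(resolve(p, current_version=4, current_text="prompt v4", repair=stub_repair))
```

The point of the sketch is the contrast the abstract draws with weight-space staleness: because the stale artifact is readable text, the patch branch can reuse it as evidence rather than treating it only as delayed work.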

Problem

Research questions and friction points this paper is trying to address.

LLM-based evolution

wall-clock cost

synchronized execution

stage imbalance

agent self-evolution

Innovation

Methods, ideas, or system contributions that make the work stand out.

asynchronous orchestration

agent self-evolution

artifact staleness handling