🤖 AI Summary
This work addresses the challenge that current large language model agents struggle to evolve continuously through lifelong interaction, primarily because they rely solely on successful trajectories while neglecting failure experiences, and because retrieval efficiency degrades and the context window overloads as experience accumulates. To overcome these limitations, the paper proposes a novel self-evolution framework that integrates a contrastive reflection strategy with a self-consolidation mechanism. The former extracts error patterns and reusable insights by contrasting successful and failed trajectories, while the latter distills non-parametric textual experiences into compact, learnable parameters internalized within the model's latent space. Experiments demonstrate that this approach significantly enhances the agent's long-term performance and stability without expanding the context window, effectively mitigating experience noise and efficiency bottlenecks.
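The contrastive reflection idea can be illustrated with a toy sketch. The paper's actual strategy presumably prompts an LLM to compare trajectories in natural language; here, as a purely hypothetical stand-in, trajectories are lists of action strings, error-prone patterns are actions seen only in failures, and reusable insights are actions common to every success. The function name and action labels are invented for illustration.

```python
def contrastive_reflection(successes, failures):
    """Toy contrast of successful vs. failed trajectories.

    Each trajectory is a list of action strings. This is an illustrative
    stand-in for the paper's LLM-driven reflection, not its actual method.
    """
    succ_sets = [set(t) for t in successes]
    fail_sets = [set(t) for t in failures]
    success_actions = set().union(*succ_sets) if succ_sets else set()
    failure_actions = set().union(*fail_sets) if fail_sets else set()
    # Actions that appear only in failed runs are flagged as error-prone.
    error_patterns = failure_actions - success_actions
    # Actions shared by every successful run are kept as reusable insights.
    insights = set.intersection(*succ_sets) if succ_sets else set()
    return {"error_patterns": error_patterns, "insights": insights}


# Hypothetical web-shopping trajectories for illustration only.
successes = [
    ["open_page", "search_item", "add_to_cart", "checkout"],
    ["open_page", "search_item", "filter_price", "add_to_cart", "checkout"],
]
failures = [
    ["open_page", "add_to_cart", "checkout_without_login"],
]

report = contrastive_reflection(successes, failures)
print(report["error_patterns"])  # → {'checkout_without_login'}
```

The point of the contrast is that a failure-only signal (`checkout_without_login`) becomes an explicit pattern to avoid, which success-only retrieval would never surface.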
📝 Abstract
While large language model (LLM) agents have demonstrated impressive problem-solving capabilities, they typically operate as static systems, lacking the ability to evolve through lifelong interaction. Existing attempts to bridge this gap primarily rely on retrieving successful past trajectories as demonstrations. However, this paradigm faces two critical limitations. First, by focusing solely on success, agents overlook the rich pedagogical value embedded in failed attempts, preventing them from identifying and avoiding recurrent pitfalls. Second, continually accumulating textual experiences not only increases retrieval latency but also inevitably introduces noise and exhausts the finite context window of current LLMs. To address these challenges, we propose a novel self-evolving framework for LLM agents that introduces a complementary evolution mechanism. First, a contrastive reflection strategy explicitly summarizes error-prone patterns and captures reusable insights. Second, a self-consolidation mechanism distills non-parametric textual experience into compact learnable parameters, enabling the agent to internalize extensive historical experience directly into its latent space. Extensive experiments demonstrate the advantages of our method in long-term agent evolution.
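The self-consolidation idea — trading an ever-growing textual experience store for a fixed-size parametric one — can be sketched with a toy example. The real mechanism presumably trains latent parameters against the model; here, as an invented stand-in, each experience string is mapped to a small deterministic vector and the set is mean-pooled into one fixed-size "parameter" vector, so memory stays constant no matter how much experience accumulates. All names and the embedding scheme are illustrative assumptions.

```python
import hashlib

DIM = 8  # compact parameter size (illustrative, not from the paper)


def embed(text):
    """Deterministic toy embedding: first DIM hash bytes scaled to [0, 1]."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(DIM)]


def consolidate(experiences):
    """Distill a list of textual experiences into one fixed-size vector.

    A stand-in for internalizing experience as learnable parameters: the
    output size is DIM regardless of how many experiences accumulate.
    """
    vectors = [embed(e) for e in experiences]
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(DIM)]


params = consolidate([
    "always log in before checkout",
    "filter by price before adding to cart",
    "avoid re-searching after adding to cart",
])
print(len(params))  # → 8, independent of the number of experiences
```

The design point the sketch captures: retrieval over raw text grows with history, whereas a consolidated representation keeps both storage and per-query cost bounded.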