SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents

📅 2025-08-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LLM-based agents excel at complex reasoning and tool use, yet fail to effectively leverage the feedback embedded in their interaction trajectories. Traditional search methods—e.g., Monte Carlo Tree Search (MCTS)—overlook inter-trajectory dependencies and suffer from monotonous search spaces, leading to redundant reasoning and suboptimal solutions. To address this, we propose a three-stage self-evolution mechanism—“Revise–Recombine–Refine”—that models trajectory correlations via cross-trajectory heuristic backtracking and iterative optimization, thereby enhancing search diversity and coherence. Our method tightly integrates the LLM’s reasoning capability with environmental interaction feedback to enable continual refinement of multi-step reasoning processes. Evaluated on the SWE-bench Verified benchmark, our approach achieves up to a 55% relative performance gain over prior open-source agents, establishing a new state of the art.

📝 Abstract
Large Language Model (LLM)-based agents have recently shown impressive capabilities in complex reasoning and tool use via multi-step interactions with their environments. While these agents have the potential to tackle complicated tasks, their problem-solving process, i.e., the agent's interaction trajectory leading to task completion, remains underexploited. These trajectories contain rich feedback that can guide agents toward correct solutions. Although prevailing approaches, such as Monte Carlo Tree Search (MCTS), can effectively balance exploration and exploitation, they ignore the interdependence among trajectories and lack diversity in their search spaces, which leads to redundant reasoning and suboptimal outcomes. To address these challenges, we propose SE-Agent, a Self-Evolution framework that enables agents to optimize their reasoning processes iteratively. Our approach revisits and enhances former pilot trajectories through three key operations: revision, recombination, and refinement. This evolutionary mechanism yields two critical advantages: (1) it expands the search space beyond local optima by intelligently exploring diverse solution paths guided by previous trajectories, and (2) it leverages cross-trajectory inspiration to efficiently enhance performance while mitigating the impact of suboptimal reasoning paths. Through these mechanisms, SE-Agent achieves continuous self-evolution that incrementally improves reasoning quality. We evaluate SE-Agent on SWE-bench Verified to resolve real-world GitHub issues. Experimental results across five strong LLMs show that integrating SE-Agent delivers up to a 55% relative improvement, achieving state-of-the-art performance among all open-source agents on SWE-bench Verified. Our code and demonstration materials are publicly available at https://github.com/wanghuacan/SE-Agent.
Problem

Research questions and friction points this paper is trying to address.

Optimizing multi-step reasoning in LLM-based agents
Enhancing trajectory diversity and interdependence in search
Improving reasoning quality through self-evolution mechanisms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Evolution framework optimizes reasoning iteratively
Enhances trajectories via revision, recombination, refinement
Expands search space with diverse solution paths
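The revise–recombine–refine loop can be illustrated with a minimal sketch. Everything here is a toy stand-in, not the paper's method: trajectories are integer sequences, `score` is a numeric proxy for environment feedback, and `revise`/`recombine`/`refine` are simple mutation, splicing, and selection operators rather than the LLM-driven operations SE-Agent actually uses.

```python
import random

def revise(traj, rng):
    # Revise: perturb one step of a pilot trajectory (toy mutation operator).
    t = list(traj)
    i = rng.randrange(len(t))
    t[i] += rng.choice([-1, 1])
    return t

def recombine(a, b, rng):
    # Recombine: splice a prefix of one trajectory with the suffix of another,
    # mimicking cross-trajectory inspiration.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def refine(pool, score):
    # Refine: keep the highest-scoring half of the pool as the next generation.
    return sorted(pool, key=score, reverse=True)[: max(2, len(pool) // 2)]

def self_evolve(seed_trajs, score, iterations=5, seed=0):
    """Iterate revise -> recombine -> refine over a pool of pilot trajectories."""
    rng = random.Random(seed)
    pool = list(seed_trajs)
    for _ in range(iterations):
        revised = [revise(t, rng) for t in pool]
        crossed = [recombine(rng.choice(pool), rng.choice(pool), rng)
                   for _ in range(len(pool))]
        pool = refine(pool + revised + crossed, score)
    return max(pool, key=score)

best = self_evolve([[0, 0, 0], [1, 0, 1]], score=sum, iterations=10)
```

Because `refine` selects from the union of the old pool and its offspring, the best score in the pool is non-decreasing across iterations; the real framework replaces these toy operators with LLM-guided edits evaluated against environment feedback.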