EvolveSearch: An Iterative Self-Evolving Search Agent

📅 2025-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Large language models (LLMs) for open-domain web search face three obstacles: heavy reliance on human-annotated data, scarcity of supervised fine-tuning (SFT) examples, and rapid convergence in reinforcement learning (RL), which together limit data utilization efficiency. Method: We propose EvolveSearch, a novel SFT–RL co-evolutionary framework that improves iteratively without any human-annotated reasoning data. Guided by multi-hop question answering (MHQA) tasks, the framework autonomously refines a search agent’s capabilities within open web environments. Contribution/Results: Evaluated on seven MHQA benchmarks, the method improves from one iteration to the next and ultimately outperforms the current state of the art by an average of 4.7%. It offers a scalable paradigm for LLM-driven autonomous search that needs no human-annotated reasoning data and combines SFT and RL for iterative capability evolution.

📝 Abstract
The rapid advancement of large language models (LLMs) has transformed agentic information-seeking through the integration of tools such as search engines and web browsers. However, current mainstream approaches for making LLMs proficient at web search face significant challenges: supervised fine-tuning (SFT) struggles with data production in open-search domains, while reinforcement learning (RL) converges quickly, which limits its data utilization efficiency. To address these issues, we propose EvolveSearch, a novel iterative self-evolution framework that combines SFT and RL to enhance agentic web search capabilities without any external human-annotated reasoning data. Extensive experiments on seven multi-hop question-answering (MHQA) benchmarks demonstrate that EvolveSearch consistently improves performance across iterations, ultimately achieving an average improvement of 4.7% over the current state of the art and opening the door to self-evolving agentic capabilities in open web search domains.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM web search without human-annotated data
Improving data utilization efficiency in open-search domains
Achieving self-evolution in agentic web search capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative self-evolution framework combining SFT and RL
No external human-annotated reasoning data required
Improves performance across iterations on seven multi-hop QA benchmarks, averaging 4.7% over the prior state of the art
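
The iterative SFT and RL loop described above can be pictured with a toy Python sketch. It is an illustration only, not the authors' implementation: the summary does not say how the two phases exchange data, so the reward-filtering step, the scalar `skill` standing in for a model, and every function name below are hypothetical assumptions.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy stand-in for a search agent; `skill` is a hypothetical scalar, not a real model."""
    skill: float = 0.2
    history: list = field(default_factory=list)  # number of SFT examples used per iteration


def rl_phase(agent, questions, rng):
    """Assumed RL phase: roll out on each question and record a 0/1 answer reward."""
    return [(q, 1.0 if rng.random() < agent.skill else 0.0) for q in questions]


def select_sft_data(rollouts, threshold=1.0):
    """Assumed filter: keep only rollouts whose reward reaches the threshold."""
    return [q for q, reward in rollouts if reward >= threshold]


def sft_phase(agent, sft_data, step=0.1):
    """Assumed SFT phase: more self-generated data gives a larger (capped) toy gain."""
    gain = step * min(1.0, len(sft_data) / 10)
    return Agent(skill=min(1.0, agent.skill + gain),
                 history=agent.history + [len(sft_data)])


def evolve(agent, questions, iterations=3, seed=0):
    """Alternate the two phases; no human-annotated reasoning data enters the loop."""
    rng = random.Random(seed)
    for _ in range(iterations):
        rollouts = rl_phase(agent, questions, rng)
        sft_data = select_sft_data(rollouts)
        agent = sft_phase(agent, sft_data)
    return agent


if __name__ == "__main__":
    final = evolve(Agent(), [f"q{i}" for i in range(20)])
    print(final.skill, final.history)
```

In this sketch the only supervision is the answer-level reward, which mirrors the claim that no external human-annotated reasoning data is required; everything else is a placeholder for real training code.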