Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

📅 2025-11-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LLM agents trained with reinforcement learning depend on human-annotated data, which limits scalability and tethers them to the bounds of human knowledge; prior self-evolution frameworks are typically confined to single-turn interactions and the model's static capabilities, so they cannot support tool-augmented reasoning or a progressively harder curriculum.

Method: Agent0 is a dual-agent symbiotic-competition framework in which a single base LLM is instantiated as both a curriculum agent and an executor agent. Over multiple rounds of co-evolution, the curriculum agent autonomously generates increasingly challenging task curricula while the executor, equipped with external tools for mathematical computation and code execution, learns to solve them. Evolution is fully autonomous, driven solely by intrinsic rewards without any human-provided data.

Contribution/Results: On Qwen3-8B-Base, the framework improves mathematical reasoning accuracy by 18% and general reasoning performance by 24%, advancing the autonomous learning and complex problem-solving capabilities of LLM agents.

📝 Abstract
Large Language Model (LLM) Agents, often trained with Reinforcement Learning (RL), are constrained by a dependency on human-curated data, limiting scalability and tethering AI to human knowledge. Existing self-evolution frameworks offer an alternative but are typically restricted by the model's inherent capabilities and single-round interactions, hindering the development of complex curricula involving tool use or dynamic reasoning. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data through multi-step co-evolution and seamless tool integration. Agent0 establishes a symbiotic competition between two agents initialized from the same base LLM: a curriculum agent that proposes increasingly challenging frontier tasks, and an executor agent that learns to solve them. We integrate external tools to enhance the executor's problem-solving capacity; this improvement, in turn, pressures the curriculum agent to construct more complex, tool-aware tasks. Through this iterative process, Agent0 establishes a self-reinforcing cycle that continuously produces high-quality curricula. Empirically, Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks. Code is available at https://github.com/aiming-lab/Agent0.
Problem

Research questions and friction points this paper is trying to address.

Developing autonomous agents without human-curated data dependency
Overcoming limitations of single-round interactions in self-evolution frameworks
Enhancing reasoning capabilities through tool-integrated multi-step co-evolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-evolving agents via multi-step co-evolution
Seamless tool integration for enhanced problem-solving
Symbiotic competition between curriculum and executor agents
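
The symbiotic competition listed above can be sketched as a loop. The toy simulation below is illustrative only: `propose_task`, `execute`, and the shaped `intrinsic_reward` are hypothetical stand-ins, not the paper's actual formulation, which trains both agents with RL from a shared base LLM. It shows the core dynamic: the reward peaks for frontier tasks (solved about half the time), pushing the curriculum to track the executor's growing skill.

```python
import random

def propose_task(difficulty: float) -> dict:
    """Curriculum agent (toy): emit a task near the current difficulty."""
    a = random.randint(1, int(10 * difficulty) + 1)
    b = random.randint(1, 9)
    return {"question": f"{a} * {b}", "answer": a * b, "difficulty": difficulty}

def execute(task: dict, skill: float) -> bool:
    """Executor agent (toy): solve probability falls as task difficulty
    exceeds current skill."""
    p_solve = min(1.0, skill / task["difficulty"])
    return random.random() < p_solve

def intrinsic_reward(success_rate: float) -> float:
    """Reward is maximal at a ~50% solve rate: tasks that are always
    solved or never solved teach the executor nothing."""
    return 1.0 - abs(success_rate - 0.5) * 2.0

random.seed(0)
skill, difficulty = 1.0, 1.0
for round_idx in range(5):
    results = [execute(propose_task(difficulty), skill) for _ in range(100)]
    rate = sum(results) / len(results)
    reward = intrinsic_reward(rate)
    skill += 0.2 * rate            # executor improves by solving tasks
    difficulty += 0.3 * reward     # curriculum pushes toward the frontier
    print(f"round={round_idx} solve_rate={rate:.2f} reward={reward:.2f}")
```

The shaped reward is one common heuristic for frontier curricula; Agent0's actual intrinsic reward and multi-step tool integration are described in the paper.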