Alita-G: Self-Evolving Generative Agent for Agent Generation

📅 2025-10-27
🤖 AI Summary
Problem: Existing self-evolving agents are largely confined to prompt rewriting or failure retries, and so fall short of turning a general-purpose agent into a high-precision domain expert. Method: This paper introduces a self-evolving framework built on the Model Context Protocol (MCP) that supports the generation, abstraction, and reuse of domain expertise. It combines memory augmentation, tool invocation, multi-source feedback, and retrieval-augmented MCP selection with a lightweight MCP executor to optimize inference paths. Contribution/Results: On the GAIA validation set, the approach achieves 83.03% pass@1 and 89.09% pass@3, surpassing prior methods while reducing mean tokens per example by approximately 15% relative to a strong baseline agent. The framework is presented as the first to enable reusable and evolvable domain expert construction, offering a paradigm for scalable, adaptive agent specialization.
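The pass@1 and pass@3 figures above can be read with a small sketch. It assumes pass@k means the fraction of tasks solved by at least one of the first k independent attempts; the page does not state the exact definition, so this is an assumption, and the function name `pass_at_k` and the toy data are hypothetical.

```python
def pass_at_k(attempts: dict[str, list[bool]], k: int) -> float:
    """Fraction of tasks with at least one success among the first k attempts.

    `attempts` maps a task id to the ordered outcomes of its attempts.
    Assumed definition, not confirmed by the source.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not attempts:
        raise ValueError("no tasks to score")
    solved = sum(1 for outcomes in attempts.values() if any(outcomes[:k]))
    return solved / len(attempts)


# Toy outcomes for four hypothetical tasks, three attempts each.
toy_attempts = {
    "task-1": [True, False, False],
    "task-2": [False, False, True],
    "task-3": [False, False, False],
    "task-4": [False, True, False],
}
toy_pass_1 = pass_at_k(toy_attempts, k=1)
toy_pass_3 = pass_at_k(toy_attempts, k=3)
```

Under this reading, pass@3 can never be lower than pass@1, which is consistent with the 83.03% and 89.09% figures reported.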

πŸ“ Abstract
Large language models (LLMs) have been shown to perform better when scaffolded into agents with memory, tools, and feedback. Beyond this, self-evolving agents have emerged, but current work largely limits adaptation to prompt rewriting or failure retries. To address this, we present ALITA-G, a self-evolution framework that transforms a general-purpose agent into a domain expert by systematically generating, abstracting, and curating Model Context Protocol (MCP) tools. In this framework, a generalist agent executes a curated suite of target-domain tasks and synthesizes candidate MCPs from successful trajectories. These are then abstracted into parameterized primitives and consolidated into an MCP Box. At inference time, ALITA-G performs retrieval-augmented MCP selection using each tool's description and use cases, then executes an agent equipped with the MCP Executor. Across several benchmarks (GAIA, PathVQA, and Humanity's Last Exam), ALITA-G attains strong gains while reducing computation costs. On GAIA validation, it achieves 83.03% pass@1 and 89.09% pass@3, establishing a new state-of-the-art result while reducing mean tokens per example by approximately 15% relative to a strong baseline agent. ALITA-G thus provides a principled pathway from generalist capability to reusable, domain-specific competence, improving both accuracy and efficiency on complex reasoning tasks.
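The retrieval-augmented MCP selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a bag-of-words cosine similarity stands in for whatever embedding model the paper uses, and the names `MCPEntry` and `select_mcps`, the `threshold` and `top_k` parameters, and the example tools are all hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class MCPEntry:
    """One tool in the MCP Box, with the text fields used for retrieval."""
    name: str
    description: str
    use_case: str


def _vectorize(text: str) -> Counter:
    # Bag-of-words term counts; a stand-in for a real text embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_mcps(task: str, box: list[MCPEntry],
                threshold: float = 0.2, top_k: int = 2) -> list[MCPEntry]:
    """Return up to top_k tools whose description and use case match the task."""
    query = _vectorize(task)
    scored = [
        (_cosine(query, _vectorize(f"{e.description} {e.use_case}")), e)
        for e in box
    ]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in kept[:top_k]]


# Hypothetical MCP Box contents, for illustration only.
toy_box = [
    MCPEntry("pdf_table_extractor", "Extract tables from a PDF file",
             "Reading financial tables in a PDF report"),
    MCPEntry("youtube_transcript", "Fetch the transcript of a video",
             "Answering questions about a video"),
    MCPEntry("image_caption", "Describe the content of an image",
             "Pathology slide questions"),
]
selected = select_mcps("Extract the revenue tables from this PDF report", toy_box)
```

Only the selected tools would then be handed to the agent, which is one plausible way fewer tokens are spent per example: tools unrelated to the task never enter the context.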
Problem

Research questions and friction points this paper is trying to address.

Transforming general agents into domain experts through systematic tool generation
Improving agent performance on complex reasoning tasks while reducing costs
Creating reusable domain-specific competence from general AI capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates and abstracts Model Context Protocol tools
Consolidates the abstracted tools into a retrievable MCP Box
Selects tools at inference time by retrieval over each tool's description and use cases
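The generate, abstract, and consolidate steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions: the data classes `CandidateMCP` and `Trajectory`, the function `build_mcp_box`, and the keep-the-first-tool-per-name consolidation rule are hypothetical, and the abstraction step is passed in as a caller-supplied function because this page does not describe how it is carried out.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class CandidateMCP:
    """A tool synthesized while solving one task (hypothetical structure)."""
    name: str
    description: str
    code: str


@dataclass
class Trajectory:
    """Outcome of running the generalist agent on one target-domain task."""
    task_id: str
    success: bool
    candidates: list[CandidateMCP] = field(default_factory=list)


def build_mcp_box(
    trajectories: Iterable[Trajectory],
    abstract: Callable[[CandidateMCP], CandidateMCP],
) -> dict[str, CandidateMCP]:
    """Collect tools from successful trajectories into a name-keyed box."""
    box: dict[str, CandidateMCP] = {}
    for traj in trajectories:
        if not traj.success:
            continue  # only successful trajectories contribute tools
        for cand in traj.candidates:
            tool = abstract(cand)  # e.g. turn hard-coded values into parameters
            box.setdefault(tool.name, tool)  # assumed rule: keep first per name
    return box


# Toy run with an identity "abstraction" as a placeholder.
toy_trajectories = [
    Trajectory("t1", True, [CandidateMCP("csv_reader", "Read a CSV file", "...")]),
    Trajectory("t2", False, [CandidateMCP("web_scraper", "Fetch a page", "...")]),
    Trajectory("t3", True, [CandidateMCP("csv_reader", "Read CSV data", "...")]),
]
toy_mcp_box = build_mcp_box(toy_trajectories, abstract=lambda c: c)
```

In this toy run the failed trajectory contributes nothing and the duplicate `csv_reader` is merged, so the box ends up with a single reusable entry.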