🤖 AI Summary
Existing LLM-based agent frameworks rely on manually predefined tools and workflows, severely limiting cross-domain adaptability, scalability, and generalization. Method: We propose Alita, a lightweight, universal agent architecture grounded in a "minimal predefinition + maximal self-evolution" paradigm. Alita employs only a single core component, an LLM-driven meta-reasoning framework, that autonomously generates, refines, and reuses open-source Model Context Protocols (MCPs) as shareable, executable capability carriers, enabling capability expansion without hard-coded tools. It eliminates conventional toolchains and supports end-to-end, cross-domain autonomous evolution. Contribution/Results: Evaluated on GAIA, MathVista, and PathVQA, Alita achieves pass@1 scores of 75.15%, 74.00%, and 52.00%, respectively, surpassing state-of-the-art complex agent systems and establishing a new GAIA SOTA.
📝 Abstract
Recent advances in large language models (LLMs) have enabled agents to autonomously perform complex, open-ended tasks. However, many existing frameworks depend heavily on manually predefined tools and workflows, which hinder their adaptability, scalability, and generalization across domains. In this work, we introduce Alita, a generalist agent designed with the principle of "Simplicity is the ultimate sophistication," enabling scalable agentic reasoning through minimal predefinition and maximal self-evolution. For minimal predefinition, Alita is equipped with only one component for direct problem-solving, making it much simpler and neater than previous approaches that relied heavily on hand-crafted, elaborate tools and workflows. This clean design enhances its potential to generalize to challenging questions without being limited by tools. For maximal self-evolution, we enable the creativity of Alita by providing a suite of general-purpose components to autonomously construct, refine, and reuse external capabilities by generating task-related Model Context Protocols (MCPs) from open source, which contributes to scalable agentic reasoning. Notably, Alita achieves 75.15% pass@1 and 87.27% pass@3 accuracy on the GAIA benchmark validation dataset, ranking at the top among general-purpose agents, and 74.00% and 52.00% pass@1 on MathVista and PathVQA, respectively, outperforming many agent systems with far greater complexity. More details will be updated at https://github.com/CharlesQ9/Alita.