🤖 AI Summary
Inefficient experience generation is a core bottleneck in learning-from-practice training of Agentic AI, especially on complex benchmarks such as GAIA. Method: We introduce AWorld, an open-source distributed interaction system built on a scalable agent-environment architecture that combines distributed task scheduling, clustered environment parallelism, and reinforcement learning of a Qwen3-32B-based agent. Contribution/Results: AWorld accelerates experience collection by 14.6x over single-node sequential execution. The trained Qwen3-32B agent improves overall GAIA accuracy from 21.59% to 32.23% and reaches 16.33% on the benchmark's most challenging tier, surpassing leading closed-source models there for the first time. This work establishes a complete training pipeline ("efficient interaction → high-quality experience → model evolution") and provides a reproducible, scalable infrastructure for large-scale Agentic AI learning from practice.
📝 Abstract
The learning-from-practice paradigm is crucial for developing capable Agentic AI systems, yet it is severely hampered by inefficient experience generation, a bottleneck especially pronounced in complex benchmarks like GAIA. To address this, we introduce AWorld, an open-source system engineered for large-scale agent-environment interaction. By distributing tasks across a cluster, AWorld accelerates experience collection by 14.6x compared to standard single-node, sequential execution. This speedup makes extensive reinforcement learning practical and scalable. Leveraging this capability, we trained a Qwen3-32B-based agent that significantly outperforms its base model, increasing its overall GAIA accuracy from 21.59% to 32.23%. On the benchmark's most challenging levels, our agent achieves a score of 16.33%, surpassing leading proprietary models. Our open-source system and resulting agent provide a practical blueprint for a complete agentic AI training pipeline, from efficient interaction to demonstrable model improvement.
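The core speedup claim rests on dispatching many agent-environment rollouts to cluster workers instead of running them one at a time. The miniature sketch below illustrates that scheduling idea only; the function names and the thread-pool stand-in for a cluster are illustrative assumptions, not AWorld's actual API, and the simulated per-task delay stands in for real tool calls and environment steps.

```python
import concurrent.futures
import time

def run_task(task_id: int) -> dict:
    """Hypothetical stand-in for one agent-environment rollout.

    In a real system each call would drive a full agent trajectory
    (LLM calls, tool use, environment steps); here a short sleep
    simulates that per-task latency.
    """
    time.sleep(0.01)
    return {"task_id": task_id, "trajectory": f"experience-{task_id}"}

def collect_sequential(task_ids) -> list:
    # Baseline: single-node, one rollout after another.
    return [run_task(t) for t in task_ids]

def collect_distributed(task_ids, workers: int = 8) -> list:
    # A local thread pool stands in for a cluster-level scheduler:
    # tasks are handed to idle workers instead of queued serially.
    # map() preserves task order in the returned results.
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_task, task_ids))

if __name__ == "__main__":
    tasks = list(range(32))
    t0 = time.perf_counter()
    collect_sequential(tasks)
    t_seq = time.perf_counter() - t0
    t0 = time.perf_counter()
    results = collect_distributed(tasks)
    t_par = time.perf_counter() - t0
    assert [r["task_id"] for r in results] == tasks
    print(f"sequential: {t_seq:.2f}s, distributed: {t_par:.2f}s, "
          f"speedup: {t_seq / t_par:.1f}x")
```

With I/O-bound tasks (as agent rollouts typically are, dominated by model and tool latency), the speedup scales roughly with the worker count until coordination overhead dominates, which is the same dynamic behind the reported 14.6x at cluster scale.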