🤖 AI Summary
This work addresses the network latency, rate limits, and safety risks of real-world web environments, which constrain large-scale training of web agents. To overcome these constraints, the authors introduce WebWorld, the first open-web simulator trained at scale. Its scalable data pipeline supplies over one million open-web interactions for training, and the simulator supports reasoning, multi-format data, and long-horizon simulations of 30+ steps. The work also introduces WebWorld-Bench, an intrinsic evaluation benchmark with dual metrics spanning nine dimensions, on which WebWorld achieves simulation performance comparable to Gemini-3-Pro. In extrinsic evaluation, Qwen3-14B trained on WebWorld-synthesized trajectories improves by 9.2% on WebArena, reaching performance comparable to GPT-4o. Used as a world model for inference-time search, WebWorld outperforms GPT-5, and it generalizes across domains to code, GUI, and game environments.
📝 Abstract
Web agents require massive numbers of trajectories to generalize, yet real-world training is constrained by network latency, rate limits, and safety risks. We introduce the WebWorld series, the first open-web simulator trained at scale. While existing simulators are restricted to closed environments with thousands of trajectories, WebWorld leverages a scalable data pipeline to train on 1M+ open-web interactions, supporting reasoning, multi-format data, and long-horizon simulations of 30+ steps. For intrinsic evaluation, we introduce WebWorld-Bench with dual metrics spanning nine dimensions, where WebWorld achieves simulation performance comparable to Gemini-3-Pro. For extrinsic evaluation, Qwen3-14B trained on WebWorld-synthesized trajectories improves by +9.2% on WebArena, reaching performance comparable to GPT-4o. WebWorld also enables effective inference-time search, outperforming GPT-5 as a world model. Beyond web simulation, WebWorld exhibits cross-domain generalization to code, GUI, and game environments, providing a replicable recipe for world model construction.