WebWorld: A Large-Scale World Model for Web Agent Training

📅 2026-02-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the network latency, rate limits, and safety risks that constrain large-scale training of web agents in real-world environments. It introduces WebWorld, the first open-web simulator trained at scale. A scalable data pipeline lets WebWorld train on more than one million open-web interactions, with support for reasoning, multi-format data, and long-horizon simulations of 30+ steps. For intrinsic evaluation, the work introduces WebWorld-Bench, which uses dual metrics spanning nine dimensions and on which WebWorld achieves simulation performance comparable to Gemini-3-Pro. For extrinsic evaluation, Qwen3-14B trained on WebWorld-synthesized trajectories improves by +9.2% on WebArena, reaching performance comparable to GPT-4o. In inference-time search, WebWorld outperforms GPT-5 as a world model. Beyond web simulation, it generalizes across domains to code, GUI, and game environments.

📝 Abstract

Web agents require massive trajectories to generalize, yet real-world training is constrained by network latency, rate limits, and safety risks. We introduce the WebWorld series, the first open-web simulator trained at scale. While existing simulators are restricted to closed environments with thousands of trajectories, WebWorld leverages a scalable data pipeline to train on 1M+ open-web interactions, supporting reasoning, multi-format data, and long-horizon simulations of 30+ steps. For intrinsic evaluation, we introduce WebWorld-Bench with dual metrics spanning nine dimensions, where WebWorld achieves simulation performance comparable to Gemini-3-Pro. For extrinsic evaluation, Qwen3-14B trained on WebWorld-synthesized trajectories improves by +9.2% on WebArena, reaching performance comparable to GPT-4o. WebWorld also enables effective inference-time search, outperforming GPT-5 as a world model. Beyond web simulation, WebWorld exhibits cross-domain generalization to code, GUI, and game environments, providing a replicable recipe for world model construction.

Problem

Research questions and friction points this paper is trying to address.

web agents

training trajectories

network latency

rate limits

safety risks

Innovation

Methods, ideas, or system contributions that make the work stand out.

web simulator

world model

large-scale training