InfiniteWeb: Scalable Web Environment Synthesis for GUI Agent Training

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of training GUI agents, which is hindered by the scarcity of realistic and interactive web environments. To overcome this limitation, the authors propose a task-driven automated approach that synthesizes large-scale, functionally complete, and interlinked web environments under a unified specification. Diversity is ensured by leveraging website seeds and reference design mockups. Additionally, the framework includes a verifiable task evaluator that provides dense reward signals to facilitate reinforcement learning. The generated environments exhibit higher realism than those produced by existing commercial coding agents, and GUI agents trained within them achieve significantly improved performance on the OSWorld and Online-Mind2Web benchmarks.

📝 Abstract

GUI agents that interact with graphical interfaces on behalf of users represent a promising direction for practical AI assistants. However, training such agents is hindered by the scarcity of suitable environments. We present InfiniteWeb, a system that automatically generates functional web environments at scale for GUI agent training. While LLMs perform well at generating a single webpage, building a realistic and functional website with many interconnected pages remains challenging. We address these challenges through a unified specification, task-centric test-driven development, and a combination of website seeds with reference design images to ensure diversity. Our system also generates verifiable task evaluators that enable dense reward signals for reinforcement learning. Experiments show that InfiniteWeb surpasses commercial coding agents at realistic website construction, and GUI agents trained on our generated environments achieve significant performance improvements on OSWorld and Online-Mind2Web, demonstrating the effectiveness of the proposed system.
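
The abstract does not say how the verifiable task evaluators compute their dense rewards. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below scores a task by the fraction of checkpoints that a website state satisfies. The Checkpoint class, the dense_reward function, the state keys, and the shopping task are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representation of a website's observable state.
State = Dict[str, object]


@dataclass
class Checkpoint:
    """One verifiable sub-goal of a task (assumed structure, not from the paper)."""
    name: str
    check: Callable[[State], bool]


def dense_reward(state: State, checkpoints: List[Checkpoint]) -> float:
    """Return the fraction of checkpoints the state satisfies, in [0.0, 1.0]."""
    if not checkpoints:
        return 0.0
    passed = sum(1 for c in checkpoints if c.check(state))
    return passed / len(checkpoints)


# Hypothetical task: search for a laptop, add it to the cart, place the order.
checkpoints = [
    Checkpoint("searched", lambda s: s.get("last_query") == "laptop"),
    Checkpoint("in_cart", lambda s: "laptop" in s.get("cart", [])),
    Checkpoint("ordered", lambda s: s.get("order_placed") is True),
]

# The agent has searched and filled the cart but has not checked out yet.
partial_state = {"last_query": "laptop", "cart": ["laptop"], "order_placed": False}
reward = dense_reward(partial_state, checkpoints)
```

Here the partial state satisfies two of the three checkpoints, so the agent receives partial credit instead of the zero it would get from a sparse success-or-failure signal.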

Problem

Research questions and friction points this paper is trying to address.

GUI agent training

web environment synthesis

scalable environment generation

functional website generation

training data scarcity

Innovation

Methods, ideas, or system contributions that make the work stand out.

web environment synthesis

GUI agent training

test-driven development

LLM-based generation

dense reward signals

🔎 Similar Papers

NNetNav: Unsupervised Learning of Browser Agents Through Environment Interaction in the Wild

2024-10-03 · Citations: 0

AgentStudio: A Toolkit for Building General Virtual Agents

2024-03-26 · arXiv.org · Citations: 8

Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

2024-10-07 · arXiv.org · Citations: 17