LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training

📅 2025-10-16
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
High-quality UI trajectory data for training digital agents is scarce, and manual annotation or real-world collection is prohibitively expensive. Method: This paper proposes UI-Simulator, a large language model (LLM)-driven digital world simulator that integrates guided rollout exploration and trajectory wrapping to autonomously generate large-scale, diverse, structured UI state-transition sequences, together with UI-Simulator-Grow, a targeted scaling strategy that prioritizes high-impact task trajectories and enables data-efficient training of small models (e.g., Llama-3-8B). Contribution/Results: On the WebArena and AndroidWorld benchmarks, agents trained on these synthetic trajectories match or surpass open-source agents trained on real UI data, with UI-Simulator-Grow matching the capability of Llama-3-70B-based agents. The method improves generalization and training efficiency while reducing reliance on costly human-annotated or environment-collected data.
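The guided rollout and trajectory-wrapping steps described above can be sketched as a simple loop: a simulator (an LLM in the real system, a stub here) proposes the next UI state given the current state and a task, and the resulting state-transition sequence is packaged into a training example. All function and field names below are hypothetical illustrations, not the paper's actual API.

```python
def guided_rollout(simulate_step, init_state, task, max_steps=8):
    """Collect a coherent UI trajectory by repeatedly querying the simulator."""
    trajectory = [{"state": init_state, "action": None}]
    state = init_state
    for _ in range(max_steps):
        # In the real system this is an LLM call that returns the next
        # action and the simulated UI state it leads to.
        action, state = simulate_step(state, task)
        trajectory.append({"state": state, "action": action})
        if state.get("done"):
            break
    return trajectory

def wrap_trajectory(trajectory, task):
    """Encapsulate a rollout as an (instruction, action-sequence) training example."""
    return {
        "instruction": task,
        "actions": [step["action"] for step in trajectory if step["action"]],
        "final_state": trajectory[-1]["state"],
    }

# Stub standing in for the LLM world simulator: finishes after three clicks.
def fake_simulator(state, task):
    step = state.get("step", 0) + 1
    return f"click(button_{step})", {"step": step, "done": step >= 3}

traj = guided_rollout(fake_simulator, {"step": 0}, "submit the form")
example = wrap_trajectory(traj, "submit the form")
```

The key property this sketch illustrates is coherence: each simulated state is conditioned on the previous one, so the wrapped trajectory is a consistent sequence rather than independent UI snapshots.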

📝 Abstract
Digital agents require diverse, large-scale UI trajectories to generalize across real-world tasks, yet collecting such data is prohibitively expensive from human-annotation, infrastructure, and engineering perspectives. To this end, we introduce **UI-Simulator**, a scalable paradigm that generates structured UI states and transitions to synthesize training trajectories at scale. Our paradigm integrates a digital world simulator for diverse UI states, a guided rollout process for coherent exploration, and a trajectory wrapper that produces high-quality and diverse trajectories for agent training. We further propose **UI-Simulator-Grow**, a targeted scaling strategy that enables more rapid and data-efficient scaling by prioritizing high-impact tasks and synthesizing informative trajectory variants. Experiments on WebArena and AndroidWorld show that UI-Simulator rivals or surpasses open-source agents trained on real UIs with significantly better robustness, despite using weaker teacher models. Moreover, UI-Simulator-Grow matches the performance of Llama-3-70B-Instruct using only Llama-3-8B-Instruct as the base model, highlighting the potential of the targeted synthesis scaling paradigm to continuously and efficiently enhance digital agents.
Problem

Research questions and friction points this paper is trying to address.

Generating scalable UI trajectories for agent training
Reducing data collection costs for digital agents
Enhancing agent robustness through synthetic UI simulations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates structured UI states and transitions
Integrates digital world simulator with guided rollout
Prioritizes high-impact tasks for targeted scaling
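The targeted-scaling idea in the last bullet can be sketched as ranking tasks by how much the agent currently struggles with them and synthesizing extra trajectory variants only for the top-ranked tasks. This is a hypothetical illustration of the prioritization logic, not the paper's actual scoring function; the task names and variant generator are placeholders.

```python
def prioritize_tasks(success_rates, top_k=2):
    """Return the top_k tasks where the agent's success rate is lowest."""
    ranked = sorted(success_rates, key=success_rates.get)
    return ranked[:top_k]

def synthesize_variants(task, n_variants=2):
    """Placeholder for LLM-driven generation of informative trajectory variants."""
    return [f"{task} (variant {i})" for i in range(1, n_variants + 1)]

# Hypothetical per-task success rates from evaluating the current agent.
rates = {"book flight": 0.9, "filter search results": 0.2, "edit profile": 0.5}
targets = prioritize_tasks(rates, top_k=2)
new_data = [v for t in targets for v in synthesize_variants(t)]
```

Concentrating synthesis on low-success tasks is what makes the scaling data-efficient: compute is spent where additional trajectories are most likely to improve the agent.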
👥 Authors
- Yiming Wang (Harvard University)
- Da Yin (Meta FAIR; Natural Language Processing)
- Yuedong Cui (UCLA)
- Ruichen Zheng (UCLA)
- Zhiqian Li (UCLA)
- Zongyu Lin (UCLA; Large Foundation Model, Pretraining, Reasoning)
- Di Wu (UCLA)
- Xueqing Wu (UCLA)
- Chenchen Ye (UCLA)
- Yu Zhou (UCLA)
- Kai-Wei Chang (UCLA)