Safe and Scalable Web Agent Learning via Recreated Websites

📅 2026-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Training autonomous web agents directly on real-world websites is severely hindered by three properties of those sites: they are uncontrollable, difficult to reset, and rarely provide verifiable feedback. To address this, this work proposes VeriEnv, a framework that uses large language models to automatically clone real websites into executable, verifiable synthetic environments. VeriEnv exposes controlled access to internal environment state through a Python SDK, so agents can self-generate tasks and receive programmatically defined reward signals. This makes training safe, scalable, and self-evolving. Experiments show that agents trained with VeriEnv generalize to unseen websites, acquire site-specific skills, and keep improving as the number of training environments grows.

📝 Abstract
Training autonomous web agents is fundamentally limited by the environments they learn from: real-world websites are unsafe to explore, hard to reset, and rarely provide verifiable feedback. We propose VeriEnv, a framework that treats language models as environment creators, automatically cloning real-world websites into fully executable, verifiable synthetic environments. By exposing controlled internal access via a Python SDK, VeriEnv enables agents to self-generate tasks with deterministic, programmatically verifiable rewards, eliminating reliance on heuristic or LLM-based judges. This design decouples agent learning from unsafe real-world interaction while enabling scalable self-evolution through environment expansion. Through experiments on web agent benchmarks, we show that agents trained with VeriEnv generalize to unseen websites, achieve site-specific mastery through self-evolving training, and benefit from scaling the number of training environments. Code and resources will be released at https://github.com/kyle8581/VeriEnv upon acceptance.
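
The idea of a programmatically verifiable reward can be illustrated with a minimal sketch. This is not the VeriEnv SDK, whose interface the abstract does not describe; `ToyShopEnv`, `Task`, and `reward` are hypothetical names invented for illustration, and the "website" is reduced to an in-memory shopping cart. The point is only that the reward is computed by inspecting environment state with ordinary code rather than by asking an LLM judge.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToyShopEnv:
    """Hypothetical stand-in for a cloned website with inspectable state."""
    cart: List[str] = field(default_factory=list)

    def reset(self) -> None:
        # A synthetic environment can be returned to a known state at will.
        self.cart.clear()

    def add_to_cart(self, item: str) -> None:
        # Stands in for an agent action performed through the site's UI.
        self.cart.append(item)


@dataclass
class Task:
    """A task pairs an instruction with a deterministic state check."""
    instruction: str
    verify: Callable[[ToyShopEnv], bool]


def reward(env: ToyShopEnv, task: Task) -> float:
    # Binary reward derived purely from environment state (assumed scheme).
    return 1.0 if task.verify(env) else 0.0


if __name__ == "__main__":
    env = ToyShopEnv()
    task = Task("Add a mug to the cart", lambda e: e.cart == ["mug"])
    env.add_to_cart("mug")
    print(reward(env, task))
```

Because the check reads state directly, the same task yields the same reward on every run, and resetting the environment makes the episode repeatable.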
Problem

Research questions and friction points this paper is trying to address.

web agent
safe exploration
verifiable feedback
environment reset
autonomous learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic environments
verifiable rewards
web agent learning
environment cloning
self-evolution