World-Model-Augmented Web Agents with Action Correction

📅 2026-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a weakness of existing large language model-based web agents: they have limited ability to predict environment changes and limited awareness of execution risks, so they tend to fail on high-risk operations. The authors propose WAC, a web agent that pairs an action model with a world model for risk-aware decision-making. In a multi-agent collaboration process, the action model consults the world model for strategic guidance before proposing candidate actions. In a two-stage deduction chain, the world model then simulates each action's consequences and a judge model reviews the simulated outcome, triggering corrective feedback that refines the action when necessary. The reported results are absolute gains of 1.8% on VisualWebArena and 1.3% on Online-Mind2Web.

📝 Abstract
Web agents based on large language models have demonstrated promising capability in automating web tasks. However, current web agents struggle to reason out sensible actions because they are limited in predicting environment changes, and they may lack comprehensive awareness of execution risks, prematurely performing risky actions that cause losses and lead to task failure. To address these challenges, we propose WAC, a web agent that integrates model collaboration, consequence simulation, and feedback-driven action refinement. To overcome the cognitive isolation of individual models, we introduce a multi-agent collaboration process that enables an action model to consult a world model as a web-environment expert for strategic guidance; the action model then grounds these suggestions into executable actions, leveraging prior knowledge of environmental state transition dynamics to improve its candidate action proposals. To achieve risk-aware, resilient task execution, we introduce a two-stage deduction chain: a world model, specialized in environmental state transitions, simulates action outcomes, and a judge model then scrutinizes these outcomes and triggers corrective feedback on the action when necessary. Experiments show that WAC achieves absolute gains of 1.8% on VisualWebArena and 1.3% on Online-Mind2Web.
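
The control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Verdict`, `choose_action`, `advise`, `propose`, `simulate`, `judge`, and `max_rounds` are all hypothetical, the three models are replaced by hand-written stub functions, and the fallback when the round budget runs out is an assumption the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """Hypothetical judge output: accept the action, or explain what to fix."""
    acceptable: bool
    feedback: str = ""


def choose_action(
    task: str,
    observation: str,
    advise: Callable[[str, str], str],
    propose: Callable[[str, str, str, str], str],
    simulate: Callable[[str, str], str],
    judge: Callable[[str, str, str], Verdict],
    max_rounds: int = 3,
) -> str:
    """Pick one action for the current step, refining it on judge feedback."""
    # Collaboration: the action model consults the world model for guidance.
    guidance = advise(task, observation)
    feedback = ""
    action = ""
    for _ in range(max_rounds):
        # The action model grounds guidance (and any feedback) into an action.
        action = propose(task, observation, guidance, feedback)
        # Deduction stage 1: the world model simulates the action's outcome.
        predicted = simulate(observation, action)
        # Deduction stage 2: the judge reviews the simulated outcome.
        verdict = judge(task, action, predicted)
        if verdict.acceptable:
            return action
        feedback = verdict.feedback
    # Assumption: fall back to the last proposal when the budget runs out.
    return action


# Stub models for illustration only; a real system would call LLMs here.
def advise(task: str, observation: str) -> str:
    return "Review the cart before paying."


def propose(task: str, observation: str, guidance: str, feedback: str) -> str:
    return "click('View cart')" if feedback else "click('Place order')"


def simulate(observation: str, action: str) -> str:
    return "order submitted" if "Place order" in action else "cart page shown"


def judge(task: str, action: str, predicted: str) -> Verdict:
    if "submitted" in predicted:
        return Verdict(False, "Irreversible step; verify the cart first.")
    return Verdict(True)


chosen = choose_action(
    "Buy one blue mug", "product page", advise, propose, simulate, judge
)
print(chosen)
```

With these stubs, the first proposal is rejected because its simulated outcome is an irreversible order submission, and the second proposal, made with the judge's feedback, is accepted. Passing the models in as plain callables keeps the loop independent of any particular model API.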
Problem

Research questions and friction points this paper is trying to address.

web agents
action reasoning
environment prediction
execution risk
task failure
Innovation

Methods, ideas, or system contributions that make the work stand out.

world model
action correction
multi-agent collaboration
risk-aware execution
web automation
Authors
Zhouzhou Shen
Zhejiang University
Xueyu Hu
Zhejiang University
Xiyun Li
Tencent AI Lab
Tianqing Fang
Tencent AI Lab
Natural Language Processing, Agent, Language Models
Juncheng Li
East China Normal University
Super Resolution, Image Restoration, Computer Vision, Medical Image Analysis
Shengyu Zhang
Zhejiang University