World-Model-Augmented Web Agents with Action Correction

📅 2026-02-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing web agents based on large language models often fail on high-risk operations because they have limited ability to predict how the environment will change and limited awareness of execution risks. To mitigate this, the authors propose WAC, a web agent that combines an action model with a world model. In a multi-agent collaboration step, the action model consults the world model for strategic guidance and grounds that guidance into executable candidate actions. In a two-stage deduction chain, the world model then simulates the outcome of a proposed action and a judge model scrutinizes the simulation, triggering corrective feedback to refine the action when necessary. Experiments report absolute gains of 1.8% on VisualWebArena and 1.3% on Online-Mind2Web.

📝 Abstract

Web agents based on large language models have demonstrated promising capability in automating web tasks. However, current web agents struggle to reason out sensible actions because they are limited in predicting environment changes, and they may lack comprehensive awareness of execution risks, prematurely performing risky actions that cause losses and lead to task failure. To address these challenges, we propose WAC, a web agent that integrates model collaboration, consequence simulation, and feedback-driven action refinement. To overcome the cognitive isolation of individual models, we introduce a multi-agent collaboration process that enables an action model to consult a world model as a web-environment expert for strategic guidance; the action model then grounds these suggestions into executable actions, leveraging prior knowledge of environmental state transition dynamics to enhance candidate action proposal. To achieve risk-aware, resilient task execution, we introduce a two-stage deduction chain. A world model, specialized in environmental state transitions, simulates action outcomes, which a judge model then scrutinizes to trigger corrective action feedback when necessary. Experiments show that WAC achieves absolute gains of 1.8% on VisualWebArena and 1.3% on Online-Mind2Web.
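The control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function `choose_action`, the `Verdict` type, the four callable roles, the `max_refinements` parameter, and the toy stubs are all hypothetical names introduced here for illustration. In a real system each role would presumably wrap a model call; here they are plain functions so the loop can be run as written.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    """Hypothetical judge output: accept the action or return feedback."""
    acceptable: bool
    feedback: str = ""


# Hypothetical role signatures (assumptions, not taken from the paper).
Advisor = Callable[[str, str], str]            # (task, observation) -> guidance
Proposer = Callable[[str, str, str, str], str]  # (task, observation, guidance, feedback) -> action
Simulator = Callable[[str, str], str]          # (observation, action) -> predicted outcome
Judge = Callable[[str, str, str], Verdict]     # (task, action, predicted outcome) -> verdict


def choose_action(task: str, observation: str, advise: Advisor,
                  propose: Proposer, simulate: Simulator, judge: Judge,
                  max_refinements: int = 2) -> str:
    # Collaboration: the action model consults the world model for guidance,
    # then grounds that guidance into a concrete candidate action.
    guidance = advise(task, observation)
    action = propose(task, observation, guidance, "")

    # Two-stage deduction: simulate the outcome, then let the judge inspect it.
    for _ in range(max_refinements):
        predicted = simulate(observation, action)
        verdict = judge(task, action, predicted)
        if verdict.acceptable:
            return action
        # Corrective feedback drives a refined proposal.
        action = propose(task, observation, guidance, verdict.feedback)

    # Refinement budget exhausted: the last proposal is returned unchecked.
    return action


# Toy stubs standing in for model calls (purely illustrative).
def advise(task: str, obs: str) -> str:
    return "Review the cart before purchasing."


def propose(task: str, obs: str, guidance: str, feedback: str) -> str:
    return "click('View cart')" if feedback else "click('Buy now')"


def simulate(obs: str, action: str) -> str:
    return "order is placed" if "Buy now" in action else "cart page opens"


def judge(task: str, action: str, predicted: str) -> Verdict:
    if "order is placed" in predicted:
        return Verdict(False, "Irreversible purchase; verify the cart first.")
    return Verdict(True)


result = choose_action("Buy the cheapest blue mug", "product page",
                       advise, propose, simulate, judge)
print(result)
```

In this toy run the first candidate is rejected because its simulated outcome is an irreversible purchase, and the refined candidate is accepted. Bounding the loop with a refinement budget is a design choice of the sketch; the abstract does not say how many correction rounds WAC allows.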

Problem

Research questions and friction points this paper is trying to address.

web agents

action reasoning

environment prediction

execution risk

task failure

Innovation

Methods, ideas, or system contributions that make the work stand out.

world model

action correction

multi-agent collaboration