WebUncertainty: Dual-Level Uncertainty Driven Planning and Reasoning For Autonomous Web Agent

📅 2026-04-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current autonomous web agents struggle with complex, dynamic, long-horizon tasks because of rigid planning and hallucination-prone reasoning. This work proposes WebUncertainty, a framework that handles uncertainty at two levels: the task level and the action level. At the task level, an uncertainty-driven mechanism adaptively selects the planning mode. At the action level, Monte Carlo Tree Search (MCTS) is combined with a Confidence-induced Action Uncertainty (ConActU) strategy that quantifies both epistemic and aleatoric uncertainty, which guides the search and makes reasoning more robust. On the WebArena and WebVoyager benchmarks, the method is reported to outperform state-of-the-art baselines in task completion and adaptability.

📝 Abstract

Recent advancements in large language models (LLMs) have empowered autonomous web agents to execute natural language instructions directly on real-world webpages. However, existing agents often struggle with complex tasks involving dynamic interactions and long-horizon execution due to rigid planning strategies and hallucination-prone reasoning. To address these limitations, we propose WebUncertainty, a novel autonomous agent framework designed to tackle dual-level uncertainty in planning and reasoning. Specifically, we design a Task Uncertainty-Driven Adaptive Planning Mechanism that adaptively selects planning modes to navigate unknown environments. Furthermore, we introduce an Action Uncertainty-Driven Monte Carlo tree search (MCTS) Reasoning Mechanism. This mechanism incorporates the Confidence-induced Action Uncertainty (ConActU) strategy to quantify both aleatoric uncertainty (AU) and epistemic uncertainty (EU), thereby optimizing the search process and guiding robust decision-making. Experimental results on the WebArena and WebVoyager benchmarks demonstrate that WebUncertainty achieves superior performance compared to state-of-the-art baselines.
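To make the action-level idea concrete, here is a minimal sketch of how aleatoric and epistemic uncertainty could be separated and fed into an MCTS selection score. The abstract does not give ConActU's formulas, so this is not the authors' method: it assumes the standard entropy decomposition over several sampled action distributions (total entropy = expected entropy + disagreement between samples), and the names `decompose_uncertainty` and `uncertainty_aware_uct`, along with the weights `c`, `beta`, and `gamma`, are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def entropy(p: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(v * math.log(v) for v in p.values() if v > 0)


def decompose_uncertainty(
    samples: List[Dict[str, float]],
) -> Tuple[float, float, float]:
    """Split total uncertainty into aleatoric and epistemic parts.

    `samples` holds several action distributions for the same page state,
    e.g. from repeated LLM calls (an assumption of this sketch).
    """
    actions = set().union(*samples)
    n = len(samples)
    # Mean distribution over all sampled predictions.
    mean = {a: sum(s.get(a, 0.0) for s in samples) / n for a in actions}
    total = entropy(mean)
    # Aleatoric: average entropy within each individual sample.
    aleatoric = sum(entropy(s) for s in samples) / n
    # Epistemic: what remains, i.e. disagreement between samples.
    epistemic = max(total - aleatoric, 0.0)
    return total, aleatoric, epistemic


def uncertainty_aware_uct(
    q: float,
    n_parent: int,
    n_child: int,
    epistemic: float,
    aleatoric: float,
    c: float = 1.4,
    beta: float = 1.0,
    gamma: float = 1.0,
) -> float:
    """Hypothetical UCT variant: epistemic uncertainty widens the exploration
    bonus, aleatoric uncertainty is subtracted as a penalty."""
    explore = c * math.sqrt(math.log(n_parent + 1) / (n_child + 1))
    return q + explore * (1.0 + beta * epistemic) - gamma * aleatoric
```

Under these assumptions, two samples that confidently pick different actions yield high epistemic and zero aleatoric uncertainty, so the node gets a larger exploration bonus. Two identical but flat samples yield the opposite, so the node is penalized as inherently noisy. How the paper actually weights the two terms would need to be checked against the full text.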

Problem

Research questions and friction points this paper is trying to address.

autonomous web agent

planning uncertainty

reasoning uncertainty

hallucination

complex web tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

dual-level uncertainty

adaptive planning

Monte Carlo tree search