WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback

📅 2025-05-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Large language models (LLMs) deployed as web agents reason poorly and lack robustness in dynamic web environments. Method: This paper proposes a framework that strengthens specific reasoning skills. Its core idea is to recast web-interaction reasoning explicitly as chain-of-thought (CoT) rationales, distilled from real interaction trajectories into three reasoning patterns: reflection and lookahead, branching, and rollback. The authors construct CoT-structured training data and apply supervised fine-tuning (SFT) in a self-improvement setting to enhance these skills. Results: The method yields significant improvements on WebVoyager, Mind2Web-Live, and SimpleQA (web search), supporting the effectiveness and generalizability of explicit CoT reconstruction and skill-specific training.

📝 Abstract
Web agents powered by Large Language Models (LLMs) show promise for next-generation AI, but their limited reasoning in uncertain, dynamic web environments hinders robust deployment. In this paper, we identify key reasoning skills essential for effective web agents, i.e., reflection and lookahead, branching, and rollback, and curate trajectory data that exemplifies these abilities by reconstructing the agent's (inference-time) reasoning algorithms into chain-of-thought rationales. We conduct experiments on the agent self-improvement benchmark OpenWebVoyager and demonstrate that distilling salient reasoning patterns into the backbone LLM via simple fine-tuning can substantially enhance its performance. Our approach yields significant improvements across multiple benchmarks, including WebVoyager, Mind2Web-Live, and SimpleQA (web search), highlighting the potential of targeted reasoning skill enhancement for web agents.
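
As a rough illustration of what "reconstructing reasoning into chain-of-thought rationales" could look like as fine-tuning data, here is a minimal sketch. The record layout, the field names (`prompt`, `completion`), the prompt wording, and the helper `to_sft_example` are all hypothetical and are not taken from the paper.

```python
import json


def to_sft_example(task, observation, rationale_steps, action):
    """Pack one trajectory step into a prompt/completion pair.

    Hypothetical format: the rationale (the reconstructed reasoning)
    is written out before the action, so that a fine-tuned model
    learns to produce the reasoning and then act.
    """
    prompt = (
        f"Task: {task}\n"
        f"Observation: {observation}\n"
        "Think step by step, then act."
    )
    completion = "Thought: " + " ".join(rationale_steps) + f"\nAction: {action}"
    return {"prompt": prompt, "completion": completion}


if __name__ == "__main__":
    example = to_sft_example(
        task="Find the cheapest laptop",
        observation="Search results page listing three phones",
        rationale_steps=[
            "The results show phones, not laptops.",
            "Going back and refining the query is the safer option.",
        ],
        action="go_back",
    )
    # One JSON object per line is a common layout for SFT datasets.
    print(json.dumps(example))
```

In this sketch the rollback decision appears as text in the completion, which is the general idea of distilling a reasoning pattern into the training target.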
Problem

Research questions and friction points this paper is trying to address.

Enhancing web agent reasoning in uncertain, dynamic environments
Improving reflection, branching, and rollback skills for web agents
Distilling reasoning patterns to boost LLM performance in web tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reconstructing chain-of-thought for reflection and lookahead
Branching reasoning patterns for dynamic environments
Rollback mechanisms to enhance decision-making
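
The three patterns listed above can be sketched on a toy example. This is an illustrative sketch only, not the paper's algorithm: the `SITE` graph, the `explore` function, and the wording of the thoughts are hypothetical, and a plain depth-first search stands in for the agent's inference-time reasoning.

```python
# Hypothetical toy "website": each page maps link labels to target pages.
SITE = {
    "home": {"products": "products", "blog": "blog"},
    "blog": {},
    "products": {"laptops": "laptops", "phones": "phones"},
    "laptops": {},
    "phones": {"checkout": "goal"},
    "goal": {},
}


def explore(page, goal, thoughts, visited=None):
    """Depth-first search that narrates reflection, branching, and rollback.

    Returns the list of pages from `page` to `goal`, or [] if none is found.
    """
    if visited is None:
        visited = set()
    visited.add(page)

    # Reflection: compare the current state against the goal.
    if page == goal:
        thoughts.append(f"reflect: reached {page}, task complete")
        return [page]

    # Branching: enumerate the candidate actions available here.
    labels = list(SITE[page])
    thoughts.append(f"branch: on {page}, candidates = {labels}")

    for label in labels:
        target = SITE[page][label]
        if target in visited:
            continue
        path = explore(target, goal, thoughts, visited)
        if path:
            return [page] + path
        # Rollback: the branch failed, so return to the earlier state.
        thoughts.append(f"rollback: {target} was a dead end, back to {page}")
    return []


if __name__ == "__main__":
    thoughts = []
    print(explore("home", "goal", thoughts))
    print("\n".join(thoughts))
```

The recorded `thoughts` list plays the role of a chain-of-thought trace: each search decision is written out as text, which is the kind of rationale that could then serve as training data.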