STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks

📅 2026-03-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limitations of existing large language model (LLM) agents on complex, long-horizon web tasks, where limited in-context memory, weak planning, and greedy action selection often lead to premature termination. To overcome these challenges, we propose STRUCTUREDAGENT, a hierarchical planning framework that combines an online planner built on dynamic AND/OR trees with a structured memory module. The AND/OR tree supports efficient search, while the structured memory tracks and maintains candidate solutions to improve constraint satisfaction in information-seeking tasks. The resulting hierarchical plans are interpretable, which eases debugging and human intervention. Experiments on WebVoyager, WebArena, and custom shopping benchmarks show that STRUCTUREDAGENT improves performance on long-horizon web-browsing tasks compared with standard LLM-based agents.

📝 Abstract

Recent advances in large language models (LLMs) have enabled agentic systems for sequential decision-making. Such agents must perceive their environment, reason across multiple time steps, and take actions that optimize long-term objectives. However, existing web agents struggle on complex, long-horizon tasks due to limited in-context memory for tracking history, weak planning abilities, and greedy behaviors that lead to premature termination. To address these challenges, we propose STRUCTUREDAGENT, a hierarchical planning framework with two core components: (1) an online hierarchical planner that uses dynamic AND/OR trees for efficient search and (2) a structured memory module that tracks and maintains candidate solutions to improve constraint satisfaction in information-seeking tasks. The framework also produces interpretable hierarchical plans, enabling easier debugging and facilitating human intervention when needed. Our results on WebVoyager, WebArena, and custom shopping benchmarks show that STRUCTUREDAGENT improves performance on long-horizon web-browsing tasks compared to standard LLM-based agents.
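As a rough illustration of the second component, the sketch below shows one way a structured memory could track candidate solutions against task constraints in an information-seeking task. The abstract does not specify the memory's design, so everything here is an assumption: the class `CandidateMemory`, its methods, and the shopping attributes (`price`, `ram_gb`) are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional

# A constraint is a predicate over a single observed attribute value.
Constraint = Callable[[Any], bool]


@dataclass
class Candidate:
    name: str
    attributes: dict[str, Any] = field(default_factory=dict)


class CandidateMemory:
    """Keeps every candidate seen so far, with per-constraint check results."""

    def __init__(self, constraints: dict[str, Constraint]):
        self.constraints = constraints  # attribute name -> predicate
        self.candidates: dict[str, Candidate] = {}

    def observe(self, name: str, **attributes: Any) -> None:
        """Record facts about a candidate; later observations are merged in."""
        cand = self.candidates.setdefault(name, Candidate(name))
        cand.attributes.update(attributes)

    def check(self, name: str) -> dict[str, Optional[bool]]:
        """True/False per constraint, or None if the attribute is unknown."""
        attrs = self.candidates[name].attributes
        return {
            key: (pred(attrs[key]) if key in attrs else None)
            for key, pred in self.constraints.items()
        }

    def satisfied(self) -> list[str]:
        """Candidates confirmed to meet every constraint."""
        return [
            n for n in self.candidates
            if all(v is True for v in self.check(n).values())
        ]

    def unresolved(self) -> list[str]:
        """Candidates not yet ruled out but still missing information."""
        out = []
        for n in self.candidates:
            values = self.check(n).values()
            if False not in values and None in values:
                out.append(n)
        return out


if __name__ == "__main__":
    memory = CandidateMemory({
        "price": lambda p: p <= 800,
        "ram_gb": lambda r: r >= 16,
    })
    memory.observe("Laptop A", price=750)             # RAM not seen yet
    memory.observe("Laptop B", price=900, ram_gb=16)  # violates the price cap
    print(memory.unresolved())  # candidates that still need more browsing
    memory.observe("Laptop A", ram_gb=16)
    print(memory.satisfied())
```

Keeping this state outside the model's context window is what lets an agent revisit partially checked candidates many steps later instead of answering greedily with the first plausible item it sees.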

Problem

Research questions and friction points this paper is trying to address.

long-horizon web tasks

LLM-based agents

planning

in-context memory

greedy behavior

Innovation

Methods, ideas, or system contributions that make the work stand out.

hierarchical planning

AND/OR trees

structured memory