🤖 AI Summary
Existing autonomous web navigation methods suffer from “topological blindness”: lacking a global view of the environment's structure, they rely on trial-and-error exploration and navigate inefficiently in complex scenarios. This work proposes WebNavigator, an interaction-graph-based “retrieve–reason–teleport” framework that uses zero-token-cost offline heuristic exploration to construct a global topological map. This shifts navigation from probabilistic exploration to deterministic retrieval and pathfinding. The method achieves state-of-the-art performance on the WebArena and OnlineMind2Web benchmarks, attaining a 72.9% success rate on multi-site WebArena tasks, more than double that of current enterprise-level agents.
📝 Abstract
Despite significant advances in autonomous web navigation, current methods remain far from human-level performance in complex web environments. We argue that this limitation stems from Topological Blindness, where agents are forced to explore via trial-and-error without access to the global topological structure of the environment. To overcome this limitation, we introduce WebNavigator, which reframes web navigation from probabilistic exploration into deterministic retrieval and pathfinding. Offline, WebNavigator constructs Interaction Graphs via zero-token-cost heuristic exploration; online, it implements a Retrieve-Reason-Teleport workflow for global navigation. WebNavigator achieves state-of-the-art performance on WebArena and OnlineMind2Web. On WebArena multi-site tasks, WebNavigator achieves a 72.9% success rate, more than doubling the performance of enterprise-level agents. This work reveals that Topological Blindness, rather than model reasoning capabilities alone, is an underestimated bottleneck in autonomous web navigation.
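
To make the idea of deterministic pathfinding over an interaction graph concrete, here is a minimal sketch. It is not WebNavigator's implementation: the abstract does not specify how Interaction Graphs are represented or searched, so the dictionary layout, the page paths, the action labels, and the use of breadth-first search are all assumptions chosen for illustration.

```python
from collections import deque

# Hypothetical interaction graph: page -> list of (action label, destination page).
# Pages and actions are invented for illustration; they do not come from the paper.
INTERACTION_GRAPH = {
    "/home": [("click 'Catalog'", "/catalog"), ("click 'Account'", "/account")],
    "/catalog": [("click 'Shoes'", "/catalog/shoes")],
    "/account": [("click 'Orders'", "/account/orders")],
    "/catalog/shoes": [],
    "/account/orders": [],
}


def shortest_action_path(graph, start, goal):
    """Breadth-first search over the graph.

    Returns the shortest list of action labels leading from start to goal,
    an empty list if start is already the goal, or None if goal is unreachable.
    """
    if start == goal:
        return []
    visited = {start}
    queue = deque([(start, [])])  # each entry: (current page, actions taken so far)
    while queue:
        page, actions = queue.popleft()
        for action, next_page in graph.get(page, []):
            if next_page in visited:
                continue
            if next_page == goal:
                return actions + [action]
            visited.add(next_page)
            queue.append((next_page, actions + [action]))
    return None


if __name__ == "__main__":
    print(shortest_action_path(INTERACTION_GRAPH, "/home", "/account/orders"))
```

Once such a map exists, reaching a known target page becomes a graph search instead of step-by-step exploration, which is the contrast the abstract draws between probabilistic exploration and deterministic retrieval and pathfinding.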