🤖 AI Summary
Current web agents rely on low-level operations such as clicks and keystrokes, which makes task execution brittle, inefficient, and difficult to verify. This work proposes Web Verbs, a semantic action layer for agent-native web interaction: a web-scale set of typed, semantically documented functions that expose site capabilities through a uniform interface, whether a verb is implemented through an API or a client-side browser workflow. Verbs can carry preconditions, postconditions, policy tags, and logging support, so LLM-based agents can discover, select, and compose them into concise, auditable programs. Case studies with a proof-of-concept implementation show that Web Verbs can reduce interactions involving dozens of primitive steps to a few function calls, improving the reliability, efficiency, and verifiability of task execution.
📝 Abstract
The Web is evolving from a medium that humans browse to an environment where software agents act on behalf of users. Advances in large language models (LLMs) make natural language a practical interface for goal-directed tasks, yet most current web agents operate on low-level primitives such as clicks and keystrokes. These operations are brittle, inefficient, and difficult to verify. Complementing content-oriented efforts such as NLWeb's semantic layer for retrieval, we argue that the agentic web also requires a semantic layer for web actions. We propose \textbf{Web Verbs}, a web-scale set of typed, semantically documented functions that expose site capabilities through a uniform interface, whether implemented through APIs or robust client-side workflows. These verbs serve as stable and composable units that agents can discover, select, and synthesize into concise programs. This abstraction unifies API-based and browser-based paradigms, enabling LLMs to synthesize reliable and auditable workflows with explicit control and data flow. Verbs can carry preconditions, postconditions, policy tags, and logging support, which improves \textbf{reliability} by providing stable interfaces, \textbf{efficiency} by reducing dozens of steps into a few function calls, and \textbf{verifiability} through typed contracts and checkable traces. We present our vision, a proof-of-concept implementation, and representative case studies that demonstrate concise and robust execution compared to existing agents. Finally, we outline a roadmap for standardization to make verbs deployable and trustworthy at web scale.
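To make the abstraction concrete, the following is a minimal sketch of what a typed verb with a precondition, a postcondition, policy tags, and a checkable trace might look like. It is an illustration only, not the authors' implementation: the names `Verb`, `call`, `ContractError`, `search_flights`, and `book_flight`, the `payment` tag, and the in-memory `state` dictionary standing in for a website are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set

State = Dict[str, Any]  # hypothetical stand-in for a site's session state
Args = Dict[str, Any]


class ContractError(Exception):
    """Raised when a policy check, precondition, or postcondition fails."""


@dataclass
class Verb:
    name: str
    doc: str                              # semantic documentation for discovery
    run: Callable[[State, Args], Any]     # an API call or a browser workflow
    pre: Callable[[State, Args], bool]    # must hold before the verb runs
    post: Callable[[State, Any], bool]    # must hold after the verb runs
    policy_tags: List[str] = field(default_factory=list)


def call(verb: Verb, state: State, args: Args,
         trace: List[dict], allowed_tags: Set[str]) -> Any:
    """Run one verb under its contract and record the call in the trace."""
    blocked = set(verb.policy_tags) - allowed_tags
    if blocked:
        raise ContractError(f"{verb.name}: policy tags not allowed: {sorted(blocked)}")
    if not verb.pre(state, args):
        raise ContractError(f"{verb.name}: precondition failed")
    result = verb.run(state, args)
    if not verb.post(state, result):
        raise ContractError(f"{verb.name}: postcondition failed")
    trace.append({"verb": verb.name, "args": args, "result": result})
    return result


def _book(state: State, args: Args) -> dict:
    # Toy implementation: record the booking and return a confirmation.
    flight_id = args["flight_id"]
    state["bookings"].append(flight_id)
    return {"flight_id": flight_id, "price": state["flights"][flight_id]}


search_flights = Verb(
    name="search_flights",
    doc="Return ids of flights priced at or below max_price.",
    run=lambda s, a: sorted(f for f, p in s["flights"].items() if p <= a["max_price"]),
    pre=lambda s, a: a["max_price"] > 0,
    post=lambda s, r: isinstance(r, list),
)

book_flight = Verb(
    name="book_flight",
    doc="Book a flight by id for the logged-in user.",
    run=_book,
    pre=lambda s, a: s["logged_in"] and a["flight_id"] in s["flights"],
    post=lambda s, r: r["flight_id"] in s["bookings"],
    policy_tags=["payment"],
)

# A two-call program of the kind an agent might synthesize.
state: State = {"logged_in": True, "flights": {"F1": 120, "F2": 95}, "bookings": []}
trace: List[dict] = []
options = call(search_flights, state, {"max_price": 100}, trace, {"payment"})
confirmation = call(book_flight, state, {"flight_id": options[0]}, trace, {"payment"})
```

In this sketch the whole task is two function calls with explicit data flow (`options` feeds `book_flight`), the contracts are checked at every call, and `trace` is the record a verifier could inspect afterwards. Calling `book_flight` without `payment` in `allowed_tags` raises `ContractError` before any side effect occurs, which is one plausible reading of how policy tags could gate execution.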