🤖 AI Summary
Current AI web agents are unreliable and hard to reuse for web automation tasks such as form filling and information retrieval, and they depend heavily on repeated human guidance. This paper proposes a workflow synthesis framework augmented with execution guards, which automatically synthesizes reusable, verifiable workflows from both successful and failed execution traces via program synthesis, execution trace analysis, and anomaly detection. During execution, the agent uses the guards to detect anomalies and trigger self-repair, while users can monitor progress and see why a failure occurred. Across fifteen tasks, the approach raises the task success rate from 24.2% to 70.1%. A user study with nine participants found that users completed web tasks with a higher success rate and less guidance than with two baseline methods. The core contribution is a reusable workflow generation approach with online, feedback-driven correction.
📝 Abstract
AI-powered web agents have the potential to automate repetitive tasks, such as form filling, information retrieval, and scheduling, but they struggle to execute these tasks reliably without human intervention, requiring users to provide detailed guidance during every run. We address this limitation by automatically synthesizing reusable workflows from an agent's successful and failed attempts. These workflows incorporate execution guards that help agents detect and fix errors while keeping users informed of progress and issues. Our approach enables agents to complete repetitive tasks of the same type with minimal intervention, raising the success rate from 24.2% to 70.1% across fifteen tasks. To evaluate this approach, we conducted a study with nine users and found that our agent helped them complete web tasks with a higher success rate and less guidance than two baseline methods, and that it allowed users to easily monitor agent behavior and understand failures.
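To make the idea of an execution guard concrete, here is a minimal sketch of one way a guarded workflow could be represented and run. It is an illustration only, not the paper's implementation: the `Step`, `RunReport`, and `run_workflow` names, the dictionary used as page state, and the toy form-filling steps are all assumptions introduced here. Each step pairs an action with a guard that checks the action's effect; if the guard fails, an optional repair is tried once, and the run log keeps the user informed of what happened.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for browser/page state; a real agent would query the DOM.
State = Dict[str, str]


@dataclass
class Step:
    name: str
    action: Callable[[State], None]                    # performs the step
    guard: Callable[[State], bool]                     # checks the step had its intended effect
    repair: Optional[Callable[[State], None]] = None   # optional fix tried once on guard failure


@dataclass
class RunReport:
    succeeded: bool
    log: List[str] = field(default_factory=list)       # human-readable progress for the user


def run_workflow(steps: List[Step], state: State) -> RunReport:
    """Run steps in order, checking each guard and attempting one repair on failure."""
    report = RunReport(succeeded=True)
    for step in steps:
        step.action(state)
        if step.guard(state):
            report.log.append(f"{step.name}: ok")
            continue
        report.log.append(f"{step.name}: guard failed")
        if step.repair is not None:
            step.repair(state)
            if step.guard(state):
                report.log.append(f"{step.name}: repaired")
                continue
        # No repair available, or the repair did not satisfy the guard.
        report.log.append(f"{step.name}: needs user guidance")
        report.succeeded = False
        break
    return report


# Toy form-filling workflow (all steps are simulated).
def type_email(state: State) -> None:
    state["email"] = "user@example"        # simulates an incomplete entry


def fix_email(state: State) -> None:
    state["email"] = "user@example.com"    # simulated repair


def submit(state: State) -> None:
    state["submitted"] = "yes"


steps = [
    Step("fill_email", type_email, lambda s: s.get("email", "").endswith(".com"), fix_email),
    Step("submit", submit, lambda s: s.get("submitted") == "yes"),
]
report = run_workflow(steps, {})
for line in report.log:
    print(line)
```

In this toy run the first guard fails, the repair succeeds, and the workflow continues to the submit step. A step whose guard fails with no working repair stops the run and is logged as needing user guidance, which mirrors the abstract's goal of keeping intervention minimal but visible.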