When Bots Take the Bait: Exposing and Mitigating the Emerging Social Engineering Attack in Web Automation Agent

📅 2026-01-12

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This work addresses a new class of social engineering attacks on web automation agents, in which inducement contexts planted in a webpage steer the agent away from its intended task. Existing defenses struggle to mitigate these threats. The authors introduce AgentBait, described as the first attack paradigm designed specifically for web automation agents, and propose SUPERVISOR, a lightweight, plug-and-play runtime module that blocks such attacks by checking that the web environment is consistent with the agent's intended task. In the reported experiments, AgentBait achieves an average success rate of 67.5% against mainstream agent frameworks, while integrating SUPERVISOR reduces attack success rates by up to 78.1% on average at a runtime overhead of only 7.7%.

📝 Abstract

Web agents powered by large language models (LLMs) are increasingly deployed to automate complex web interactions. The rise of open-source frameworks (e.g., Browser Use, Skyvern-AI) has accelerated adoption but has also broadened the attack surface. While prior research has focused on model-level threats such as prompt injection and backdoors, the risks of social engineering remain largely unexplored. We present the first systematic study of social engineering attacks against web automation agents and design a pluggable runtime mitigation. On the attack side, we introduce the AgentBait paradigm, which exploits intrinsic weaknesses in agent execution: inducement contexts can distort the agent's reasoning and steer it toward malicious objectives misaligned with the intended task. On the defense side, we propose SUPERVISOR, a lightweight runtime module that checks whether the webpage environment is consistent with the agent's intended goals and blocks unsafe operations before execution. Empirical results show that mainstream frameworks are highly vulnerable to AgentBait, with an average attack success rate of 67.5% and peaks above 80% under specific strategies (e.g., trusted identity forgery). Compared with existing lightweight defenses, our module can be seamlessly integrated across different web automation frameworks and reduces attack success rates by up to 78.1% on average while incurring only a 7.7% runtime overhead and preserving usability. This work reveals AgentBait as a critical new threat surface for web agents and establishes a practical, generalizable defense, advancing the security of this rapidly emerging ecosystem. We reported the details of this attack to the framework developers and received acknowledgment before submission.
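
To make the idea of a pre-execution consistency check concrete, here is a minimal sketch in Python. It is not the SUPERVISOR implementation, which the abstract does not describe in detail. It is a simple rule-based stand-in that compares a proposed action against the user's stated task before the action runs. All names (`TaskIntent`, `ProposedAction`, `check_consistency`) and the example domains are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple
from urllib.parse import urlparse


@dataclass(frozen=True)
class TaskIntent:
    """What the user asked for (hypothetical structure, not from the paper)."""
    description: str
    allowed_domains: FrozenSet[str]
    allowed_actions: FrozenSet[str]
    # Data the task never needs to enter, e.g. credentials.
    sensitive_fields: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class ProposedAction:
    """An action the agent wants to execute next (hypothetical structure)."""
    kind: str                          # e.g. "click", "type", "navigate"
    page_url: str                      # page the agent is currently on
    target_url: Optional[str] = None   # where the action would lead, if known
    field_name: Optional[str] = None   # form field being filled, if any


def _host_allowed(url: str, allowed: FrozenSet[str]) -> bool:
    """True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed)


def check_consistency(intent: TaskIntent, action: ProposedAction) -> Tuple[bool, str]:
    """Return (allowed, reason). Runs before the agent executes the action."""
    if action.kind not in intent.allowed_actions:
        return False, f"action '{action.kind}' is outside the task scope"
    if not _host_allowed(action.page_url, intent.allowed_domains):
        return False, "current page is outside the task's domains"
    if action.target_url is not None and not _host_allowed(
        action.target_url, intent.allowed_domains
    ):
        return False, "target URL is outside the task's domains"
    if action.kind == "type" and action.field_name in intent.sensitive_fields:
        return False, f"task does not require entering '{action.field_name}'"
    return True, "consistent with task"


if __name__ == "__main__":
    intent = TaskIntent(
        description="Add a notebook to the cart",
        allowed_domains=frozenset({"shop.example.com"}),
        allowed_actions=frozenset({"click", "type", "navigate"}),
        sensitive_fields=frozenset({"password"}),
    )
    # A look-alike host that merely starts with the trusted domain name.
    bait = ProposedAction(
        kind="navigate",
        page_url="https://shop.example.com/cart",
        target_url="https://shop.example.com.evil.test/verify",
    )
    print(check_consistency(intent, bait))
```

A real defense would have to reason about page content and agent intent, for instance to catch forged trusted identities, which static allowlists like these cannot do. The sketch only illustrates where such a check sits: between the agent's decision and its execution.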

Problem

Research questions and friction points this paper is trying to address.

social engineering

web automation agents

AgentBait

LLM-based agents

attack surface

Innovation

Methods, ideas, or system contributions that make the work stand out.

social engineering attack

web automation agent

AgentBait