WebTrap Park: An Automated Platform for Systematic Security Evaluation of Web Agents

📅 2026-01-13
🤖 AI Summary
This work addresses the lack of systematic, standardized methods for evaluating the security of Web Agents in real-world web environments. It proposes the first non-intrusive, automated evaluation framework that assesses agents by observing their interactions with live websites, and it instantiates three major risk categories as 1,226 executable evaluation tasks. By combining real-web interaction, automatic task generation, and action-level automated assessment, the evaluation shows that agent architecture, and not only the underlying language model, significantly influences security outcomes. Empirical results reveal substantial security disparities across agent frameworks. The authors have made the platform publicly accessible to provide a reproducible and extensible foundation for future Web Agent security evaluation.

📝 Abstract
Web Agents are increasingly deployed to perform complex tasks in real web environments, yet their security evaluation remains fragmented and difficult to standardize. We present WebTrap Park, an automated platform for systematic security evaluation of Web Agents through direct observation of their concrete interactions with live web pages. WebTrap Park instantiates three major sources of security risk into 1,226 executable evaluation tasks and enables action-based assessment without requiring agent modification. Our results reveal clear security differences across agent frameworks, highlighting the importance of agent architecture beyond the underlying model. WebTrap Park is publicly accessible at https://security.fudan.edu.cn/webagent and provides a scalable foundation for reproducible Web Agent security evaluation.
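
To make the idea of action-based assessment without agent modification more concrete, here is a minimal Python sketch. It is not the platform's implementation: the `Action` and `Task` schemas, the three rule types, and every name, host, and secret string are assumptions made for illustration, since the abstract does not say how WebTrap Park represents tasks or judges actions. The sketch only shows the general pattern of scoring a recorded trace of browser actions against per-task rules.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Action:
    """One observed browser action (hypothetical schema, not from the paper)."""
    kind: str        # "click", "type", or "navigate"
    target: str      # element id for click/type, URL for navigate
    value: str = ""  # text entered, for "type" actions


@dataclass(frozen=True)
class Task:
    """A hypothetical evaluation task: rules describing unsafe behaviour."""
    allowed_hosts: frozenset  # hosts the agent may navigate to
    trap_targets: frozenset   # element ids planted as traps on the page
    secrets: tuple            # strings that must never be typed into the page


def unsafe_actions(task, trace):
    """Return the actions in `trace` that violate any rule of `task`."""
    flagged = []
    for action in trace:
        if action.kind == "navigate":
            # Leaving the allowed hosts counts as unsafe in this sketch.
            if urlparse(action.target).hostname not in task.allowed_hosts:
                flagged.append(action)
        elif action.kind == "click":
            if action.target in task.trap_targets:
                flagged.append(action)
        elif action.kind == "type":
            if any(secret in action.value for secret in task.secrets):
                flagged.append(action)
    return flagged


def is_safe(task, trace):
    """A run is safe only if no observed action was flagged."""
    return not unsafe_actions(task, trace)
```

Because the check consumes only a recorded trace, the agent under test needs no hooks or code changes: the observer logs what the agent does in the browser, and `is_safe(task, trace)` can be evaluated after the session ends.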
Problem

Research questions and friction points this paper is trying to address.

Web Agents
security evaluation
systematic assessment
web security
automated platform
Innovation

Methods, ideas, or system contributions that make the work stand out.

Web Agents
Security Evaluation
Automated Platform
Action-based Assessment
Agent Architecture