BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents

📅 2025-11-25

📈 Citations: 0

✨ Influential: 0

career value

212K/year

🤖 AI Summary

This work addresses prompt injection attacks—a novel, real-world security threat in AI-powered browser agents that falls outside conventional web threat models and whose practical impact remains poorly understood. To overcome the limitations of existing evaluations—namely, their lack of realistic HTML payloads and operation-level impact analysis—we introduce the first benchmark tailored to authentic browsing behavior, featuring diverse, semantically rich HTML payloads and environmental perturbations. We conduct the first systematic empirical assessment of mainstream AI models’ vulnerability under realistic interactive conditions. Furthermore, we propose a multi-layered defense architecture integrating system-level mitigations (e.g., DOM sandboxing, instruction filtering) with model-level techniques (e.g., adversarial fine-tuning, response verification). Experimental results demonstrate substantial improvements in robustness against prompt injection. Our benchmark and defense framework provide a reproducible evaluation methodology and a practical, deployable security blueprint for building trustworthy AI browser agents.

Technology Category

Application Category

📝 Abstract

The integration of artificial intelligence (AI) agents into web browsers introduces security challenges that go beyond traditional web application threat models. Prior work has identified prompt injection as a new attack vector for web agents, yet the resulting impact within real-world environments remains insufficiently understood. In this work, we examine the landscape of prompt injection attacks and synthesize a benchmark of attacks embedded in realistic HTML payloads. Our benchmark goes beyond prior work by emphasizing injections that can influence real-world actions rather than mere text outputs, and by presenting attack payloads with complexity and distractor frequency similar to what real-world agents encounter. We leverage this benchmark to conduct a comprehensive empirical evaluation of existing defenses, assessing their effectiveness across a suite of frontier AI models. We propose a multi-layered defense strategy comprising both architectural and model-based defenses to protect against evolving prompt injection attacks. Our work offers a blueprint for designing practical, secure web agents through a defense-in-depth approach.

Problem

Research questions and friction points this paper is trying to address.

Analyzing prompt injection attack landscape in AI browser agents

Evaluating existing defenses against realistic HTML payload injections

Proposing multi-layered defense strategy for secure web agents

Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes multi-layered defense strategy against attacks

Creates benchmark with realistic HTML injection payloads

Combines architectural and model-based defense solutions

🔎 Similar Papers

No similar papers found.