The Landscape of Prompt Injection Threats in LLM Agents: From Taxonomy to Analysis

πŸ“… 2026-02-11
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the vulnerability of large language model (LLM) agents to prompt injection attacks in context-dependent tasks, a threat inadequately evaluated by existing defense mechanisms. Through a systematic literature review and quantitative analysis, the study establishes a comprehensive taxonomy of prompt injection attacks and defenses, and introduces AgentPIβ€”the first security evaluation benchmark specifically designed for context-dependent agent scenarios. Empirical results demonstrate that while current defense methods perform adequately on traditional benchmarks, they fail significantly under AgentPI due to an inability to simultaneously achieve high security, practical usability, and low latency. This failure exposes a critical gap in existing approaches: the neglect of runtime environmental context in security design.

Technology Category

Application Category

πŸ“ Abstract
The evolution of Large Language Models (LLMs) has resulted in a paradigm shift towards autonomous agents, necessitating robust security against Prompt Injection (PI) vulnerabilities where untrusted inputs hijack agent behaviors. This SoK presents a comprehensive overview of the PI landscape, covering attacks, defenses, and their evaluation practices. Through a systematic literature review and quantitative analysis, we establish taxonomies that categorize PI attacks by payload generation strategies (heuristic vs. optimization) and defenses by intervention stages (text, model, and execution levels). Our analysis reveals a key limitation shared by many existing defenses and benchmarks: they largely overlook context-dependent tasks, in which agents are authorized to rely on runtime environmental observations to determine actions. To address this gap, we introduce AgentPI, a new benchmark designed to systematically evaluate agent behavior under context-dependent interaction settings. Using AgentPI, we empirically evaluate representative defenses and show that no single approach can simultaneously achieve high trustworthiness, high utility, and low latency. Moreover, we show that many defenses appear effective under existing benchmarks by suppressing contextual inputs, yet fail to generalize to realistic agent settings where context-dependent reasoning is essential. This SoK distills key takeaways and open research problems, offering structured guidance for future research and practical deployment of secure LLM agents.
Problem

Research questions and friction points this paper is trying to address.

Prompt Injection
LLM Agents
Context-Dependent Tasks
Security Evaluation
Autonomous Agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompt Injection
LLM Agents
AgentPI
Context-dependent Tasks
Security Benchmark
πŸ”Ž Similar Papers
No similar papers found.