π€ AI Summary
This work addresses the critical security vulnerabilities of navigation agents under prompt injection attacks, which can lead to path misdirection, task failure, or even real-world harm. It presents the first systematic study of prompt injection attacks in embodied navigation scenarios and introduces PINA, a framework that adaptively optimizes malicious prompts to effectively compromise large language modelβbased navigation agents under black-box settings, long-context constraints, and action feasibility requirements. Experimental results demonstrate that PINA achieves an average attack success rate of 87.5% across indoor and outdoor environments, substantially outperforming baseline methods. Furthermore, it maintains robust performance under ablation studies and adaptive attack conditions, revealing the inherent security weaknesses of embodied agents and establishing a targeted attack paradigm for evaluating their resilience.
π Abstract
Navigation agents powered by large language models (LLMs) convert natural language instructions into executable plans and actions. Compared to text-based applications, their security is far more critical: a successful prompt injection attack does not just alter outputs but can directly misguide physical navigation, leading to unsafe routes, mission failure, or real-world harm. Despite this high-stakes setting, the vulnerability of navigation agents to prompt injection remains largely unexplored. In this paper, we propose PINA, an adaptive prompt optimization framework tailored to navigation agents under black-box, long-context, and action-executable constraints. Experiments on indoor and outdoor navigation agents show that PINA achieves high attack success rates with an average ASR of 87.5%, surpasses all baselines, and remains robust under ablation and adaptive-attack conditions. This work provides the first systematic investigation of prompt injection attacks in navigation and highlights their urgent security implications for embodied LLM agents.