AegisAgent: An Autonomous Defense Agent Against Prompt Injection Attacks in LLM-HARs

๐Ÿ“… 2025-12-24
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
To address the vulnerability of large language model (LLM)-driven wearable human activity recognition (HAR) systems to semantic prompt injection attacks, this paper proposes a novel proactive cognitive defense paradigm that transcends conventional static filtering. Methodologically, it introduces the first autonomous reasoningโ€“based lightweight defense agent architecture, integrating dynamic interaction memory, multi-step semantic verification, and intent repair mechanisms to enable real-time perception, semantic understanding, and runtime mitigation of adversarial instructions. The system supports full-stack deployment, incorporating memory-augmented retrieval, multi-stage verification planning, and repair generation. Evaluations across 15 representative attack types and five state-of-the-art LLM-HAR systems demonstrate an average 30% reduction in attack success rate, with only 78.6 ms GPU inference latency overhead.

Technology Category

Application Category

๐Ÿ“ Abstract
The integration of Large Language Models (LLMs) into wearable sensing is creating a new class of mobile applications capable of nuanced human activity understanding. However, the reliability of these systems is critically undermined by their vulnerability to prompt injection attacks, where attackers deliberately input deceptive instructions into LLMs. Traditional defenses, based on static filters and rigid rules, are insufficient to address the semantic complexity of these new attacks. We argue that a paradigm shift is needed -- from passive filtering to active protection and autonomous reasoning. We introduce AegisAgent, an autonomous agent system designed to ensure the security of LLM-driven HAR systems. Instead of merely blocking threats, AegisAgent functions as a cognitive guardian. It autonomously perceives potential semantic inconsistencies, reasons about the user's true intent by consulting a dynamic memory of past interactions, and acts by generating and executing a multi-step verification and repair plan. We implement AegisAgent as a lightweight, full-stack prototype and conduct a systematic evaluation on 15 common attacks with five state-of-the-art LLM-based HAR systems on three public datasets. Results show it reduces attack success rate by 30% on average while incurring only 78.6 ms of latency overhead on a GPU workstation. Our work makes the first step towards building secure and trustworthy LLM-driven HAR systems.
Problem

Research questions and friction points this paper is trying to address.

Defends LLM-based activity recognition from prompt injection attacks
Replaces static filters with autonomous reasoning for semantic threats
Ensures security by verifying intent and repairing system responses
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autonomous agent system for active protection
Dynamic memory and reasoning for intent verification
Multi-step verification and repair plan execution
๐Ÿ”Ž Similar Papers
No similar papers found.