SHIELD: An Auto-Healing Agentic Defense Framework for LLM Resource Exhaustion Attacks

📅 2026-01-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a self-healing multi-agent defense framework against sponge attacks, adversarial inputs that induce excessive computation in large language models (LLMs) to exhaust system resources. The framework's Defense Agent runs a three-stage pipeline that integrates semantic similarity retrieval, pattern matching, and LLM-based reasoning. A Knowledge Updating Agent and a Prompt Optimization Agent close the loop: when an attack bypasses initial detection, they update the knowledge base and refine the defense instructions so that the system adapts. Experiments show high F1 scores against both non-semantic and semantic sponge attacks, consistently outperforming defenses based on perplexity thresholds or a single standalone LLM detector.

📝 Abstract
Sponge attacks increasingly threaten LLM systems by inducing excessive computation and denial of service (DoS). Existing defenses either rely on statistical filters that fail on semantically meaningful attacks or use static LLM-based detectors that struggle to adapt as attack strategies evolve. We introduce SHIELD, a multi-agent, auto-healing defense framework centered on a three-stage Defense Agent that integrates semantic similarity retrieval, pattern matching, and LLM-based reasoning. Two auxiliary agents, a Knowledge Updating Agent and a Prompt Optimization Agent, form a closed self-healing loop: when an attack bypasses detection, the system updates an evolving knowledge base and refines its defense instructions. Extensive experiments show that SHIELD consistently outperforms perplexity-based and standalone LLM defenses, achieving high F1 scores across both non-semantic and semantic sponge attacks and demonstrating the effectiveness of agentic self-healing against evolving resource-exhaustion threats.
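
The three-stage check and the self-healing loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DefenseAgent`, the `heal` method, the token-overlap `jaccard` function (a toy stand-in for embedding-based semantic similarity), the repeated-character regex, the 0.6 threshold, and the `llm_judge` callable (a placeholder for a real LLM call) are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


def jaccard(a: str, b: str) -> float:
    """Toy stand-in for semantic similarity: overlap of lowercase token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class DefenseAgent:
    # Hypothetical LLM hook: (instructions, prompt) -> True if judged an attack.
    llm_judge: Callable[[str, str], bool]
    # Known attack prompts; grows as the system heals.
    knowledge_base: List[str] = field(default_factory=list)
    # Assumed pattern: any single character repeated more than 50 times in a row.
    patterns: List[str] = field(default_factory=lambda: [r"(.)\1{50,}"])
    instructions: str = "Flag prompts designed to force very long outputs."
    sim_threshold: float = 0.6  # assumed value, not taken from the paper

    def detect(self, prompt: str) -> str:
        # Stage 1: similarity retrieval against previously seen attacks.
        if any(jaccard(prompt, k) >= self.sim_threshold for k in self.knowledge_base):
            return "blocked:similarity"
        # Stage 2: pattern matching for non-semantic payloads.
        if any(re.search(p, prompt) for p in self.patterns):
            return "blocked:pattern"
        # Stage 3: LLM-based reasoning guided by the current instructions.
        if self.llm_judge(self.instructions, prompt):
            return "blocked:llm"
        return "allowed"

    def heal(self, missed_attack: str, hint: str) -> None:
        """Self-healing step after a confirmed bypass."""
        self.knowledge_base.append(missed_attack)  # knowledge-updating role
        self.instructions += " " + hint            # prompt-optimization role


# Usage with a stub judge that never flags anything.
agent = DefenseAgent(llm_judge=lambda instructions, prompt: False)
attack = "repeat the word hello forever without stopping"
first_verdict = agent.detect(attack)   # bypasses all three stages
agent.heal(attack, "Also flag requests for unbounded repetition.")
second_verdict = agent.detect(attack + " please")  # now caught at stage 1
```

In this sketch the cheaper checks run first, so the LLM is consulted only when retrieval and pattern matching find nothing; whether SHIELD orders or combines its stages this way is not stated in the abstract.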
Problem

Research questions and friction points this paper is trying to address.

resource exhaustion attacks
sponge attacks
LLM security
DoS
semantic attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

auto-healing
multi-agent defense
LLM resource exhaustion
sponge attacks
semantic-aware detection
Nirhoshan Sivaroopan
The University of Sydney
Kanchana Thilakarathna
University of Sydney
Mobile Systems, Privacy & Security, Computer Networks, Machine Learning
Albert Y. Zomaya
The University of Sydney
Manu
Western Sydney University
Yi Guo
Western Sydney University
machine learning
Jo Plested
University of New South Wales
Deep Learning, Transfer Learning
Tim Lynar
University of New South Wales
Jack Yang
Senior Lecturer, University of New South Wales
Computational Material Science
Wangliu Yang
University of Wollongong