When Skills Lie: Hidden-Comment Injection in LLM Agents

📅 2026-02-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work reveals that large language model (LLM) agents parsing skill documents are vulnerable to prompt injection attacks via hidden HTML comments inadvertently retained during Markdown-to-HTML conversion, potentially triggering unintended and sensitive tool invocations. For the first time, the study demonstrates how such HTML comments embedded in skill documentation can be weaponized for adversarial purposes and proposes a defense paradigm that treats skill specifications as untrusted inputs. Through injection experiments and comment mechanism analysis on DeepSeek-V3.2 and GLM-4.5-Air models, the authors design targeted defensive system prompts. Experimental results show that this approach effectively identifies and blocks malicious instructions, significantly enhancing agent robustness against concealed threats within skill documents.

Technology Category

Application Category

📝 Abstract
LLM agents often rely on Skills to describe available tools and recommended procedures. We study a hidden-comment prompt injection risk in this documentation layer: when a Markdown Skill is rendered to HTML, HTML comment blocks can become invisible to human reviewers, yet the raw text may still be supplied verbatim to the model. In experiments, we find that DeepSeek-V3.2 and GLM-4.5-Air can be influenced by malicious instructions embedded in a hidden comment appended to an otherwise legitimate Skill, yielding outputs that contain sensitive tool intentions. A short defensive system prompt that treats Skills as untrusted and forbids sensitive actions prevents these malicious tool calls and instead surfaces the suspicious hidden instructions.
Problem

Research questions and friction points this paper is trying to address.

prompt injection
LLM agents
hidden comment
skill documentation
HTML comment
Innovation

Methods, ideas, or system contributions that make the work stand out.

hidden-comment injection
LLM agents
prompt injection
Skill documentation
defensive prompting
🔎 Similar Papers
No similar papers found.
Qianli Wang
Qianli Wang
DFKI & TU Berlin
ExplainabilityNatural Language Processing
B
Boyang Ma
Shandong University
M
Minghui Xu
Shandong University
Y
Yue Zhang
Shandong University