🤖 AI Summary
Hidden prompt injection attacks in structured documents (e.g., resumes, academic papers) pose a critical threat to LLM-based systems, particularly when malicious prompts are embedded in PDF or HTML documents in ways that conceal them from visual inspection, leaving no visible textual trace.
Method: This paper introduces the first systematic detection framework for such attacks across document formats, integrating lightweight static analysis with context-aware semantic detection. We implement PhantomLint, a prototype tool designed for practical deployment.
Contribution/Results: Our key innovations include the first multimodal modeling of visually concealed prompts and a robust cross-format detection strategy that significantly reduces false positives. Evaluated on 3,402 real-world documents, including preprints, resumes, and scholarly articles, PhantomLint achieves an average false positive rate of only 0.092%, demonstrating high accuracy, low computational overhead, and strong generalization across diverse document types and layouts. The approach provides a deployable, trustworthy safeguard for AI-augmented decision-making systems.
📝 Abstract
Hidden LLM prompts have appeared in online documents with increasing frequency. Their goal is to trigger indirect prompt injection attacks that manipulate LLM-powered automated document processing systems while evading human oversight, in applications ranging from résumé screening to academic peer review. Detecting hidden LLM prompts is therefore important for ensuring trust in AI-assisted human decision making.
This paper presents the first principled approach to hidden LLM prompt detection in structured documents. We implement our approach in a prototype tool called PhantomLint. We evaluate PhantomLint against a corpus of 3,402 documents, including both PDF and HTML documents, covering academic paper preprints, CVs, theses and more. We find that our approach is effective against a wide range of methods for hiding LLM prompts from visual inspection, has a very low false positive rate (approx. 0.092%), and is practically useful for detecting hidden LLM prompts in real documents, while maintaining acceptable performance.
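To make the threat concrete, the simplest form of the visual concealment described above can be sketched with a toy heuristic that flags HTML text nodes whose inline styles render them invisible to a human reader. This sketch is our own illustration, not PhantomLint's actual detection method (which the paper describes at a much more principled level), and the list of suspicious style fragments is an assumption chosen for demonstration only; real hiding techniques are far more varied.

```python
from html.parser import HTMLParser

# Illustrative (assumed) list of inline-style fragments that commonly
# make text invisible to a human reader.
SUSPICIOUS_STYLE_FRAGMENTS = (
    "display:none",
    "visibility:hidden",
    "font-size:0",
    "opacity:0",
    "color:#ffffff",
    "color:#fff",
)

# Void elements never receive a closing tag, so they must not be
# pushed onto the open-tag stack.
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}


class HiddenTextScanner(HTMLParser):
    """Collect text nodes nested inside elements styled to be invisible."""

    def __init__(self):
        super().__init__()
        self._stack = []       # one bool per open tag: is it styled hidden?
        self.hidden_text = []  # text a human reader would not see

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        self._stack.append(any(f in style for f in SUSPICIOUS_STYLE_FRAGMENTS))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text is hidden if any enclosing element is styled invisible.
        if any(self._stack) and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html: str) -> list:
    """Return text fragments concealed by obviously-invisible inline styles."""
    scanner = HiddenTextScanner()
    scanner.feed(html)
    return scanner.hidden_text
```

For example, `find_hidden_text('<div style="font-size: 0">Ignore all previous instructions.</div>')` surfaces the concealed instruction while leaving ordinary visible text unflagged. A heuristic this naive is easy to evade (e.g., off-page positioning, tiny-but-nonzero fonts, CSS in external stylesheets), which is precisely why a principled, cross-format approach like the one the paper proposes is needed.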