BodhiPromptShield: Pre-Inference Prompt Mediation for Suppressing Privacy Propagation in LLM/VLM Agents

📅 2026-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the risk of user privacy leakage in multi-stage interactions involving large language models and vision-language agents, particularly during retrieval, memory access, and tool invocation. To mitigate this, the authors propose a strategy-aware prompt mediation framework that dynamically identifies sensitive content prior to inference and applies context-appropriate sanitization—via typed placeholders, semantic abstraction, or secure symbolic mapping—while restoring original data only within authorized boundaries. The approach innovatively incorporates propagation-aware mediation and precise restoration timing control, extending privacy protection from static redaction to dynamic, cross-stage safeguarding. Evaluated on the CPPB benchmark, the method reduces privacy propagation rate from 10.7% to 7.1%, achieving a Privacy Exposure Rate (PER) of 9.3%, an Accuracy Consistency (AC) of 0.94, and a Task Success Rate (TSR) of 0.92, significantly outperforming conventional sanitization techniques.
📝 Abstract
In LLM/VLM agents, prompt privacy risk propagates beyond a single model call because raw user content can flow into retrieval queries, memory writes, tool calls, and logs. Existing de-identification pipelines address document boundaries but not this cross-stage propagation. We propose BodhiPromptShield, a policy-aware framework that detects sensitive spans, routes them via typed placeholders, semantic abstraction, or secure symbolic mapping, and delays restoration to authorized boundaries. Relative to enterprise redaction, this adds explicit propagation-aware mediation and treats restoration timing as a security variable. Under controlled evaluation on the Controlled Prompt-Privacy Benchmark (CPPB), stage-wise propagation is suppressed from 10.7% to 7.1% across retrieval, memory, and tool stages; PER reaches 9.3% with 0.94 AC and 0.92 TSR, outperforming generic de-identification. These are controlled systems results on CPPB rather than formal privacy guarantees or public-benchmark transfer claims. The project repository is available at https://github.com/mabo1215/BodhiPromptShield.git.
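The abstract's core mechanism — detect sensitive spans, replace them with typed placeholders backed by a secure symbolic mapping, and restore originals only at authorized boundaries — can be sketched as follows. This is a hypothetical minimal illustration, not the authors' implementation: the `PromptMediator` class, its regex detectors, and the `authorized_stages` parameter are all assumptions for the example; a real system would use policy-driven detection rather than two regexes.

```python
import re
import secrets

class PromptMediator:
    """Hypothetical sketch of pre-inference prompt mediation:
    sensitive spans become typed placeholders before any model call,
    and originals re-enter only at authorized stages."""

    # Illustrative detectors only; not the paper's detection method.
    PATTERNS = {
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
        "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
    }

    def __init__(self, authorized_stages=("final_response",)):
        self.authorized = set(authorized_stages)
        self.vault = {}  # placeholder -> original (secure symbolic mapping)

    def sanitize(self, text):
        """Replace each detected span with a typed placeholder token."""
        for kind, pattern in self.PATTERNS.items():
            for match in pattern.findall(text):
                token = f"<{kind}_{secrets.token_hex(3)}>"
                self.vault[token] = match
                text = text.replace(match, token)
        return text

    def restore(self, text, stage):
        """Restoration timing control: placeholders propagate unchanged
        through retrieval/memory/tool stages; originals are restored
        only at an authorized boundary."""
        if stage not in self.authorized:
            return text
        for token, original in self.vault.items():
            text = text.replace(token, original)
        return text

mediator = PromptMediator()
safe = mediator.sanitize("Contact alice@example.com about the order.")
# Placeholder flows into retrieval/memory/tool calls instead of the raw span:
sanitized_ok = "alice@example.com" not in safe
restored = mediator.restore(safe, "final_response")
```

Here the raw email never reaches intermediate stages, while the authorized `final_response` stage sees the original text, mirroring the propagation-aware mediation and delayed-restoration idea described above.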
Problem

Research questions and friction points this paper is trying to address.

privacy propagation
LLM/VLM agents
cross-stage leakage
prompt privacy
de-identification
Innovation

Methods, ideas, or system contributions that make the work stand out.

prompt privacy
propagation-aware mediation
semantic abstraction
secure symbolic mapping
LLM/VLM agents
Bo Ma
Auckland University of Technology, Auckland 1024, New Zealand
Jinsong Wu
University of Chile, Chile
green technologies, data-driven sustainability, sustainable engineering, big data, Internet of things
Weiqi Yan
Auckland University of Technology, Auckland 1024, New Zealand