🤖 AI Summary
This work addresses the vulnerability of prompt-based intellectual property in large language model (LLM) agents: in untrusted environments, prompts can be easily stolen and reused with other models, leading to economic losses. To counter this, the authors propose a proactive prompt protection mechanism that remains effective at runtime. It anchors semantic meaning through code-like symbols and injects functionality-preserving noise guided by feedback from the target model. The resulting obfuscated prompts are effective only on the designated LLM and resist cross-model portability. According to the authors, this approach is the first to simultaneously satisfy four critical requirements: proactivity, runtime protection, usability, and non-portability. Experiments across diverse agent systems, datasets, and base models show that the method significantly reduces prompt transferability while preserving performance on the target model and remaining robust against adaptive attacks.
📝 Abstract
LLM agents rely on prompts to implement task-specific capabilities on top of foundation LLMs, making agent prompts valuable intellectual property. However, in untrusted deployments, adversaries can copy these prompts and reuse them with other proprietary LLMs, causing economic losses. We identify four key requirements for protecting these prompts that existing approaches fail to address: proactivity, runtime protection, usability, and non-portability. We present PragLocker, a prompt protection scheme that satisfies these requirements. PragLocker constructs function-preserving obfuscated prompts by anchoring semantics with code symbols and then using target-model feedback to inject noise, yielding prompts that only work on the target LLM. Experiments across multiple agent systems, datasets, and foundation LLMs show that PragLocker substantially reduces cross-LLM portability, maintains target performance, and remains robust against adaptive attackers.
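The feedback-guided noise injection described above can be illustrated with a toy sketch. This is not PragLocker's actual algorithm, which the abstract does not specify. It assumes a simple greedy loop, and every name in it (`obfuscate`, `toy_target_score`, the noise vocabulary) is hypothetical. The scoring function is a stand-in for querying the target model: here it only checks that two code-like anchor phrases survive intact.

```python
import random


def toy_target_score(prompt):
    # Hypothetical stand-in for a target-model query. A real system would
    # run the agent task on the target LLM and measure task performance.
    # This stub returns 1.0 only if both anchor phrases are still contiguous.
    anchors = ("def summarize(text):", "return summary")
    return 1.0 if all(a in prompt for a in anchors) else 0.0


def obfuscate(prompt, target_score, noise_vocab, steps=20, tolerance=0.0, seed=0):
    """Greedy noise injection guided by target-model feedback (toy sketch)."""
    rng = random.Random(seed)
    tokens = prompt.split()
    baseline = target_score(prompt)
    for _ in range(steps):
        candidate = list(tokens)
        position = rng.randrange(len(candidate) + 1)
        candidate.insert(position, rng.choice(noise_vocab))
        # Keep the noisy candidate only if the target score is preserved.
        if target_score(" ".join(candidate)) >= baseline - tolerance:
            tokens = candidate
    return " ".join(tokens)


original = "def summarize(text): return summary"
protected = obfuscate(original, toy_target_score, ["@@", "~zx", "%%q"])
print(protected)
```

The sketch covers only the function-preserving half of the idea: noise is accepted when the target score does not drop. Non-portability would additionally require checking that other models fail on the obfuscated prompt, which this toy does not model.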