AI Summary
In large language model (LLM) inference, submitting user prompts containing sensitive information to proprietary APIs risks privacy leakage. To address this, we propose the first cryptography-inspired prompt sanitization framework. We formally model sensitive terms as either format-dependent (e.g., ID numbers) or value-dependent (e.g., disease names), and protect them respectively via format-preserving encryption (FPE) and metric differential privacy (mDP). We further establish a privacy-utility trade-off evaluation framework that maximizes response quality under provable privacy guarantees. Extensive experiments across multiple NLP tasks demonstrate that our approach significantly outperforms existing baselines: it achieves stronger formal privacy guarantees while preserving high response utility.
Abstract
The rise of large language models (LLMs) has introduced new privacy challenges, particularly during inference, where sensitive information in prompts may be exposed to proprietary LLM APIs. In this paper, we address the problem of formally protecting the sensitive information contained in a prompt while maintaining response quality. To this end, first, we introduce a cryptographically inspired notion of a prompt sanitizer, which transforms an input prompt to protect its sensitive tokens. Second, we propose Pr$\epsilon\epsilon$mpt, a novel system that implements a prompt sanitizer. Pr$\epsilon\epsilon$mpt categorizes sensitive tokens into two types: (1) those where the LLM's response depends solely on the format (such as SSNs and credit card numbers), for which we use format-preserving encryption (FPE); and (2) those where the response depends on specific values (such as age and salary), for which we apply metric differential privacy (mDP). Our evaluation demonstrates that Pr$\epsilon\epsilon$mpt is a practical method to achieve meaningful privacy guarantees while maintaining high utility compared to unsanitized prompts, outperforming prior methods.
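To make the two sanitization mechanisms concrete, here is a minimal sketch, not the paper's actual Pr$\epsilon\epsilon$mpt implementation. The function names (`mdp_sanitize`, `toy_fpe_digits`) are illustrative assumptions; value-dependent tokens are perturbed with Laplace noise (the standard mechanism for mDP over the absolute-difference metric), while the format-dependent case uses a keyed digit substitution as a stand-in for real FPE (a production system would use a NIST-standardized cipher such as FF1/FF3-1).

```python
import random

def mdp_sanitize(value: float, epsilon: float) -> float:
    """Perturb a numeric token (e.g., age, salary) with Laplace noise.

    The difference of two Exp(epsilon) draws is Laplace(0, 1/epsilon),
    which gives epsilon * |x - x'| indistinguishability under the
    absolute-difference metric -- the mDP guarantee for value-dependent
    tokens.
    """
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return value + noise

def toy_fpe_digits(token: str, key: int) -> str:
    """Toy stand-in for format-preserving encryption of digit strings.

    Digits are mapped through a keyed permutation; non-digit characters
    (dashes, spaces) pass through, so the output keeps the exact format
    of an SSN or credit card number. Real FPE (e.g., FF3-1) would be
    used in practice.
    """
    perm = list("0123456789")
    random.Random(key).shuffle(perm)  # keyed, deterministic permutation
    return "".join(perm[int(c)] if c.isdigit() else c for c in token)
```

For example, `toy_fpe_digits("123-45-6789", key=7)` yields another 11-character string with dashes in the same positions, so the downstream LLM still recognizes it as an SSN-shaped token, while `mdp_sanitize(34.0, 1.0)` returns the age plus calibrated noise.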