Privacy-Aware In-Context Learning for Large Language Models

📅 2025-09-16
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
To address privacy risks from sensitive information leaking out of prompts during in-context learning (ICL) with large language models (LLMs), this paper proposes the first inference-time, fine-tuning-free framework for differentially private synthetic text generation. The method integrates differential privacy directly into ICL inference: it perturbs token-level output distributions and combines public and private inference results via weighted aggregation, giving end-to-end privacy guarantees. Because no model fine-tuning is required, the framework is fully compatible with off-the-shelf LLMs and provides a theoretically rigorous ε-differential-privacy guarantee. Empirical evaluation across multiple ICL benchmarks shows that the approach outperforms existing methods: under stringent privacy budgets (ε ≤ 2), it maintains superior text coherence and task accuracy, easing the privacy–utility trade-off.
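
The summary names a token-level perturbation mechanism and weighted aggregation but gives no formulas. The sketch below shows one standard way such per-token ε-DP aggregation could be realized: run the model once per private record, average the resulting next-token distributions, and add Laplace noise calibrated to the mean's L1 sensitivity. All names here (`dp_aggregate_next_token`, `private_dists`) are hypothetical, and the paper's actual mechanism may differ.

```python
import numpy as np

def dp_aggregate_next_token(private_dists: np.ndarray, epsilon: float) -> np.ndarray:
    """Aggregate per-record next-token distributions with epsilon-DP (sketch).

    private_dists has shape (n_records, vocab_size); each row is the model's
    next-token probability distribution when prompted with one private record.
    Swapping one record moves the row mean by at most 2/n in L1 norm, so
    Laplace noise with scale 2 / (n * epsilon) gives epsilon-DP for this token.
    """
    n, vocab_size = private_dists.shape
    mean_dist = private_dists.mean(axis=0)
    scale = 2.0 / (n * epsilon)                       # L1 sensitivity / epsilon
    noisy = mean_dist + np.random.laplace(scale=scale, size=vocab_size)
    noisy = np.clip(noisy, 0.0, None)                 # project back toward the simplex
    return noisy / max(noisy.sum(), 1e-12)            # renormalize (post-processing)
```

Clipping and renormalizing after the noise are post-processing, so they do not weaken the guarantee; the per-token budgets then compose over the length of the generated text.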

📝 Abstract
Large language models (LLMs) have significantly transformed natural language understanding and generation, but they raise privacy concerns due to potential exposure of sensitive information. Studies have highlighted the risk of information leakage, where adversaries can extract sensitive information embedded in the prompts. In this work, we introduce a novel private prediction framework for generating high-quality synthetic text with strong privacy guarantees. Our approach leverages the Differential Privacy (DP) framework to ensure worst-case theoretical bounds on information leakage without requiring any fine-tuning of the underlying models. The proposed method performs inference on private records and aggregates the resulting per-token output distributions. This enables the generation of longer and more coherent synthetic text while maintaining privacy guarantees. Additionally, we propose a simple blending operation that combines private and public inference to further enhance utility. Empirical evaluations demonstrate that our approach outperforms previous state-of-the-art methods on in-context learning (ICL) tasks, making it a promising direction for privacy-preserving text generation while maintaining high utility.
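
The abstract does not spell out the blending operation; a natural reading, sketched below under that assumption, is a convex combination of the DP-aggregated private distribution with a public (non-sensitive) distribution. Since the private distribution is already differentially private, the blend is post-processing and spends no extra budget. `blend_and_sample`, `lam`, and the helpers in the commented loop are hypothetical names, not the paper's API.

```python
import numpy as np

def blend_and_sample(private_dist: np.ndarray,
                     public_dist: np.ndarray,
                     lam: float = 0.5) -> int:
    """Mix private and public next-token distributions, then sample a token.

    private_dist is assumed to already carry a DP guarantee, so mixing in a
    public distribution is post-processing. lam = 1.0 ignores the public model;
    smaller lam trades privacy-noise damage for public-model fluency.
    """
    mixed = lam * private_dist + (1.0 - lam) * public_dist
    mixed = mixed / mixed.sum()
    return int(np.random.choice(len(mixed), p=mixed))

# Hypothetical autoregressive loop: one DP aggregation and one blend per token.
# tokens = []
# for _ in range(max_new_tokens):
#     priv = dp_aggregate_next_token(per_record_dists(tokens), eps_per_token)
#     tokens.append(blend_and_sample(priv, public_dist_for(tokens)))
```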
Problem

Research questions and friction points this paper is trying to address.

Addressing privacy risks in large language models
Preventing sensitive information leakage from prompts
Generating private synthetic text with high utility
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differential Privacy framework for worst-case bounds (the formal guarantee is restated after this list)
Aggregates per-token distributions for coherent text
Blends private and public inference to enhance utility
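
For reference, the worst-case bound the list above invokes is the standard definition of ε-differential privacy; the statement below is textbook DP, not a result specific to this paper.

```latex
% A randomized mechanism M is \varepsilon-DP if, for all neighboring
% datasets D, D' (differing in one private record) and all output sets S:
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S]
% Basic composition: if each of T generated tokens is \varepsilon_t-DP,
% the whole generation is (\sum_{t=1}^{T} \varepsilon_t)-DP.
```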
👥 Authors

Bishnu Bhusal
University of Missouri, 411 S 6th St, Columbia, MO 65211, USA

Manoj Acharya
SRI International
Artificial Intelligence · Computer Vision · NLP · Visual Question Answering

Ramneet Kaur
Advanced Computer Scientist, SRI International
Trustworthy AI · Interpretability · Reliability · Conformal Prediction · GenAI

Colin Samplawski
SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA

Anirban Roy
SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA

Adam D. Cobb
SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA

Rohit Chadha
University of Missouri
Formal Methods · Security · Software Verification · Applications of Logic to Computer Science

Susmit Jha
Director, Neurosymbolic Computing and Intelligence, SRI International
Artificial Intelligence · Autonomy · Formal Methods · Machine Learning