Public Data Assisted Differentially Private In-Context Learning

📅 2025-09-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address privacy risks from prompt leakage in large language models’ (LLMs) in-context learning (ICL) and the substantial utility degradation under differential privacy (DP), this paper proposes a public-data-augmented DP-ICL framework. Our method integrates task-relevant public data into private ICL, enabling privacy-aware prompt engineering and controlled noise injection to guarantee strict $(varepsilon,delta)$-DP while enhancing model utility. Experiments across text classification and question answering tasks demonstrate an average 12.3% accuracy improvement over non-private baselines and significant gains over existing DP-ICL approaches. Moreover, the framework effectively resists membership inference attacks. By synergistically leveraging public data and calibrated perturbation, our approach achieves a practical balance between rigorous privacy protection and high task performance—advancing the state of privacy-preserving ICL.

Technology Category

Application Category

📝 Abstract
In-context learning (ICL) in Large Language Models (LLMs) has shown remarkable performance across various tasks without requiring fine-tuning. However, recent studies have highlighted the risk of private data leakage through the prompt in ICL, especially when LLMs are exposed to malicious attacks. While differential privacy (DP) provides strong privacy guarantees, it often significantly reduces the utility of in-context learning (ICL). To address this challenge, we incorporate task-related public data into the ICL framework while maintaining the DP guarantee. Based on this approach, we propose a private in-context learning algorithm that effectively balances privacy protection and model utility. Through experiments, we demonstrate that our approach significantly improves the utility of private ICL with the assistance of public data. Additionally, we show that our method is robust against membership inference attacks, demonstrating empirical privacy protection.
Problem

Research questions and friction points this paper is trying to address.

Addresses private data leakage risk in in-context learning
Balances differential privacy protection with model utility
Uses public data to enhance private learning performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Public data integration for privacy
Differential privacy with utility preservation
Robust against membership inference attacks
🔎 Similar Papers
No similar papers found.