How Private is Your Attention? Bridging Privacy with In-Context Learning

📅 2025-04-22

📈 Citations: 0

✨ Influential: 0

career value

196K/year

🤖 AI Summary

This work investigates the feasibility and theoretical foundations of in-context learning (ICL) in Transformer models under differential privacy (DP) constraints. Addressing the lack of rigorous theoretical support for ICL in privacy-preserving settings, we establish the first formal connection between DP and linear-regression-style ICL, characterizing a precise quantitative trade-off between privacy budget ε and ICL performance. We propose a differentially private pre-training algorithm tailored to linear attention heads, achieving both theoretical optimality and empirical robustness: under ε-DP guarantees, its generalization error bound strictly improves upon that of standard ridge regression, and it exhibits strong resilience against adversarial prompt perturbations. Comprehensive numerical simulations across diverse data distributions and noise regimes empirically validate our theoretical analysis, consistently demonstrating bounded estimation error and high stability.

Technology Category

Application Category

📝 Abstract

In-context learning (ICL)-the ability of transformer-based models to perform new tasks from examples provided at inference time-has emerged as a hallmark of modern language models. While recent works have investigated the mechanisms underlying ICL, its feasibility under formal privacy constraints remains largely unexplored. In this paper, we propose a differentially private pretraining algorithm for linear attention heads and present the first theoretical analysis of the privacy-accuracy trade-off for ICL in linear regression. Our results characterize the fundamental tension between optimization and privacy-induced noise, formally capturing behaviors observed in private training via iterative methods. Additionally, we show that our method is robust to adversarial perturbations of training prompts, unlike standard ridge regression. All theoretical findings are supported by extensive simulations across diverse settings.

Problem

Research questions and friction points this paper is trying to address.

Exploring privacy constraints in in-context learning for transformers

Analyzing privacy-accuracy trade-off in differentially private pretraining

Investigating robustness against adversarial prompts in private training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Differentially private pretraining for linear attention

Theoretical analysis of privacy-accuracy trade-off

Robust to adversarial perturbations in training

🔎 Similar Papers

Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions