🤖 AI Summary
This work investigates the feasibility and theoretical foundations of in-context learning (ICL) in Transformer models under differential privacy (DP) constraints. Addressing the lack of rigorous theoretical support for ICL in privacy-preserving settings, we establish the first formal connection between DP and linear-regression-style ICL, characterizing a precise quantitative trade-off between privacy budget ε and ICL performance. We propose a differentially private pre-training algorithm tailored to linear attention heads, achieving both theoretical optimality and empirical robustness: under ε-DP guarantees, its generalization error bound strictly improves upon that of standard ridge regression, and it exhibits strong resilience against adversarial prompt perturbations. Comprehensive numerical simulations across diverse data distributions and noise regimes empirically validate our theoretical analysis, consistently demonstrating bounded estimation error and high stability.
📝 Abstract
In-context learning (ICL)-the ability of transformer-based models to perform new tasks from examples provided at inference time-has emerged as a hallmark of modern language models. While recent works have investigated the mechanisms underlying ICL, its feasibility under formal privacy constraints remains largely unexplored. In this paper, we propose a differentially private pretraining algorithm for linear attention heads and present the first theoretical analysis of the privacy-accuracy trade-off for ICL in linear regression. Our results characterize the fundamental tension between optimization and privacy-induced noise, formally capturing behaviors observed in private training via iterative methods. Additionally, we show that our method is robust to adversarial perturbations of training prompts, unlike standard ridge regression. All theoretical findings are supported by extensive simulations across diverse settings.