Differentially Private Subspace Fine-Tuning for Large Language Models

πŸ“… 2026-01-16
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the significant performance degradation and training instability commonly observed in differentially private (DP) fine-tuning of large language models, which arises from injecting noise into high-dimensional parameter spaces. To mitigate this issue, the authors propose DP-SFT, a novel approach that confines DP noise to a task-relevant low-dimensional gradient subspace. Specifically, the method first identifies dominant gradient directions via principal component analysis, then performs gradient projection, noise injection, and parameter mapping within this reduced subspace. By confining noise to this low-dimensional subspace rather than spreading it across the full parameter space, DP-SFT substantially reduces the injected noise's norm while preserving rigorous privacy guarantees. Experimental results demonstrate that DP-SFT consistently outperforms existing DP fine-tuning baselines across multiple datasets, achieving notable improvements in fine-tuning accuracy, training stability, and convergence speed.
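The first phase described above — recovering dominant gradient directions via principal component analysis — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name, the use of an SVD over a stack of flattened per-sample gradients, and all shapes are assumptions.

```python
import numpy as np

def gradient_subspace(grads: np.ndarray, k: int) -> np.ndarray:
    """Return an orthonormal basis U (d x k) spanning the top-k principal
    directions of a stack of n flattened gradients (n x d)."""
    # Center the gradient matrix, then take the top-k right singular
    # vectors, which are the principal components of the gradients.
    centered = grads - grads.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T  # columns are principal gradient directions

rng = np.random.default_rng(0)
# 32 sample gradients in a 10-dimensional parameter space, concentrated
# along two directions to mimic the low-rank structure the paper exploits.
low_rank = rng.normal(size=(32, 2)) @ rng.normal(size=(2, 10))
grads = low_rank + 0.01 * rng.normal(size=(32, 10))
U = gradient_subspace(grads, k=2)
print(U.shape)  # (10, 2)
```

Because the returned basis is orthonormal, projecting onto it and mapping back (`U @ U.T @ g`) never amplifies a gradient's norm, which is what lets the later noise calibration work in only `k` dimensions.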

πŸ“ Abstract
Fine-tuning large language models on downstream tasks is crucial for realizing their cross-domain potential but often relies on sensitive data, raising privacy concerns. Differential privacy (DP) offers rigorous privacy guarantees and has been widely adopted in fine-tuning; however, naively injecting noise across the high-dimensional parameter space creates perturbations with large norms, degrading performance and destabilizing training. To address this issue, we propose DP-SFT, a two-stage subspace fine-tuning method that substantially reduces noise magnitude while preserving formal DP guarantees. Our intuition is that, during fine-tuning, significant parameter updates lie within a low-dimensional, task-specific subspace, while other directions change minimally. Hence, we only inject DP noise into this subspace to protect privacy without perturbing irrelevant parameters. In phase one, we identify the subspace by analyzing principal gradient directions to capture task-specific update signals. In phase two, we project full gradients onto this subspace, add DP noise, and map the perturbed gradients back to the original parameter space for model updates, markedly lowering noise impact. Experiments on multiple datasets demonstrate that DP-SFT enhances accuracy and stability under rigorous DP constraints, accelerates convergence, and achieves substantial gains over DP fine-tuning baselines.
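Phase two of the abstract — project the full gradient onto the subspace, add DP noise, and map the perturbed gradient back — can be sketched with a DP-SGD-style Gaussian mechanism applied only to the subspace coordinates. This is a hedged sketch under assumptions: the function name, the clip-then-noise order, and the noise calibration (`noise_mult * clip_norm`) follow standard Gaussian-mechanism practice, not necessarily the paper's exact procedure.

```python
import numpy as np

def dp_subspace_step(grad, U, clip_norm, noise_mult, rng):
    """Project grad onto the subspace spanned by the columns of U, clip its
    sensitivity, add Gaussian noise in the k-dimensional coordinates, and
    map the perturbed gradient back to the full parameter space."""
    z = U.T @ grad                                  # subspace coords (k,)
    norm = np.linalg.norm(z)
    z = z * min(1.0, clip_norm / (norm + 1e-12))    # clip to bound sensitivity
    z += rng.normal(scale=noise_mult * clip_norm, size=z.shape)  # DP noise
    return U @ z                                    # back to full space (d,)

rng = np.random.default_rng(1)
d, k = 10, 2
# Random orthonormal basis as a stand-in for the PCA subspace from phase one.
Q, _ = np.linalg.qr(rng.normal(size=(d, k)))
g = rng.normal(size=d)
g_priv = dp_subspace_step(g, Q, clip_norm=1.0, noise_mult=0.5, rng=rng)
print(g_priv.shape)  # (10,)
```

The key point the abstract makes is visible here: noise is drawn in `k` dimensions instead of `d`, so for a fixed privacy budget the perturbation's expected norm scales with the subspace dimension rather than the full parameter count.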
Problem

Research questions and friction points this paper is trying to address.

differential privacy
fine-tuning
large language models
privacy-preserving machine learning
high-dimensional noise
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differential Privacy
Subspace Fine-Tuning
Large Language Models
Gradient Projection
Privacy-Preserving Learning
πŸ”Ž Similar Papers
No similar papers found.