Sample-Efficient Differentially Private Fine-Tuning via Gradient Matrix Denoising

📅 2025-10-01

📈 Citations: 0

✨ Influential: 0

career value

229K/year

🤖 AI Summary

Differential privacy stochastic gradient descent (DP-SGD) suffers from gradient matrix entropy inflation, degradation of intrinsic low-rank structure, and poor sample efficiency during fine-tuning of large language models, primarily due to noise injection. To address this, we propose a gradient denoising method grounded in random matrix theory (RMT), which performs post-hoc low-rank recovery on the noisy gradient matrix without altering the training pipeline or consuming additional privacy budget. Our approach leverages RMT to robustly estimate and restore the underlying low-rank subspace of gradients, thereby preserving structural fidelity under privacy constraints. Crucially, this work establishes the first principled integration of RMT with differentially private optimization, enabling synergistic improvements in both privacy preservation and optimization efficiency. Experiments on RoBERTa across GLUE benchmarks demonstrate that our method significantly reduces the number of training steps and data volume required to reach target accuracy—achieving 30–50% gains in sample efficiency.

Technology Category

Application Category

📝 Abstract

We address the challenge of sample efficiency in differentially private fine-tuning of large language models (LLMs) using DP-SGD. While DP-SGD provides strong privacy guarantees, the added noise significantly increases the entropy of gradient matrices, disrupting their low-rank structure and slowing optimization. We propose a post-processing algorithm that leverages random matrix theory to denoise gradients, restore low-rank structure, and improve alignment with the original signal. Applied to DP-SGD fine-tuning of RoBERTa on GLUE tasks, our method improves sample efficiency compared to state-of-the-art approaches, substantially reducing training time when optimal performance is not required. This work demonstrates that matrix recovery techniques can enhance the utility of private language model training without compromising privacy guarantees.

Problem

Research questions and friction points this paper is trying to address.

Improving sample efficiency in differentially private LLM fine-tuning

Reducing gradient matrix noise to restore low-rank structure

Accelerating private training while maintaining privacy guarantees

Innovation

Methods, ideas, or system contributions that make the work stand out.

Denoises gradients using random matrix theory

Restores low-rank structure in gradient matrices

Improves sample efficiency in private fine-tuning

🔎 Similar Papers

Characterizing the Training Dynamics of Private Fine-tuning with Langevin diffusion