Learning Rate Scheduling with Matrix Factorization for Private Training

📅 2025-11-22

📈 Citations: 0

✨ Influential: 0

career value

203K/year

🤖 AI Summary

This work addresses the mismatch between learning rate scheduling and noise mechanisms in differentially private stochastic gradient descent (DP-SGD). Existing theoretical analyses predominantly assume constant learning rates, rendering them ill-suited for practical training with dynamic schedulers. We propose the first learning-rate-aware matrix decomposition noise design, explicitly incorporating the learning rate sequence into the noise covariance structure. We derive tight upper and lower bounds on convergence error under common schedulers—including Step and Cosine—and demonstrate substantial improvements over conventional prefix-sum decomposition on MaxSE and MeanSE metrics. The method is memory-efficient, compatible with both single- and multi-round DP-SGD, and requires no modifications to optimizer logic. Experiments on CIFAR-10 and IMDB show accuracy gains of 2.1–3.7 percentage points under identical privacy budgets, effectively bridging the theory-practice gap in private optimization.

Technology Category

Application Category

📝 Abstract

We study differentially private model training with stochastic gradient descent under learning rate scheduling and correlated noise. Although correlated noise, in particular via matrix factorizations, has been shown to improve accuracy, prior theoretical work focused primarily on the prefix-sum workload. That workload assumes a constant learning rate, whereas in practice learning rate schedules are widely used to accelerate training and improve convergence. We close this gap by deriving general upper and lower bounds for a broad class of learning rate schedules in both single- and multi-epoch settings. Building on these results, we propose a learning-rate-aware factorization that achieves improvements over prefix-sum factorizations under both MaxSE and MeanSE error metrics. Our theoretical analysis yields memory-efficient constructions suitable for practical deployment, and experiments on CIFAR-10 and IMDB datasets confirm that schedule-aware factorizations improve accuracy in private training.

Problem

Research questions and friction points this paper is trying to address.

Optimizing private training with correlated noise under varying learning rates

Developing theoretical bounds for learning rate schedules in differential privacy

Creating schedule-aware matrix factorizations to enhance private training accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Matrix factorization improves private training accuracy

Learning-rate-aware factorization outperforms prefix-sum methods

Memory-efficient constructions enable practical deployment

🔎 Similar Papers

Banded Square Root Matrix Factorization for Differentially Private Model Training