🤖 AI Summary
This work addresses the tradeoff between memory efficiency and utility in multi-epoch differentially private training by proposing γ-BIFR. Building on banded inverse factorizations, γ-BIFR gives an explicit factorization of the correlation matrix with a tunable parameter γ, unifying and generalizing two existing correlated-noise mechanisms: the low-memory DP-λCGD and the banded inverse square root (BISR) factorization, which is designed for large bandwidths. The approach allows flexible noise buffer sizes and, in the low-bandwidth, low-memory regime, significantly improves RMSE, amplified RMSE, and private training performance relative to these baselines. It also yields tighter theoretical guarantees for the multi-participation error in multi-epoch training.
📝 Abstract
Correlated-noise mechanisms are among the most promising approaches for improving the utility of differentially private model training, but rigorous guarantees require explicit, analyzable factorizations, and practical deployment requires memory efficiency. Recent works have developed banded inverse factorizations, which address both requirements by exploiting a banded structure in the correlation matrix. The bandwidth controls the size of the noise buffer used to correlate noise across iterations, and thus governs the tradeoff between utility and memory cost. Existing factorizations highlight this tradeoff: DP-$λ$CGD achieves high memory efficiency by using only a one-step noise buffer, but this limits its utility gains, while the banded inverse square root (BISR) factorization exploits larger correlation windows and is asymptotically optimal for large bandwidths but performs poorly at low bandwidths. We propose $γ$-BIFR, a unified generalization of both factorizations. In the low-memory, low-bandwidth regime, $γ$-BIFR significantly improves RMSE, amplified RMSE, and private training performance, while yielding tighter theoretical guarantees for multi-participation error in multi-epoch training.
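To make the role of the bandwidth concrete, the following is a minimal sketch of how a banded lower-triangular Toeplitz matrix can be applied to i.i.d. Gaussian noise in a streaming fashion, keeping only the last `b - 1` noise vectors in a buffer. The function names (`sqrt_series_coeffs`, `correlated_noise_stream`) are hypothetical, and the coefficients used here are the power-series coefficients of $\sqrt{1-x}$, chosen purely for illustration. They are an assumption, not the $γ$-BIFR coefficients, which the abstract does not specify.

```python
from collections import deque

import numpy as np


def sqrt_series_coeffs(b):
    """First b power-series coefficients of sqrt(1 - x): 1, -1/2, -1/8, ...

    Illustrative choice only (an assumption, not taken from the source).
    """
    coeffs = np.empty(b)
    coeffs[0] = 1.0
    for k in range(1, b):
        coeffs[k] = coeffs[k - 1] * (k - 1.5) / k
    return coeffs


def correlated_noise_stream(coeffs, num_steps, dim, rng):
    """Yield (z_t, hat_z_t) with hat_z_t = sum_j coeffs[j] * z_{t-j}.

    z_t is fresh standard Gaussian noise. Only the most recent
    len(coeffs) - 1 noise vectors are stored, so memory grows with
    the bandwidth b = len(coeffs), not with the number of steps.
    """
    b = len(coeffs)
    buffer = deque(maxlen=b - 1)  # most recent past noise first
    for _ in range(num_steps):
        z = rng.standard_normal(dim)
        out = coeffs[0] * z
        for j, past in enumerate(buffer, start=1):
            out = out + coeffs[j] * past
        buffer.appendleft(z)  # oldest entry is dropped automatically
        yield z, out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # b = 2 corresponds to a one-step buffer; larger b widens the window.
    for b in (2, 4):
        noise = [hat_z for _, hat_z in
                 correlated_noise_stream(sqrt_series_coeffs(b), 5, 3, rng)]
        print(b, np.stack(noise).shape)
```

With `b = 2` the buffer holds a single past noise vector, which mirrors the one-step buffer described for DP-$λ$CGD; increasing `b` widens the correlation window at a memory cost of `b - 1` stored vectors of the model's dimension.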