🤖 AI Summary
To address the memory and computational intractability of differentially private banded matrix factorization (DP-BandMF) in large-scale settings—where parameters exceed 10⁷ and iterations surpass 10⁴—this paper proposes the first DP-BandMF variant with *unbounded scalability* in both parameter dimensionality and training length. Methodologically, we integrate block-wise low-rank approximation, iterative noise reuse, and sparse banded structure optimization to design a distributed DP-BandMF framework, accompanied by a rigorous privacy–utility trade-off analysis. Experiments on CIFAR-10 and language modeling tasks demonstrate that our approach matches the utility of standard DP-BandMF while reducing memory consumption by 92% and accelerating training by 8.3×. Crucially, it supports arbitrarily large model sizes and training durations, thereby overcoming the fundamental complexity bottlenecks inherent in prior DP-BandMF implementations.
📝 Abstract
Correlated noise mechanisms such as DP Matrix Factorization (DP-MF) have proven to be effective alternatives to DP-SGD in large-epsilon few-epoch training regimes. Significant work has been done to find the best correlated noise strategies, and the current state-of-the-art approach is DP-BandMF, which optimally balances the benefits of privacy amplification and noise correlation. Despite its utility advantages, severe scalability limitations prevent this mechanism from handling large-scale training scenarios where the number of training iterations may exceed $10^4$ and the number of model parameters may exceed $10^7$. In this work, we present techniques to scale up DP-BandMF along these two dimensions, significantly extending its reach and enabling it to handle settings with virtually any number of model parameters and training iterations, with negligible utility degradation.
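To make the banded structure concrete, here is a minimal sketch of how a banded correlated noise mechanism can be streamed. In DP-BandMF the noise added at step $t$ is a linear combination of the last $b$ i.i.d. Gaussian draws, $z_t = \sum_{i<b} c_i w_{t-i}$, so only $b$ noise vectors ever need to be held in memory, independent of the total number of steps. The coefficients `c` below are a hypothetical decaying band chosen for illustration, not an optimized DP-BandMF strategy, and the function takes the raw draws `W` as input rather than sampling them, to keep the example deterministic.

```python
import numpy as np

def banded_correlated_noise(W, c):
    """Stream z_t = sum_{i < b} c[i] * w_{t-i} with a ring buffer of
    b = len(c) past noise vectors: O(b * d) memory, independent of T.
    (Illustrative band coefficients; not an optimized strategy.)"""
    T, d = W.shape
    b = len(c)
    buf = np.zeros((b, d))            # ring buffer of recent Gaussian draws
    Z = np.zeros_like(W)
    for t in range(T):
        buf[t % b] = W[t]             # in practice, w_t is sampled here
        for i in range(min(b, t + 1)):
            Z[t] += c[i] * buf[(t - i) % b]
    return Z

# Hypothetical usage: 12 steps, 5-dim "model", band width 3.
rng = np.random.default_rng(0)
W = rng.normal(size=(12, 5))
c = 0.5 ** np.arange(3)
Z = banded_correlated_noise(W, c)
```

The streaming output matches multiplying `W` by the dense banded lower-triangular Toeplitz matrix $C$, but without ever materializing the $T \times T$ matrix, which is the key to scaling in the number of iterations.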