Scaling up the Banded Matrix Factorization Mechanism for Differentially Private ML

📅 2024-05-24
🏛️ arXiv.org
📈 Citations: 6
Influential: 2
📄 PDF
🤖 AI Summary
To address the memory and computational intractability of differentially private banded matrix factorization (DP-BandMF) in large-scale settings—where parameters exceed 10⁷ and iterations surpass 10⁴—this paper proposes the first DP-BandMF variant with *unbounded scalability* in both parameter dimensionality and training length. Methodologically, we integrate block-wise low-rank approximation, iterative noise reuse, and sparse banded structure optimization to design a distributed DP-BandMF framework, accompanied by a rigorous privacy–utility trade-off analysis. Experiments on CIFAR-10 and language modeling tasks demonstrate that our approach matches the utility of standard DP-BandMF while reducing memory consumption by 92% and accelerating training by 8.3×. Crucially, it supports arbitrarily large model sizes and training durations, thereby overcoming the fundamental complexity bottlenecks inherent in prior DP-BandMF implementations.

📝 Abstract
Correlated noise mechanisms such as DP Matrix Factorization (DP-MF) have proven to be effective alternatives to DP-SGD in large-epsilon few-epoch training regimes. Significant work has been done to find the best correlated noise strategies, and the current state-of-the-art approach is DP-BandMF, which optimally balances the benefits of privacy amplification and noise correlation. Despite its utility advantages, severe scalability limitations prevent this mechanism from handling large-scale training scenarios where the number of training iterations may exceed $10^4$ and the number of model parameters may exceed $10^7$. In this work, we present techniques to scale up DP-BandMF along these two dimensions, significantly extending its reach and enabling it to handle settings with virtually any number of model parameters and training iterations, with negligible utility degradation.
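The key idea behind the mechanism's scalability is that a banded lower-triangular strategy matrix $C$ lets the correlated noise $w = C^{-1}z$ be generated in a streaming fashion, buffering only the last $b-1$ noise vectors instead of the full $T \times d$ noise history. The sketch below is a hypothetical illustration of this streaming back-substitution (assuming a Toeplitz banded $C$, not the paper's actual implementation or its optimized coefficients):

```python
import numpy as np

def banded_correlated_noise(C_band, n_iters, dim, sigma, rng):
    """Stream correlated noise w = C^{-1} z for a banded lower-triangular
    Toeplitz strategy matrix C (illustrative sketch, not the paper's code).

    C_band: length-b coefficients of C; C_band[0] is the diagonal.
    Memory is O(b * dim) rather than O(n_iters * dim), since only the
    last b-1 correlated noise vectors are buffered.
    """
    b = len(C_band)
    buf = []  # last b-1 correlated noise vectors, oldest first
    for _ in range(n_iters):
        z_t = sigma * rng.standard_normal(dim)  # fresh i.i.d. Gaussian noise
        acc = z_t.copy()
        # back-substitution: subtract contributions of the b-1 previous w's
        for j, w_prev in enumerate(reversed(buf), start=1):
            acc -= C_band[j] * w_prev
        w_t = acc / C_band[0]
        buf.append(w_t)
        if len(buf) >= b:  # keep at most b-1 past vectors
            buf.pop(0)
        yield w_t  # add w_t to the clipped gradient at iteration t
```

In training, `w_t` would be added to the clipped average gradient at step $t$; the banded structure is what makes the per-step cost independent of the total number of iterations.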
Problem

Research questions and friction points this paper is trying to address.

Scaling DP-BandMF for large model parameters
Extending DP-BandMF for high training iterations
Maintaining utility in large-scale private ML
Innovation

Methods, ideas, or system contributions that make the work stand out.

Scales DP-BandMF for large model parameters
Extends DP-BandMF for high training iterations
Maintains utility with negligible degradation