🤖 AI Summary
This work investigates how noise, particularly differential privacy (DP) noise, affects low-rank approximation in the spectral norm. Frobenius-norm error and reconstruction-quality metrics can over- or under-estimate subspace distortion, whereas the spectral norm captures worst-case directional error and provides the strongest utility guarantees. We derive high-probability spectral-norm perturbation bounds for symmetric matrices that refine the classical Eckart–Young–Mirsky theorem, improving the error bound by up to a factor of √n under mild eigengap and norm conditions. The analysis relies on a novel contour bootstrapping method from complex analysis and extends to a broad class of spectral functionals, including polynomials and matrix exponentials. As an application, the bounds yield improved utility guarantees for differentially private PCA, resolving an open problem in the literature. Experiments on real-world datasets confirm that the bounds closely track the actual spectral error under diverse perturbation regimes.
📝 Abstract
A central challenge in machine learning is to understand how noise or measurement errors affect low-rank approximations, particularly in the spectral norm. This question is especially important in differentially private low-rank approximation, where one aims to preserve the top-$p$ structure of a data-derived matrix while ensuring privacy. Prior work often analyzes Frobenius norm error or changes in reconstruction quality, but these metrics can over- or under-estimate true subspace distortion. The spectral norm, by contrast, captures worst-case directional error and provides the strongest utility guarantees. We establish new high-probability spectral-norm perturbation bounds for symmetric matrices that refine the classical Eckart--Young--Mirsky theorem and explicitly capture interactions between a matrix $A \in \mathbb{R}^{n \times n}$ and an arbitrary symmetric perturbation $E$. Under mild eigengap and norm conditions, our bounds yield sharp estimates for $\|(A + E)_p - A_p\|$, where $A_p$ is the best rank-$p$ approximation of $A$, with improvements of up to a factor of $\sqrt{n}$. As an application, we derive improved utility guarantees for differentially private PCA, resolving an open problem in the literature. Our analysis relies on a novel contour bootstrapping method from complex analysis and extends it to a broad class of spectral functionals, including polynomials and matrix exponentials. Empirical results on real-world datasets confirm that our bounds closely track the actual spectral error under diverse perturbation regimes.
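To make the central quantity concrete, the following is a minimal NumPy sketch that computes $\|(A + E)_p - A_p\|$ in both the spectral and Frobenius norms. It is not the paper's experimental setup: the helper `best_rank_p`, the test matrix (positive semidefinite with three dominant eigenvalues), the dimensions, and the noise scale are all assumptions chosen for illustration, and the Gaussian perturbation is not a calibrated differential privacy mechanism.

```python
import numpy as np

def best_rank_p(M, p):
    """Rank-p truncation of a symmetric matrix M (hypothetical helper).

    Keeps the p eigenvalues of largest magnitude, which for symmetric M
    gives a best rank-p approximation in the spectral and Frobenius norms.
    """
    w, V = np.linalg.eigh(M)            # eigenvalues ascending, orthonormal eigenvectors
    idx = np.argsort(-np.abs(w))[:p]    # indices of the p largest |eigenvalues|
    Vp = V[:, idx]
    return (Vp * w[idx]) @ Vp.T

rng = np.random.default_rng(0)
n, p = 200, 3

# Assumed test matrix: PSD, three dominant eigenvalues, clear eigengap after the third.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate([[100.0, 80.0, 60.0], np.full(n - p, 1.0)])
A = (Q * eigs) @ Q.T

# Symmetric Gaussian perturbation (illustrative only; not a calibrated DP mechanism).
G = rng.standard_normal((n, n))
E = 0.5 * (G + G.T) / np.sqrt(2)

A_p = best_rank_p(A, p)
AE_p = best_rank_p(A + E, p)
diff = AE_p - A_p

spec_err = np.linalg.norm(diff, 2)       # spectral norm: largest singular value
frob_err = np.linalg.norm(diff, "fro")   # Frobenius norm
print(f"||E||_2         = {np.linalg.norm(E, 2):.3f}")
print(f"spectral error  = {spec_err:.3f}")
print(f"Frobenius error = {frob_err:.3f}")
```

Because the difference of two rank-$p$ matrices has rank at most $2p$, the Frobenius error here always lies between the spectral error and $\sqrt{2p}$ times the spectral error. The two norms can therefore disagree by a dimension-independent factor on this particular quantity, which is one reason to measure it directly in the spectral norm rather than infer it from a Frobenius bound.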