🤖 AI Summary
This paper addresses robust second-moment matrix estimation under differential privacy: estimating the second-moment matrix with high accuracy and strong privacy guarantees, even for worst-case inputs containing a large fraction of outliers, assuming the data satisfy the $(m,\alpha,\eta)$-subsamplability condition. We introduce subsamplability, a condition not previously explored in private second-moment estimation, and propose a recursive subsampling framework that satisfies zero-concentrated differential privacy (zCDP). Our approach integrates spectral structure-preserving analysis with random matrix perturbation techniques. Under zCDP constraints, the algorithm achieves a $(1 \pm \gamma)$-relative error bound and succeeds with high probability even when the outlier fraction is a constant, significantly outperforming existing generic private covariance estimators. This yields a substantially improved privacy-utility trade-off.
📝 Abstract
We study the problem of differentially private second moment estimation and present a new algorithm that achieves strong privacy-utility trade-offs even for worst-case inputs under subsamplability assumptions on the data. We call an input $(m,\alpha,\eta)$-subsamplable if a random subsample of size $m$ (or larger) preserves, with probability $\geq 1-\eta$, the spectral structure of the original second moment matrix up to a multiplicative factor of $1\pm\alpha$. Building upon subsamplability, we give a recursive algorithmic framework, similar to Kamath et al. (2019), that satisfies zero-Concentrated Differential Privacy (zCDP) while preserving, with high probability, the accuracy of the second moment estimate up to an arbitrary factor of $(1\pm\gamma)$. We then show how to apply our algorithm to approximate the second moment matrix of a distribution $\mathcal{D}$, even when a noticeable fraction of the input points are outliers.
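
To make the subsamplability definition concrete, the following is a minimal, non-private sketch that estimates empirically how often a random subsample of size $m$ fails to preserve the spectral structure of the full second moment matrix. It is not the paper's algorithm. It assumes one plausible formalization, $(1-\alpha)\Sigma \preceq \Sigma_S \preceq (1+\alpha)\Sigma$, where $\Sigma$ is the second moment matrix of the full input and $\Sigma_S$ that of the subsample; it also assumes subsampling without replacement and a positive definite $\Sigma$. The function names `second_moment`, `spectral_distortion`, and `estimate_failure_rate` are hypothetical.

```python
import numpy as np


def second_moment(X):
    """Empirical second moment matrix (1/n) * sum_i x_i x_i^T for the rows x_i of X."""
    return X.T @ X / X.shape[0]


def spectral_distortion(sigma, sigma_sub):
    """Smallest alpha with (1 - alpha) * sigma <= sigma_sub <= (1 + alpha) * sigma
    in the positive semidefinite order. Assumes sigma is positive definite."""
    evals, evecs = np.linalg.eigh(sigma)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    # Whitening: the condition holds iff all eigenvalues of
    # sigma^{-1/2} sigma_sub sigma^{-1/2} lie in [1 - alpha, 1 + alpha].
    whitened = inv_sqrt @ sigma_sub @ inv_sqrt
    whitened = (whitened + whitened.T) / 2  # symmetrize against round-off
    w = np.linalg.eigvalsh(whitened)        # eigenvalues in ascending order
    return max(abs(w[0] - 1.0), abs(w[-1] - 1.0))


def estimate_failure_rate(X, m, alpha, trials=200, seed=0):
    """Monte Carlo estimate of the failure probability (the role played by eta)
    for subsamples of size m drawn without replacement (an assumption here)."""
    rng = np.random.default_rng(seed)
    sigma = second_moment(X)
    failures = 0
    for _ in range(trials):
        idx = rng.choice(X.shape[0], size=m, replace=False)
        if spectral_distortion(sigma, second_moment(X[idx])) > alpha:
            failures += 1
    return failures / trials


# Illustrative synthetic input: 5000 Gaussian points in 3 dimensions.
X = np.random.default_rng(1).standard_normal((5000, 3))
eta_hat = estimate_failure_rate(X, m=2000, alpha=0.5)
print(f"estimated failure rate: {eta_hat:.3f}")
```

An input would be consistent with $(m,\alpha,\eta)$-subsamplability under this formalization when the estimated failure rate is at most $\eta$. Note that this check reads the data directly and offers no privacy guarantee; it only illustrates the assumption that the private algorithm relies on.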