A Private Approximation of the 2nd-Moment Matrix of Any Subsamplable Input

📅 2025-05-20

🤖 AI Summary

This paper addresses second-moment matrix estimation under differential privacy: estimating the second moment matrix with high accuracy and strong privacy guarantees, even for worst-case inputs containing a large fraction of outliers, assuming the data satisfy the $(m,\alpha,\beta)$-subsamplability condition. It brings subsamplability, previously unexplored in private second-moment estimation, to this problem and proposes a recursive subsampling framework that satisfies zero-concentrated differential privacy (zCDP). The approach combines spectral structure-preserving analysis with random matrix perturbation techniques. Under zCDP, the algorithm achieves a $(1 \pm \gamma)$ relative error bound and succeeds with high probability even when the outlier fraction is a constant, which the generic private covariance estimators it is compared against do not guarantee. The result is an improved privacy-utility trade-off.

📝 Abstract

We study the problem of differentially private second moment estimation and present a new algorithm that achieves strong privacy-utility trade-offs even for worst-case inputs under subsamplability assumptions on the data. We call an input $(m,\alpha,\beta)$-subsamplable if a random subsample of size $m$ (or larger) preserves, with probability $\geq 1-\beta$, the spectral structure of the original second moment matrix up to a multiplicative factor of $1\pm\alpha$. Building upon subsamplability, we give a recursive algorithmic framework similar to Kamath et al. (2019) that abides by zero-Concentrated Differential Privacy (zCDP) while preserving, with high probability, the accuracy of the second moment estimation up to an arbitrary factor of $(1\pm\gamma)$. We then show how to apply our algorithm to approximate the second moment matrix of a distribution $\mathcal{D}$, even when a noticeable fraction of the input are outliers.
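The subsamplability definition can be probed empirically. The sketch below is an illustration, not code from the paper. It reads "preserves the spectral structure up to $1\pm\alpha$" as the sandwich $(1-\alpha)\Sigma \preceq \hat{\Sigma} \preceq (1+\alpha)\Sigma$, where $\Sigma$ is the second moment matrix of the full input and $\hat{\Sigma}$ that of the subsample. It then estimates how often a size-$m$ subsample violates the sandwich. The function names, sampling without replacement, and the Gaussian test data are all assumptions made for the example.

```python
import numpy as np

def second_moment(X):
    """Empirical second moment matrix (1/n) * X^T X for the rows of X."""
    return X.T @ X / X.shape[0]

def spectral_ratio_bounds(sigma_sub, sigma_full):
    """Extreme eigenvalues of Sigma^{-1/2} Sigma_sub Sigma^{-1/2}.

    Assumes sigma_full is positive definite. The sandwich
    (1 - alpha) Sigma <= Sigma_sub <= (1 + alpha) Sigma holds exactly
    when both returned values lie in [1 - alpha, 1 + alpha].
    """
    evals, evecs = np.linalg.eigh(sigma_full)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    whitened = inv_sqrt @ sigma_sub @ inv_sqrt
    # Symmetrize to remove floating-point asymmetry before eigvalsh.
    w = np.linalg.eigvalsh((whitened + whitened.T) / 2)
    return w[0], w[-1]

def estimate_failure_rate(X, m, alpha, trials=200, seed=0):
    """Fraction of size-m subsamples that violate the (1 +/- alpha) sandwich.

    Sampling without replacement is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    sigma_full = second_moment(X)
    failures = 0
    for _ in range(trials):
        idx = rng.choice(X.shape[0], size=m, replace=False)
        lo, hi = spectral_ratio_bounds(second_moment(X[idx]), sigma_full)
        if lo < 1 - alpha or hi > 1 + alpha:
            failures += 1
    return failures / trials

# Hypothetical input: 5000 standard Gaussian points in 3 dimensions.
X = np.random.default_rng(1).normal(size=(5000, 3))
rate = estimate_failure_rate(X, m=2000, alpha=0.2)
print(f"empirical failure rate at m=2000, alpha=0.2: {rate:.3f}")
```

If the printed rate is at most $\beta$, that is consistent with this input being $(2000, 0.2, \beta)$-subsamplable. It is a Monte Carlo estimate rather than a proof.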

Problem

Research questions and friction points this paper is trying to address.

Differentially private second moment estimation

Preserving spectral structure under subsamplability

Handling outliers in second moment approximation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Differentially private second moment estimation algorithm

Recursive framework under zCDP for accuracy

Handles outliers in subsamplable data inputs
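For context on the zCDP requirement, the sketch below shows a generic baseline: the Gaussian mechanism applied to a clipped second moment matrix. It is not the paper's recursive algorithm. It relies on the standard fact that adding $N(0,\sigma^2)$ noise to a query with $\ell_2$ sensitivity $\Delta$ satisfies $\rho$-zCDP with $\rho = \Delta^2/(2\sigma^2)$. The clipping radius, the replace-one neighboring relation, and all function names are assumptions made for the example.

```python
import numpy as np

def clip_rows(X, radius):
    """Scale each row so that its L2 norm is at most `radius`."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.minimum(1.0, radius / np.maximum(norms, 1e-12))
    return X * scale

def private_second_moment(X, radius, rho, rng):
    """Gaussian-mechanism baseline for (1/n) X^T X under rho-zCDP.

    Assumes neighboring datasets differ by replacing one row, so n is public.
    For rows with norm <= radius, ||x x^T - y y^T||_F <= sqrt(2) * radius^2,
    which gives Frobenius (L2) sensitivity sqrt(2) * radius^2 / n.
    """
    n, d = X.shape
    Xc = clip_rows(X, radius)
    sigma_hat = Xc.T @ Xc / n
    sensitivity = np.sqrt(2.0) * radius ** 2 / n
    noise_std = sensitivity / np.sqrt(2.0 * rho)  # rho = Delta^2 / (2 sigma^2)
    noisy = sigma_hat + rng.normal(scale=noise_std, size=(d, d))
    # Symmetrizing is post-processing and does not affect the privacy guarantee.
    return (noisy + noisy.T) / 2

# Hypothetical usage on standard Gaussian data.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
estimate = private_second_moment(X, radius=4.0, rho=1.0, rng=rng)
```

The noise here is additive and scales with the clipping radius, so this baseline gives no $(1\pm\gamma)$ relative guarantee in directions where the second moment matrix has small eigenvalues. The abstract's relative-error framing targets that gap.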
