High-dimensional estimation with missing data: Statistical and computational limits

📅 2026-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
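Studies parameter estimation for high-dimensional data with missing observations. Through an analysis of statistical and computational complexity, it shows that mean and covariance estimation exhibit statistical-computational gaps, whereas linear regression can approach the information-theoretic lower bound with an efficient algorithm.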
📝 Abstract
We consider computationally-efficient estimation of population parameters when observations are subject to missing data. In particular, we consider estimation under the realizable contamination model of missing data in which an $ε$ fraction of the observations are subject to an arbitrary (and unknown) missing not at random (MNAR) mechanism. When the true data is Gaussian, we provide evidence towards statistical-computational gaps in several problems. For mean estimation in $\ell_2$ norm, we show that in order to obtain error at most $ρ$, for any constant contamination $ε\in (0, 1)$, (roughly) $n \gtrsim d e^{1/ρ^2}$ samples are necessary and that there is a computationally-inefficient algorithm which achieves this error. On the other hand, we show that any computationally-efficient method within certain popular families of algorithms requires a much larger sample complexity of (roughly) $n \gtrsim d^{1/ρ^2}$ and that there exists a polynomial time algorithm based on sum-of-squares which (nearly) achieves this lower bound. For covariance estimation in relative operator norm, we show that a parallel development holds. Finally, we turn to linear regression with missing observations and show that such a gap does not persist. Indeed, in this setting we show that minimizing a simple, strongly convex empirical risk nearly achieves the information-theoretic lower bound in polynomial time.
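To get a feel for the size of the gap in mean estimation, the two rough rates quoted in the abstract can be compared numerically. The sketch below is only illustrative: it drops all constants, logarithmic factors, and the dependence on $ε$ (none of which the abstract specifies), and the function names are made up for this example. Because of the dropped constants, only the trend for small $ρ$ is meaningful.

```python
import math


def info_theoretic_samples(d: int, rho: float) -> float:
    """Rough statistical rate n ~ d * exp(1 / rho^2); constants omitted (assumption)."""
    return d * math.exp(1.0 / rho**2)


def efficient_samples(d: int, rho: float) -> float:
    """Rough rate for efficient methods n ~ d^(1 / rho^2); constants omitted (assumption)."""
    return float(d) ** (1.0 / rho**2)


def gap_ratio(d: int, rho: float) -> float:
    """How many times more samples the efficient rate asks for than the statistical rate."""
    return efficient_samples(d, rho) / info_theoretic_samples(d, rho)


if __name__ == "__main__":
    d = 100  # illustrative dimension
    for rho in (0.7, 0.5, 0.4):
        print(
            f"rho={rho}: statistical ~ {info_theoretic_samples(d, rho):.3g}, "
            f"efficient ~ {efficient_samples(d, rho):.3g}, "
            f"ratio ~ {gap_ratio(d, rho):.3g}"
        )
```

As the target error $ρ$ shrinks, the exponent $1/ρ^2$ grows, so the rate for efficient methods becomes polynomial in $d$ of increasingly high degree while the statistical rate stays linear in $d$.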
Problem

Research questions and friction points this paper is trying to address.

high-dimensional estimation
missing data
statistical-computational gap
MNAR
parameter estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

statistical-computational gap
missing not at random (MNAR)
sum-of-squares algorithm
high-dimensional estimation
realizable contamination model