🤖 AI Summary
This paper studies mean estimation over ℓₚ balls under additive Gaussian noise, focusing on whether the maximum likelihood estimator (MLE), which can be viewed as a nonlinear shrinkage procedure, is minimax rate-optimal. Through minimax analysis and constructive lower bound techniques, it shows that for p strictly between 1 and 2, the MLE is rate-suboptimal for essentially all noise levels and constraint radii at which nonlinear estimators are required, incurring polynomially larger risk than the minimax rate; with n i.i.d. Gaussian samples, this gap can be polynomial in the sample size. In contrast, for p near 1 or p at least 2, the MLE is rate-optimal for all noise levels and radii. The work characterizes the boundary between optimality and suboptimality in terms of p, the noise level, the constraint radius, and the dimension, challenging the intuition that the MLE is automatically rate-optimal and that nonlinearity alone suffices for optimality. Whenever the MLE is suboptimal, the lower bounds are constructive: explicit instances are given on which the MLE provably incurs suboptimal risk.
📝 Abstract
We revisit the problem of mean estimation on $\ell_p$ balls under additive Gaussian noise. When $p$ is strictly less than $2$, it is well understood that rate-optimal estimators must be nonlinear in the observations. In this work, we study the maximum likelihood estimator (MLE), which may be viewed as a nonlinear shrinkage procedure for mean estimation over $\ell_p$ balls. We demonstrate two phenomena for the behavior of the MLE, which depend on the noise level, the radius of the norm constraint, the dimension, and the norm index $p$. First, as a function of the dimension, for $p$ near $1$ or at least $2$, the MLE is minimax rate-optimal for all noise levels and all constraint radii. On the other hand, for $p$ strictly between $1$ and $2$, the behavior is more striking: for essentially all noise levels and radii for which nonlinear estimates are required, the MLE is minimax rate-suboptimal, despite being nonlinear in the observations. Our results also imply similar conclusions when given $n$ independent and identically distributed Gaussian samples, where we demonstrate that the MLE can be suboptimal by a polynomial factor in the sample size. Our lower bounds are constructive: whenever the MLE is rate-suboptimal, we provide explicit instances on which the MLE provably incurs suboptimal risk.
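Below is a minimal, illustrative sketch (not taken from the paper) of the setup the abstract describes: an observation $y = \theta^* + \sigma z$ with $\|\theta^*\|_p \le R$, and the constrained MLE computed as the Euclidean projection of $y$ onto the $\ell_p$ ball via a generic convex solver. The function name `mle_lp_ball` and the use of SciPy's SLSQP solver are assumptions made here for illustration only.

```python
import numpy as np
from scipy.optimize import minimize


def mle_lp_ball(y, p, radius):
    """Constrained MLE over the l_p ball: argmin_{||theta||_p <= radius} ||y - theta||_2^2.

    Equivalently, the Euclidean projection of y onto the l_p ball (convex for p >= 1).
    Solved here with a generic solver (SLSQP); an illustrative sketch, not the paper's code.
    """
    # Smooth reformulation of the constraint: radius^p - sum_i |theta_i|^p >= 0.
    cons = {"type": "ineq",
            "fun": lambda theta: radius ** p - np.sum(np.abs(theta) ** p)}
    # Start from a feasible point strictly inside the ball.
    x0 = y * min(1.0, 0.5 * radius / np.linalg.norm(y, ord=p))
    res = minimize(lambda theta: np.sum((y - theta) ** 2),
                   x0=x0, constraints=[cons], method="SLSQP")
    return res.x


# Observation model from the abstract: y = theta* + sigma * z, z ~ N(0, I_d),
# with theta* lying in the l_p ball of the given radius.
rng = np.random.default_rng(0)
d, p, radius, sigma = 50, 1.5, 1.0, 0.1
theta_star = radius * rng.dirichlet(np.ones(d))  # ||theta*||_1 = radius, hence ||theta*||_p <= radius for p >= 1
y = theta_star + sigma * rng.standard_normal(d)

theta_hat = mle_lp_ball(y, p, radius)
print("squared-error loss on this draw:", np.sum((theta_hat - theta_star) ** 2))
```

Per the abstract, for $p$ strictly between $1$ and $2$ this projection-type estimator can incur rate-suboptimal risk on particular instances, even though it is nonlinear in the observations.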