🤖 AI Summary
This work studies high-accuracy list-decodable mean estimation: given a dataset in which an α-fraction of points are samples from an identity-covariance Gaussian and the remaining (1−α)-fraction are arbitrary outliers, can estimation accuracy be improved by allowing a somewhat larger list of candidates? Prior algorithms achieve optimal list size in terms of α, but all known algorithms incur ℓ₂ error that grows with 1/α, which is insufficient for high-accuracy applications. We establish, for the first time, an explicit trade-off between list size and estimation error ε. We prove that a list of size L = exp(O(log²(1/α)/ε²)) suffices to contain a mean estimate with ℓ₂ error at most ε. Furthermore, we design an algorithm that outputs such a list with runtime and sample complexity d^{O(log L)} + exp(exp(Õ(log L))), via a completely novel proof of identifiability and a new algorithmic way of leveraging it that avoids the sum-of-squares hierarchy.
📝 Abstract
In list-decodable learning, we are given a set of data points such that an $\alpha$-fraction of these points come from a nice distribution $D$, for some small $\alpha \ll 1$, and the goal is to output a short list of candidate solutions, such that at least one element of this list recovers some non-trivial information about $D$. By now, there is a large body of work on this topic; however, while many algorithms can achieve optimal list size in terms of $\alpha$, all known algorithms must incur error which grows, in some cases quite quickly, with $1/\alpha$. In this paper, we ask if this is inherent: is it possible to trade off list size with accuracy in list-decodable learning? More formally, given $\varepsilon > 0$, can we output a slightly larger list in terms of $\alpha$ and $\varepsilon$, but so that one element of this list has error at most $\varepsilon$ with respect to the ground truth? We call this problem high-accuracy list-decodable learning. Our main result is that non-trivial high-accuracy guarantees, both information-theoretically and algorithmically, are possible for the canonical setting of list-decodable mean estimation of identity-covariance Gaussians. Specifically, we demonstrate that there exists a list of candidate means of size at most $L = \exp\left( O\left( \frac{\log^2 (1/\alpha)}{\varepsilon^2} \right) \right)$ so that one of the elements of this list has $\ell_2$ distance at most $\varepsilon$ to the true mean. We also design an algorithm that outputs such a list with runtime and sample complexity $n = d^{O(\log L)} + \exp \exp\left(\widetilde{O}(\log L)\right)$. We do so by demonstrating a completely novel proof of identifiability, as well as a new algorithmic way of leveraging this proof without the sum-of-squares hierarchy, which may be of independent technical interest.
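To make the list-decodable setting concrete, here is a minimal one-dimensional toy sketch in Python. It is emphatically *not* the paper's algorithm: an $\alpha$-fraction of points is Gaussian around an unknown mean, the rest are adversarial clusters, and a naive histogram outputs one candidate mean per sufficiently dense bucket. The bucket width, density threshold, and outlier placements are illustrative assumptions; the point is only that no single estimate can be trusted, but a short list can contain one good candidate.

```python
import random

def list_decodable_means(points, alpha, width=2.0):
    """Toy list decoding via histogram buckets (1-D illustration only).

    Any bucket holding at least alpha*n/2 points contributes one
    candidate mean. The true inlier cluster is dense enough to fill
    some bucket, so one candidate lands near the true mean, while
    adversarial clusters force extra (wrong) candidates into the list.
    Hypothetical sketch of the setting, not the paper's algorithm.
    """
    n = len(points)
    buckets = {}
    for x in points:
        buckets.setdefault(int(x // width), []).append(x)
    return [sum(b) / len(b) for b in buckets.values()
            if len(b) >= alpha * n / 2]

random.seed(0)
alpha, n, true_mean = 0.1, 2000, 7.0
inliers = [random.gauss(true_mean, 1.0) for _ in range(int(alpha * n))]
# Adversarial outliers: several fake clusters far from the true mean.
outliers = [random.gauss(c, 1.0) for c in (-40, -15, 25, 60)
            for _ in range(int((1 - alpha) * n / 4))]
candidates = list_decodable_means(inliers + outliers, alpha)
best_err = min(abs(m - true_mean) for m in candidates)
print(f"list size: {len(candidates)}, best error: {best_err:.2f}")
```

The adversary can always plant fake clusters that look exactly like the inliers, which is why the output must be a list rather than a single estimate; the guarantee is only that *some* element of the list is accurate.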