Sample Complexity Bounds for Robust Mean Estimation with Mean-Shift Contamination

📅 2026-02-25
🤖 AI Summary
This work addresses robust mean estimation under the mean-shift contamination model, in which an adversary replaces a small constant fraction of the clean samples with samples drawn from arbitrarily shifted versions of the base distribution. For general, potentially multivariate, base distributions whose characteristic functions satisfy mild spectral conditions, the authors give a sample-efficient algorithm that estimates the target mean to any desired accuracy. The central analytical tool is the notion of a “Fourier witness”, which underlies both the upper and the lower bound. The sample complexity lower bound qualitatively matches the upper bound, essentially resolving the open question of the sample complexity of this task for general base distributions.

📝 Abstract
We study the basic task of mean estimation in the presence of mean-shift contamination. In the mean-shift contamination model, an adversary is allowed to replace a small constant fraction of the clean samples by samples drawn from arbitrarily shifted versions of the base distribution. Prior work characterized the sample complexity of this task for the special cases of the Gaussian and Laplace distributions. Specifically, it was shown that consistent estimation is possible in these cases, a property that is provably impossible in Huber's contamination model. An open question posed in earlier work was to determine the sample complexity of mean estimation in the mean-shift contamination model for general base distributions. In this work, we study and essentially resolve this open question. Specifically, we show that, under mild spectral conditions on the characteristic function of the (potentially multivariate) base distribution, there exists a sample-efficient algorithm that estimates the target mean to any desired accuracy. We complement our upper bound with a qualitatively matching sample complexity lower bound. Our techniques make critical use of Fourier analysis, and in particular introduce the notion of a Fourier witness as an essential ingredient of our upper and lower bounds.
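To make the contamination model concrete, the following is a minimal simulation sketch, not the paper's algorithm. It assumes a one-dimensional standard Gaussian base distribution, and the function name `mean_shift_contaminated_sample`, the contamination fraction `eps`, and the list of adversarial `shifts` are all hypothetical choices made for illustration. It only shows why the plain empirical mean is unreliable here: the shifted samples pull it away from the target mean by roughly `eps` times the average shift.

```python
import random
import statistics


def mean_shift_contaminated_sample(n, mu, eps, shifts, seed=0):
    """Draw n points under an illustrative mean-shift contamination model.

    Clean points come from N(mu, 1). A fraction eps of the points instead
    come from N(mu + s, 1), i.e. the same base distribution shifted by s.
    Assumption: the adversary picks each s from the fixed list `shifts`.
    """
    rng = random.Random(seed)
    n_bad = int(eps * n)
    # Clean samples centered at the target mean mu.
    sample = [rng.gauss(mu, 1.0) for _ in range(n - n_bad)]
    # Contaminated samples: base distribution recentered at mu + s.
    sample += [rng.gauss(mu + rng.choice(shifts), 1.0) for _ in range(n_bad)]
    rng.shuffle(sample)
    return sample


# Target mean 0, 10% of the samples shifted by 5 or 8.
sample = mean_shift_contaminated_sample(n=20000, mu=0.0, eps=0.1, shifts=[5.0, 8.0])

# The naive empirical mean absorbs the shifts and lands far from mu = 0.
naive = statistics.fmean(sample)
print(f"naive empirical mean: {naive:.3f}")
```

The estimator described in the abstract works with the characteristic function of the base distribution rather than with sample averages; that part is not reproduced here.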
Problem

Research questions and friction points this paper is trying to address.

mean estimation
mean-shift contamination
sample complexity
robust statistics
characteristic function
Innovation

Methods, ideas, or system contributions that make the work stand out.

mean-shift contamination
sample complexity
Fourier witness
robust mean estimation
characteristic function