AI Summary
This work addresses robust mean estimation under the mean-shift contamination model, in which an adversary replaces a small constant fraction of the clean samples with samples drawn from arbitrarily shifted versions of the base distribution. For general (potentially multivariate) base distributions whose characteristic functions satisfy mild spectral conditions, the authors give a sample-efficient Fourier-analytic algorithm that estimates the mean to any desired accuracy. The central analytical tool is a "Fourier witness", a notion the paper introduces and uses in both its upper and lower bounds. The upper bound is complemented by a qualitatively matching sample complexity lower bound, which essentially resolves the open question of the sample complexity of this task for general base distributions.
Abstract
We study the basic task of mean estimation in the presence of mean-shift contamination. In the mean-shift contamination model, an adversary is allowed to replace a small constant fraction of the clean samples by samples drawn from arbitrarily shifted versions of the base distribution. Prior work characterized the sample complexity of this task for the special cases of the Gaussian and Laplace distributions. Specifically, it was shown that consistent estimation is possible in these cases, a property that is provably impossible in Huber's contamination model. An open question posed in earlier work was to determine the sample complexity of mean estimation in the mean-shift contamination model for general base distributions. In this work, we study and essentially resolve this open question. Specifically, we show that, under mild spectral conditions on the characteristic function of the (potentially multivariate) base distribution, there exists a sample-efficient algorithm that estimates the target mean to any desired accuracy. We complement our upper bound with a qualitatively matching sample complexity lower bound. Our techniques make critical use of Fourier analysis, and in particular introduce the notion of a Fourier witness as an essential ingredient of our upper and lower bounds.
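To make the contamination model concrete, the following is a minimal simulation sketch, not the paper's algorithm. It assumes a one-dimensional standard Gaussian base distribution, contaminates each sample independently with probability `alpha` (a simplification of the adversarial replacement described above), and uses a fixed shift of 5.0. The helper `sample_mean_shift` and all parameter values are hypothetical and chosen only for illustration. It shows that the naive sample mean and the median are both biased under this model, which is why a specialized estimator is needed.

```python
import random
import statistics


def sample_mean_shift(n, mu, alpha, shift_fn, base_sampler, rng):
    """Draw n points of the form mu + noise.

    With probability alpha (an assumed i.i.d. simplification of the
    adversary), an extra shift chosen by shift_fn is added, so the
    contaminated point comes from a shifted copy of the base distribution.
    """
    out = []
    for _ in range(n):
        x = mu + base_sampler(rng)
        if rng.random() < alpha:
            x += shift_fn(rng)
        out.append(x)
    return out


rng = random.Random(0)  # fixed seed for reproducibility
mu, alpha = 1.0, 0.1    # illustrative target mean and contamination rate

samples = sample_mean_shift(
    n=20000,
    mu=mu,
    alpha=alpha,
    shift_fn=lambda r: 5.0,                    # assumed constant shift
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # assumed N(0, 1) base noise
    rng=rng,
)

naive = statistics.fmean(samples)
median = statistics.median(samples)

# The sample mean is off by roughly alpha * shift; the median is pulled
# toward the shifted points as well, though by a smaller amount.
print("sample mean error:", naive - mu)
print("median error:     ", median - mu)
```

Under these assumptions the error of the sample mean concentrates around `alpha * 5.0` no matter how many samples are drawn, so more data alone does not fix the naive estimators.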