On the Optimality of the Median-of-Means Estimator under Adversarial Contamination

📅 2025-10-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper investigates the robustness and minimax optimality of the Median-of-Means (MoM) estimator under adversarial contamination, focusing on non-Gaussian settings: distributions with finite variance and heavy-tailed distributions with finite absolute $(1+r)$-th moment ($r \in (0,1]$) but infinite variance. Using probabilistic analysis and statistical learning techniques, we establish, for the first time, that MoM achieves the minimax optimal convergence rate in both regimes, with matching upper and lower bounds on the estimation error. In contrast, we show that MoM is suboptimal in light-tailed settings (e.g., sub-Gaussian distributions), where its rate degrades relative to the minimax benchmark. Our analysis systematically delineates the applicability boundary of MoM, revealing its intrinsic “heavy-tail optimal, light-tail suboptimal” trade-off. These results provide foundational theoretical support for robust estimation and establish distribution-dependent performance baselines critical for method selection and design.

📝 Abstract

The Median-of-Means (MoM) is a robust estimator widely used in machine learning that is known to be (minimax) optimal in scenarios where samples are i.i.d. In more severe scenarios, samples are contaminated by an adversary that can inspect and modify the data. Previous work has theoretically shown the suitability of the MoM estimator in certain contaminated settings. However, the (minimax) optimality of MoM and its limitations under adversarial contamination remain unknown beyond the Gaussian case. In this paper, we present upper and lower bounds for the error of MoM under adversarial contamination for multiple classes of distributions. In particular, we show that MoM is (minimax) optimal in the class of distributions with finite variance, as well as in the class of distributions with infinite variance and finite absolute $(1+r)$-th moment. We also provide lower bounds for MoM's error that match the order of the presented upper bounds, and show that MoM is sub-optimal for light-tailed distributions.
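
For readers unfamiliar with the estimator, the following is a minimal sketch of the standard MoM construction: split the samples into blocks, average each block, and take the median of the block means. It is not taken from the paper. The function name `median_of_means`, the random shuffling step, the choice of 100 blocks, and the toy contamination (20 samples overwritten with a large value) are all assumptions made for illustration.

```python
import random
import statistics


def median_of_means(samples, num_blocks, seed=0):
    """Illustrative MoM estimate of the mean (hypothetical helper, not the paper's code)."""
    if not 1 <= num_blocks <= len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    # Shuffle a copy so the block assignment does not depend on input order.
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Equal-size blocks; leftover samples beyond num_blocks * block_size are dropped.
    block_size = len(shuffled) // num_blocks
    block_means = [
        statistics.fmean(shuffled[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Toy contamination model (an assumption): 20 of 1000 standard normal
# samples are replaced by a large outlier value.
rng = random.Random(1)
clean = [rng.gauss(0.0, 1.0) for _ in range(1000)]
contaminated = list(clean)
for i in range(20):
    contaminated[i] = 1e6

mom_estimate = median_of_means(contaminated, num_blocks=100)
mean_estimate = statistics.fmean(contaminated)
print(f"MoM estimate:   {mom_estimate:.3f}")
print(f"Empirical mean: {mean_estimate:.3f}")
```

In this sketch each outlier can corrupt at most one block, so with 100 blocks and 20 outliers at least 80 block means are computed from clean samples only, and the median stays near the true mean of 0 while the empirical mean is dragged far away. How the number of blocks should scale with the contamination level and the tail behavior is the subject of the paper's bounds and is not reproduced here.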

Problem

Research questions and friction points this paper is trying to address.

Establishes minimax optimality of Median-of-Means under adversarial contamination

Analyzes error bounds for distributions with finite and infinite variance

Demonstrates sub-optimal performance in light-tailed distribution scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

MoM is minimax optimal for distributions with finite variance

MoM is minimax optimal for distributions with infinite variance and finite absolute $(1+r)$-th moment

MoM is sub-optimal for light-tailed distribution classes
