🤖 AI Summary
This paper resolves the long-standing open problem of establishing minimax-optimal convergence rates for high-dimensional robust stochastic convex optimization (SCO) under the ε-contamination model.
Method: We propose a novel algorithmic framework that integrates a new robust estimator design, explicit exploitation of population-level smoothness (requiring neither Lipschitz continuity nor smoothness of individual sample functions), and convolution-based smoothing for nonsmooth risks, complemented by an information-theoretic lower bound construction.
Results: Assuming only smoothness of the population loss, we achieve minimax-optimal excess risk up to logarithmic factors, establishing a tight information-theoretic lower bound together with a matching upper bound. The algorithms further handle unknown covariance and extend to nonsmooth population risks. Our method improves on existing approaches in both theoretical guarantees and practical applicability, providing the first unified, optimal, and practically implementable framework for high-dimensional learning under structured outliers.
📝 Abstract
Machine learning algorithms in high-dimensional settings are highly susceptible to the influence of even a small fraction of structured outliers, making robust optimization techniques essential. In particular, within the $\epsilon$-contamination model, where an adversary can inspect and replace up to an $\epsilon$-fraction of the samples, a fundamental open problem is determining the optimal rates for robust stochastic convex optimization (SCO) under such contamination. We develop novel algorithms that achieve minimax-optimal excess risk (up to logarithmic factors) under the $\epsilon$-contamination model. Our approach improves over existing algorithms, which are not only suboptimal but also require stringent assumptions, including Lipschitz continuity and smoothness of individual sample functions. By contrast, our optimal algorithms do not require these stringent assumptions, assuming only population-level smoothness of the loss. Moreover, our algorithms can be adapted to handle the case in which the covariance parameter is unknown, and can be extended to nonsmooth population risks via convolutional smoothing. We complement our algorithmic developments with a tight information-theoretic lower bound for robust SCO.