Optimal Rates for Robust Stochastic Convex Optimization

📅 2024-12-15
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper resolves the long-standing open problem of establishing minimax-optimal convergence rates for high-dimensional robust stochastic convex optimization (SCO) under the ε-contamination model. Method: We propose a novel algorithmic framework that integrates a new robust estimator design, explicit exploitation of population-level smoothness (without requiring Lipschitz continuity or smoothness of individual sample functions), and convolution-based smoothing for nonsmooth risks, complemented by an information-theoretic lower-bound construction. Results: Assuming only smoothness of the population loss, we achieve minimax-optimal excess risk up to logarithmic factors; the algorithms also adapt to an unknown covariance parameter and extend to nonsmooth population risks. We establish a tight information-theoretic lower bound and match it with a corresponding upper bound. Our method improves on existing approaches, which are both suboptimal and reliant on stringent per-sample assumptions, providing the first unified, optimal, and practically implementable theoretical framework for high-dimensional learning under structured outliers.
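For reference, the objects named above can be pinned down with standard notation (ours, not quoted from the paper). In the ε-contamination model considered here, an adversary inspects the n i.i.d. samples drawn from the true distribution P and may replace up to an ε-fraction of them arbitrarily; the learner must then minimize the population risk from the contaminated data:

```latex
% Population risk over a convex domain \mathcal{X}, with per-sample loss f(x, z):
\[
  F(x) \;=\; \mathbb{E}_{z \sim P}\bigl[ f(x, z) \bigr].
\]
% The output \hat{x} is judged by its excess risk, which the paper bounds
% at the minimax-optimal level up to logarithmic factors:
\[
  F(\hat{x}) \;-\; \min_{x \in \mathcal{X}} F(x).
\]
```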

📝 Abstract
Machine learning algorithms in high-dimensional settings are highly susceptible to the influence of even a small fraction of structured outliers, making robust optimization techniques essential. In particular, within the $\epsilon$-contamination model, where an adversary can inspect and replace up to an $\epsilon$-fraction of the samples, a fundamental open problem is determining the optimal rates for robust stochastic convex optimization (SCO) under such contamination. We develop novel algorithms that achieve minimax-optimal excess risk (up to logarithmic factors) under the $\epsilon$-contamination model. Our approach improves over existing algorithms, which are not only suboptimal but also require stringent assumptions, including Lipschitz continuity and smoothness of individual sample functions. By contrast, our optimal algorithms do not require these stringent assumptions, assuming only population-level smoothness of the loss. Moreover, our algorithms can be adapted to handle the case in which the covariance parameter is unknown, and can be extended to nonsmooth population risks via convolutional smoothing. We complement our algorithmic developments with a tight information-theoretic lower bound for robust SCO.
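As a concrete (hypothetical) illustration of the generic recipe the abstract alludes to, namely replacing the empirical average of per-sample gradients with a robust aggregate inside a first-order method, here is a minimal Python sketch. The coordinate-wise trimmed mean below is a deliberately simple stand-in; the paper's estimator, step sizes, and guarantees are different.

```python
import numpy as np

def trimmed_mean(grads, eps):
    """Coordinate-wise trimmed mean: in each coordinate, drop the largest and
    smallest eps-fraction of values, then average the rest. A simple robust
    aggregator standing in for the paper's (more refined) estimator."""
    n = grads.shape[0]
    k = int(np.ceil(eps * n))
    sorted_grads = np.sort(grads, axis=0)       # sort each coordinate separately
    return sorted_grads[k:n - k].mean(axis=0)   # average the middle values

def robust_gd(sample_grad, data, x0, eps, lr=0.1, steps=200):
    """Gradient descent on eps-contaminated data: each step aggregates the
    per-sample gradients robustly instead of taking a plain average."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        grads = np.stack([sample_grad(x, z) for z in data])
        x = x - lr * trimmed_mean(grads, eps)
    return x

# Toy check: robust mean estimation via f(x, z) = 0.5 * ||x - z||^2, with a
# 5% fraction of samples replaced by adversarial outliers.
rng = np.random.default_rng(0)
n, d, eps = 500, 10, 0.05
data = rng.normal(loc=1.0, size=(n, d))
data[: int(eps * n)] = 100.0                    # adversarial replacements
x_hat = robust_gd(lambda x, z: x - z, data, np.zeros(d), eps)
print(np.linalg.norm(x_hat - 1.0))              # small despite the outliers
```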
Problem

Research questions and friction points this paper is trying to address.

Achieving optimal rates for robust stochastic convex optimization under ε-contamination
Developing algorithms without stringent Lipschitz and smoothness assumptions
Handling unknown covariance and extending to nonsmooth population risks via convolutional smoothing (see the sketch after this list)
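Convolutional smoothing, mentioned in the last point above, is the standard trick of optimizing the smooth surrogate f_mu(x) = E_{u~N(0,I)}[f(x + mu*u)] in place of a nonsmooth f. A minimal sketch of the usual Monte Carlo gradient estimator for this surrogate (the constants and sampling choices here are illustrative, not the paper's):

```python
import numpy as np

def smoothed_grad(f, x, mu=0.05, m=5000, rng=None):
    """Monte Carlo gradient of the Gaussian-smoothed surrogate
    f_mu(x) = E_{u ~ N(0, I)}[f(x + mu * u)], via the standard identity
    grad f_mu(x) = (1/mu) * E[(f(x + mu * u) - f(x)) * u]."""
    rng = rng or np.random.default_rng()
    u = rng.normal(size=(m, x.shape[0]))
    vals = np.array([f(x + mu * ui) for ui in u]) - f(x)
    return (vals[:, None] * u).mean(axis=0) / mu

# Toy check on the nonsmooth loss f(x) = ||x||_1: away from the kinks, the
# smoothed gradient approximates the subgradient sign(x).
x = np.array([1.0, -2.0, 0.5])
g = smoothed_grad(lambda v: np.abs(v).sum(), x,
                  rng=np.random.default_rng(1))
print(np.round(g, 2))                           # roughly [ 1., -1.,  1.]
```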
Innovation

Methods, ideas, or system contributions that make the work stand out.

Achieves minimax-optimal excess risk
Removes need for stringent assumptions
Adapts to unknown covariance parameters
Changyu Gao
University of Wisconsin-Madison, Madison, WI, USA
Andrew Lowy
Postdoctoral Research Associate, University of Wisconsin-Madison
Differential Privacy · Trustworthy Machine Learning · Optimization · Machine Learning
Xingyu Zhou
Wayne State University, Detroit, MI, USA
Stephen J. Wright
University of Wisconsin-Madison, Madison, WI, USA