🤖 AI Summary
This paper investigates the statistical efficiency of early-stopped mirror descent (MD) for high-dimensional linear regression when the true parameter lies in a known convex body. For the overparameterized regime, we propose an early-stopped MD algorithm that uses the Minkowski functional of the convex body as the potential function, implicitly enforcing the convex constraint without explicit projection or regularization. We establish, for the first time, that for any convex body and any design matrix, the worst-case statistical risk of this method is, up to an absolute constant factor, no worse than that of the convex-body-constrained least squares estimator; early stopping therefore incurs no fundamental loss of statistical efficiency relative to explicit constraints. Our analysis combines mirror descent dynamics, convex geometry, and high-dimensional statistical theory to characterize this risk equivalence, confirming that early stopping retains the efficiency of explicit regularization even under overparameterization.
📝 Abstract
Early-stopped iterative optimization methods are widely used as alternatives to explicit regularization, and direct comparisons between early stopping and explicit regularization have been established for many optimization geometries. However, most analyses depend heavily on the specific properties of the optimization geometry or on strong convexity of the empirical objective, and it remains unclear whether early stopping could ever be less statistically efficient than explicit regularization for some particular shape constraint, especially in the overparameterized regime. To address this question, we study the setting of high-dimensional linear regression under additive Gaussian noise when the ground truth is assumed to lie in a known convex body and the task is to minimize the in-sample mean squared error. Our main result shows that for any convex body and any design matrix, up to an absolute constant factor, the worst-case risk of unconstrained early-stopped mirror descent with an appropriate potential is at most that of the least squares estimator constrained to the convex body. We achieve this by constructing algorithmic regularizers based on the Minkowski functional of the convex body.
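To make the setup concrete: the abstract gives no pseudocode, so the following is a minimal illustrative sketch, not the paper's exact algorithm. It picks one specific convex body (the probability simplex), for which the classical entropic mirror map keeps every iterate inside the body without any projection, and stops via a discrepancy-principle-style rule once the residual reaches the noise level. The synthetic data, step size `eta`, and stopping rule are all our assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy overparameterized problem: n samples, d > n features,
# ground truth in a known convex body (here: the probability simplex).
n, d, sigma = 40, 200, 0.5
X = rng.standard_normal((n, d))
beta_star = np.zeros(d)
beta_star[:5] = 0.2                      # sparse truth, entries sum to 1
y = X @ beta_star + sigma * rng.standard_normal(n)

def train_mse(b):
    """In-sample mean squared error."""
    return np.mean((X @ b - y) ** 2)

# Entropic mirror descent (exponentiated gradient): the mirror map
# keeps every iterate inside the simplex with no projection step.
beta = np.full(d, 1.0 / d)               # uniform start
mse0 = train_mse(beta)
eta = 0.2                                # step size (assumed, untuned)
for t in range(5000):
    grad = X.T @ (X @ beta - y) / n      # gradient of train_mse / 2
    beta = beta * np.exp(-eta * grad)    # multiplicative (mirror) update
    beta /= beta.sum()                   # renormalize onto the simplex
    # Discrepancy-principle-style early stopping: quit once the
    # residual reaches the noise level instead of fitting the noise.
    if train_mse(beta) <= sigma ** 2:
        break
```

The point of the sketch is the mechanism the abstract describes: the geometry of the potential (here, entropy on the simplex) enforces the convex constraint implicitly, and the stopping time, rather than a penalty term, acts as the regularizer.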