🤖 AI Summary
This paper addresses prediction risk control in high-dimensional linear regression under random design: how to obtain dimension-free, non-asymptotic upper bounds on the prediction error without explicitly estimating the high-dimensional covariance matrix. To this end, the authors propose a novel "error-in-operator" approach, which incorporates the design covariance structure implicitly into the empirical risk minimization objective, bypassing explicit covariance estimation. Theoretically, they establish a dimension-free, non-asymptotic upper bound on the prediction error, prove that auxiliary variables do not inflate the effective dimension of the problem, and attain statistically optimal convergence rates. Computationally, the method removes the dependence on covariance estimation, reducing both statistical and computational complexity. Numerical experiments demonstrate its robustness and efficiency in high-dimensional sparse settings.
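For reference, the two objects the summary mentions, the prediction risk and the effective dimension, admit standard formulations in the random-design setting (common definitions given here as a sketch; the paper's own notation and assumptions may differ). For a design vector $x$ with covariance $\Sigma$ and an estimator $\widehat{\theta}$ of the true coefficient vector $\theta^*$,

$$
\mathcal{E}(\widehat{\theta}) \;=\; \mathbb{E}_x\bigl[(x^\top \widehat{\theta} - x^\top \theta^*)^2\bigr] \;=\; (\widehat{\theta} - \theta^*)^\top \Sigma\, (\widehat{\theta} - \theta^*),
\qquad
\mathbf{r}(\Sigma) \;=\; \frac{\operatorname{tr}(\Sigma)}{\|\Sigma\|},
$$

so a "dimension-free" bound is one that depends on $\Sigma$ only through quantities such as the effective rank $\mathbf{r}(\Sigma)$ rather than the ambient dimension.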
📝 Abstract
We consider the problem of high-dimensional linear regression with random design. We suggest a novel approach, referred to as error-in-operator, which does not estimate the design covariance $\Sigma$ directly but incorporates it into empirical risk minimization. We provide an expansion of the excess prediction risk and derive non-asymptotic dimension-free bounds on the leading term and the remainder. This allows us to show that auxiliary variables do not increase the effective dimension of the problem, provided that the parameters of the procedure are tuned properly. We also discuss computational aspects of our method and illustrate its performance with numerical experiments.
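To make the setting concrete, the following NumPy sketch simulates random-design linear regression and evaluates the excess prediction risk against the true covariance $\Sigma$. The ridge baseline, the chosen spectrum, and all constants are illustrative assumptions; the error-in-operator procedure itself is not reproduced here.

```python
# Minimal random-design simulation sketch (NumPy only). It illustrates the
# excess prediction risk (theta_hat - theta_star)^T Sigma (theta_hat - theta_star)
# discussed in the abstract. A plain ridge estimator is used as a stand-in
# baseline; it is NOT the paper's error-in-operator procedure.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 500                       # n samples, d-dimensional design

# Random design with a decaying spectrum, so the effective dimension
# tr(Sigma)/||Sigma|| is much smaller than the ambient dimension d.
eigvals = 1.0 / (1.0 + np.arange(d)) ** 2
Sigma = np.diag(eigvals)
theta_star = np.zeros(d)
theta_star[:10] = 1.0                 # sparse ground truth (illustrative choice)

X = rng.standard_normal((n, d)) * np.sqrt(eigvals)   # rows have covariance Sigma
y = X @ theta_star + 0.1 * rng.standard_normal(n)

# Stand-in estimator: ridge regression (hypothetical baseline, not the paper's method).
lam = 1e-2
theta_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Excess prediction risk under random design:
# (theta_hat - theta_star)^T Sigma (theta_hat - theta_star).
delta = theta_hat - theta_star
excess_risk = delta @ Sigma @ delta
print(f"effective dimension tr(Sigma)/||Sigma||: {eigvals.sum() / eigvals.max():.2f}")
print(f"excess prediction risk of ridge baseline: {excess_risk:.4f}")
```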