🤖 AI Summary
Problem: Standard empirical risk minimization (ERM) can assess risk inaccurately in high-confidence subregions, namely large-margin regions in classification and low-variance regions in regression.
Method: This paper proposes a weighted empirical risk minimization (WERM) framework that employs data-dependent weighting functions to prioritize samples based on local confidence.
Contribution/Results: Under a general “balanceable” Bernstein condition, the paper shows that a WERM estimator can be designed to outperform standard ERM in certain subregions, with the gain appearing as a data-dependent constant term in the error bound. These subregions are the high-confidence ones: large-margin regions in classification and low-variance regions in heteroscedastic regression. Synthetic-data experiments support the theory.
📝 Abstract
In this work, we study the weighted empirical risk minimization (weighted ERM) scheme, in which an additional data-dependent weight function is incorporated when the empirical risk function is minimized. We show that under a general “balanceable” Bernstein condition, one can design a weighted ERM estimator that achieves superior performance in certain sub-regions over the one obtained from standard ERM, and the superiority manifests itself through a data-dependent constant term in the error bound. These sub-regions correspond to large-margin ones in classification settings and low-variance ones in heteroscedastic regression settings, respectively. Our findings are supported by evidence from synthetic data experiments.
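To make the weighted ERM idea concrete, here is a minimal sketch for heteroscedastic linear regression with squared loss. Everything in it is an illustrative assumption rather than the paper's construction: the function names `weighted_erm_linear` and `demo` are hypothetical, the weights are oracle inverse variances (the abstract does not say how the data-dependent weights are chosen), and the data-generating process is made up for the example.

```python
import numpy as np


def weighted_erm_linear(X, y, w):
    """Minimize (1/n) * sum_i w_i * (y_i - x_i @ beta)**2 over beta.

    Scaling each row by sqrt(w_i) turns the weighted problem into an
    ordinary least-squares problem, which lstsq solves directly.
    """
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


def demo(seed=0, n=2000):
    """Fit standard ERM and weighted ERM on synthetic heteroscedastic data."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    X = np.column_stack([np.ones(n), x])  # intercept + slope
    beta_true = np.array([1.0, 2.0])
    # Assumed noise model: low variance near x = 0, high variance far from it.
    sigma = 0.1 + 2.0 * np.abs(x)
    y = X @ beta_true + sigma * rng.normal(size=n)

    beta_erm = weighted_erm_linear(X, y, np.ones(n))       # standard ERM
    beta_werm = weighted_erm_linear(X, y, 1.0 / sigma**2)  # assumed oracle weights
    return beta_true, beta_erm, beta_werm


if __name__ == "__main__":
    beta_true, beta_erm, beta_werm = demo()
    print("true:        ", beta_true)
    print("standard ERM:", beta_erm)
    print("weighted ERM:", beta_werm)
```

Setting all weights to one recovers standard ERM, so the two estimators differ only in how much each sample counts. In practice the noise level would have to be estimated from data, which is where a data-dependent weight function comes in.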