π€ AI Summary
This paper addresses distributional uncertainty in machine learning arising from discrepancies between training and true population distributions. We identify fundamental limitations of three prevailing paradigms: Bayesian methods (sensitive to prior specification), distributionally robust optimization (DRO) (overly conservative), and regularization-based approaches (biased estimation). To overcome these, we propose the first unified learning framework that jointly integrates asymptotic analysis (establishing consistency and asymptotic normality), non-asymptotic theory (deriving tight generalization bounds and proving unbiasedness), and computationally tractable optimization algorithms. Our framework simultaneously resolves the challenges of prior/regularizer selection, excessive DRO conservatism, and estimation bias, while formally characterizing the intrinsic trade-off between robustness and specificityβa theoretical insight previously unreported. Extensive experiments across multiple real-world tasks demonstrate significant improvements over state-of-the-art baselines, empirically validating both theoretical rigor and practical efficacy.
π Abstract
Trustworthy machine learning aims at combating distributional uncertainties in training data distributions compared to population distributions. Typical treatment frameworks include the Bayesian approach, (min-max) distributionally robust optimization (DRO), and regularization. However, three issues have to be raised: 1) the prior distribution in the Bayesian method and the regularizer in the regularization method are difficult to specify; 2) the DRO method tends to be overly conservative; 3) all the three methods are biased estimators of the true optimal cost. This paper studies a new framework that unifies the three approaches and addresses the three challenges above. The asymptotic properties (e.g., consistencies and asymptotic normalities), non-asymptotic properties (e.g., generalization bounds and unbiasedness), and solution methods of the proposed model are studied. The new model reveals the trade-off between the robustness to the unseen data and the specificity to the training data. Experiments on various real-world tasks validate the superiority of the proposed learning framework.