🤖 AI Summary
Existing omnipredictor constructions suffer from high sample complexity, long training times, and complex, hard-to-interpret output hypotheses, making it impractical to achieve ε-competitive prediction against bounded linear predictors under matching losses induced by monotone, Lipschitz link functions. This work introduces efficient *omnipredictors* for single-index models (SIMs): (1) a simple, practical omnipredictor construction for SIMs; (2) a new, sharp analysis of the classical Isotron algorithm in the agnostic (non-realizable) setting; and (3) output predictors that are multi-index models with ≈ε⁻² prediction heads, a step toward proper omniprediction. The guarantees hold for all matching losses induced by monotone, Lipschitz link functions in the agnostic learning setting. Our approach reduces sample complexity to ≈ε⁻⁴ (and to ≈ε⁻² when link functions are bi-Lipschitz), a substantial improvement over the prior ≳ε⁻¹⁰ bound; it runs in nearly linear time; and it outputs compact, interpretable multi-index predictors, enabling both statistical efficiency and practical deployability.
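
For context, omniprediction (as defined in [GKR+22]) asks that a single predictor $p$, after a loss-specific post-processing $k_\ell$, competes with the best comparator for every loss in the family. Roughly, the guarantee takes the following form; this display is an explanatory paraphrase of the standard definition, not taken verbatim from the paper:

$$\forall\, \ell \in \mathcal{L}: \quad \mathbb{E}\big[\ell\big(y,\, k_\ell(p(x))\big)\big] \;\le\; \min_{c \in \mathcal{C}} \mathbb{E}\big[\ell\big(y,\, c(x)\big)\big] + \varepsilon.$$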
📝 Abstract
Recent work on supervised learning [GKR+22] defined the notion of omnipredictors, i.e., predictor functions $p$ over features that are simultaneously competitive for minimizing a family of loss functions $\mathcal{L}$ against a comparator class $\mathcal{C}$. Omniprediction requires approximating the Bayes-optimal predictor beyond the loss minimization paradigm, and has generated significant interest in the learning theory community. However, even for basic settings such as agnostically learning single-index models (SIMs), existing omnipredictor constructions require impractically large sample complexities and runtimes, and output complex, highly-improper hypotheses. Our main contribution is a new, simple construction of omnipredictors for SIMs. We give a learner outputting an omnipredictor that is $\varepsilon$-competitive on any matching loss induced by a monotone, Lipschitz link function, when the comparator class is bounded linear predictors. Our algorithm requires $\approx \varepsilon^{-4}$ samples and runs in nearly-linear time, and its sample complexity improves to $\approx \varepsilon^{-2}$ if link functions are bi-Lipschitz. This significantly improves upon the only prior known construction, due to [HJKRR18, GHK+23], which used $\gtrsim \varepsilon^{-10}$ samples. We achieve our construction via a new, sharp analysis of the classical Isotron algorithm [KS09, KKKS11] in the challenging agnostic learning setting, of potential independent interest. Previously, Isotron was known to properly learn SIMs in the realizable setting, and to yield constant-factor competitive hypotheses under the squared loss [ZWDD24]. As they are based on Isotron, our omnipredictors are multi-index models with $\approx \varepsilon^{-2}$ prediction heads, bringing us closer to the tantalizing goal of proper omniprediction for general loss families and comparators.
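
To make the algorithmic core concrete, below is a minimal sketch of the classical Isotron iteration [KS09] that the construction builds on: it alternates isotonic regression (to fit a monotone link) with a perceptron-style update of the weight vector. This is an illustrative sketch of the baseline algorithm only, not the paper's refined variant or its multi-head omnipredictor; the function name, iteration count, and synthetic data are our own assumptions.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def isotron(X, y, n_iters=100):
    """Classical Isotron [KS09]: alternately fit a monotone link via isotonic
    regression and update the weight vector with a perceptron-like step."""
    n, d = X.shape
    w = np.zeros(d)
    link = None
    for _ in range(n_iters):
        z = X @ w                                    # current 1-d projections w . x_i
        link = IsotonicRegression(out_of_bounds="clip")
        u_of_z = link.fit_transform(z, y)            # monotone fit u(w . x_i) to labels
        w = w + X.T @ (y - u_of_z) / n               # perceptron-style residual update
    return w, link                                   # SIM predictor: x -> link.predict(x @ w)

# Toy usage on synthetic data from a noisy sigmoidal SIM (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
w_star = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = 1 / (1 + np.exp(-X @ w_star)) + 0.05 * rng.normal(size=2000)
w_hat, link = isotron(X, y)
preds = link.predict(X @ w_hat)
```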