🤖 AI Summary
This work addresses the problem that a model with good average accuracy can still make individual high-loss predictions that dominate deployment cost. It proposes a distribution-free, per-input loss-quantile score. Rather than modeling label uncertainty, the approach models the realized loss of a fixed prediction function directly: any loss predictor is combined with split-conformal calibration to produce an input-dependent, interpretable upper bound on the loss that is comparable across samples. Because the calibration makes no assumptions about the loss distribution, the score can be used for risk ranking and, when thresholded, for flagging high-risk inputs. Experiments on 13 regression benchmarks show effective risk ranking and a lower frequency of large-loss events than standard heuristics.
📝 Abstract
Modern machine learning models can be accurate on average yet still make mistakes that dominate deployment cost. We introduce Locus, a distribution-free wrapper that produces a per-input loss-scale reliability score for a fixed prediction function. Rather than quantifying uncertainty about the label, Locus models the realized loss of the prediction function using any engine that outputs a predictive distribution for the loss given an input. A simple split-calibration step turns this function into a distribution-free interpretable score that is comparable across inputs and can be read as an upper loss level. The score is useful on its own for ranking, and it can optionally be thresholded to obtain a transparent flagging rule with distribution-free control of large-loss events. Experiments across 13 regression benchmarks show that Locus yields effective risk ranking and reduces large-loss frequency compared to standard heuristics.
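The abstract does not spell out the calibration step, so the following is only a minimal sketch of one common way to do one-sided split-conformal calibration of a loss estimate, not the authors' implementation. It assumes a loss engine that has already produced a per-input loss estimate (for example, an upper quantile of its predictive distribution) on a held-out calibration set. The names `conformal_offset`, `upper_loss_score`, `pred_cal`, `loss_cal`, and `alpha` are hypothetical and chosen for illustration.

```python
import math
import numpy as np


def conformal_offset(pred_cal, loss_cal, alpha):
    """One-sided split-conformal offset (illustrative, not the paper's exact rule).

    pred_cal: loss estimates from some loss engine on the calibration set.
    loss_cal: realized losses of the fixed predictor on the same points.
    alpha:    target miscoverage level, e.g. 0.1 for a 90% upper loss level.
    """
    pred_cal = np.asarray(pred_cal, dtype=float)
    loss_cal = np.asarray(loss_cal, dtype=float)
    n = loss_cal.shape[0]
    # Conformity scores: how far the realized loss exceeds the estimate.
    residuals = np.sort(loss_cal - pred_cal)
    # Finite-sample rank used in split conformal prediction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only a trivial bound holds.
        return math.inf
    return float(residuals[k - 1])


def upper_loss_score(pred_new, offset):
    """Shift the raw loss estimates by the calibrated offset."""
    return np.asarray(pred_new, dtype=float) + offset


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for a loss engine's estimates and the realized losses.
    pred_cal = rng.uniform(0.0, 1.0, size=500)
    loss_cal = pred_cal + rng.exponential(0.2, size=500)
    offset = conformal_offset(pred_cal, loss_cal, alpha=0.1)

    pred_new = rng.uniform(0.0, 1.0, size=5)
    scores = upper_loss_score(pred_new, offset)
    # Optional flagging rule: flag inputs whose upper loss level exceeds a budget.
    flagged = scores > 1.0
    print(scores, flagged)
```

Under exchangeability of calibration and test points, a score built this way upper-bounds the realized loss with probability at least 1 - alpha on average over inputs. Sorting inputs by the score gives a risk ranking, and comparing it with a loss budget gives a flagging rule of the kind the abstract describes.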