π€ AI Summary
This work addresses the rejection mechanism in selective classification: achieving high-accuracy abstention without modifying model architecture or loss functions, solely by leveraging the dynamic prediction consistency of intermediate models during training on test samples. The method introduces a novel discretized training-trajectory analysis based on the *stabilization time* of predicted labelsβi.e., the earliest epoch at which an ensemble of intermediate models consistently assigns the same label to a given sample. Rejection thresholds are dynamically calibrated by monitoring this label stability across training epochs. Crucially, the approach requires no specialized loss terms, architectural modifications, or structural constraints. Evaluated on standard benchmarks, it achieves state-of-the-art trade-offs between accuracy and coverage, significantly outperforming existing methods that rely on model redesign or custom loss functions.
π Abstract
Selective classification is the task of rejecting inputs a model would predict incorrectly on through a trade-off between input space coverage and model accuracy. Current methods for selective classification impose constraints on either the model architecture or the loss function; this inhibits their usage in practice. In contrast to prior work, we show that state-of-the-art selective classification performance can be attained solely from studying the (discretized) training dynamics of a model. We propose a general framework that, for a given test input, monitors metrics capturing the disagreement with the final predicted label over intermediate models obtained during training; we then reject data points exhibiting too much disagreement at late stages in training. In particular, we instantiate a method that tracks when the label predicted during training stops disagreeing with the final predicted label. Our experimental evaluation shows that our method achieves state-of-the-art accuracy/coverage trade-offs on typical selective classification benchmarks.