🤖 AI Summary
Balancing classification accuracy against inference efficiency under a constrained computational budget remains challenging. Method: This paper proposes a synergistic mechanism that combines early exit with rejection, the first to formulate early exiting as a rejection-aware multiclass classification problem. It integrates Bayesian risk minimization with head-level budget consumption modeling to enable end-to-end inference under hard budget constraints. The approach employs exponentially weighted probability calibration, confidence-aware exit policies, and budget-aware aggregation, and is evaluated on ResNet-18 and ConvNeXt. Contribution/Results: Experiments on CIFAR and ImageNet show significant accuracy improvements under fixed budget constraints. The method mitigates the “overthinking” phenomenon (where excessive computation yields diminishing returns) while substantially improving inference efficiency.
📝 Abstract
The increasing complexity of advanced machine learning models requires innovative approaches to manage computational resources effectively. One such method is the Early Exit strategy, which allows for adaptive computation by providing a mechanism to shorten the processing path for simpler data instances. In this paper, we propose EERO, a new methodology that translates the problem of early exiting into a problem of using multiple classifiers with reject option, in order to better select the exiting head for each instance. We calibrate the probabilities of exiting at the different heads using aggregation with exponential weights to guarantee a fixed budget. We consider factors such as Bayesian risk, budget constraints, and head-specific budget consumption. Experimental results, obtained with a ResNet-18 model and a ConvNeXt architecture on the CIFAR and ImageNet datasets, demonstrate that our method not only effectively manages budget allocation but also enhances accuracy in overthinking scenarios.
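To make the "early exit as classification with reject option" idea concrete, the sketch below shows one simple way such a rule could look. It is not EERO itself: the paper's exponential-weights aggregation and Bayesian risk formulation are not reproduced here. Instead, the sketch assumes each head reports a confidence in [0, 1], that a head rejects an instance when its confidence falls below a threshold, and that a single shared threshold is tuned by bisection so the average cost fits a budget. All names (`exit_head`, `average_cost`, `calibrate_shared_threshold`) and the cost values are hypothetical.

```python
from typing import List, Sequence


def exit_head(confidences: Sequence[float], thresholds: Sequence[float]) -> int:
    """Return the index of the first head that accepts the instance.

    A head "rejects" when its confidence is below its threshold, which
    passes the instance on to the next head. If every head rejects,
    the last head is used.
    """
    for k, (conf, thr) in enumerate(zip(confidences, thresholds)):
        if conf >= thr:
            return k
    return len(confidences) - 1


def average_cost(
    batch: Sequence[Sequence[float]],
    thresholds: Sequence[float],
    cumulative_costs: Sequence[float],
) -> float:
    """Mean compute cost when each instance stops at its exit head.

    cumulative_costs[k] is the (assumed) total cost of running up to head k.
    """
    exits = [exit_head(confs, thresholds) for confs in batch]
    return sum(cumulative_costs[k] for k in exits) / len(exits)


def calibrate_shared_threshold(
    batch: Sequence[Sequence[float]],
    cumulative_costs: Sequence[float],
    budget: float,
    steps: int = 30,
) -> float:
    """Bisection for the largest shared threshold whose average cost fits the budget.

    Assumes budget >= cumulative_costs[0]; otherwise no threshold is feasible
    and 0.0 is returned. A higher threshold can only push exits later, so the
    average cost is non-decreasing in the threshold and bisection applies.
    """
    n_heads = len(cumulative_costs)
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        # The final head gets threshold 0.0 so that it always accepts.
        thresholds: List[float] = [mid] * (n_heads - 1) + [0.0]
        if average_cost(batch, thresholds, cumulative_costs) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


# Toy data: three instances, three heads, made-up confidences and costs.
batch = [[0.9, 0.95, 0.99], [0.4, 0.8, 0.9], [0.3, 0.5, 0.7]]
costs = [1.0, 2.0, 3.0]
threshold = calibrate_shared_threshold(batch, costs, budget=2.0)
```

A shared threshold is the crudest possible choice. Per-head thresholds, or a randomized mixture over exit policies as the abstract's aggregation step suggests, would give finer control over the accuracy and budget trade-off.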