🤖 AI Summary
Static confidence thresholds in early-exit mechanisms often induce overconfident misclassifications and exhibit poor robustness to distribution shifts. To address this, we propose a dynamic adaptive exit framework grounded in multi-armed bandits (MAB), which learns optimal exit thresholds online. Our approach introduces a novel reward function that jointly accounts for prediction confidence and reliability, enabling uncertainty-aware latency–accuracy trade-offs. By integrating unsupervised online learning with uncertainty modeling, the framework continuously refines exit policies without requiring ground-truth labels. Evaluated on vision-language understanding, text generation, and classification tasks, our method achieves 1.70–2.10× inference speedup with less than 2% accuracy degradation—substantially outperforming static-threshold baselines. The framework thus delivers superior efficiency, reliability, and deployment robustness under distributional shifts.
📝 Abstract
Early-Exit Deep Neural Networks enable adaptive inference by allowing prediction at intermediate layers, significantly reducing computational cost and latency. Most early-exit strategies greedily exit a sample at an intermediate layer once the confidence in the class prediction exceeds a predefined threshold set on a static validation set. This is problematic because the model may be overconfident in a wrong class. Such strategies are also not robust to the distribution shifts encountered in deployment, which can undermine model trustworthiness and accuracy. To address these challenges, we propose UAT, which adapts the exit threshold using a Multi-Armed Bandit framework, enabling online, unsupervised adjustment of exit decisions. UAT makes decisions based on a new reward function that assesses predictive certainty and its reliability to balance computational efficiency and prediction quality while penalizing unnecessary late exits. We provide guarantees on the risk achieved by UAT and validate its performance on diverse tasks spanning vision-language understanding, text generation, and classification. Our framework demonstrates consistent speedups (1.70–2.10×) with a minimal performance drop (<2%) compared to full-model performance. Our source code is available at https://github.com/Div290/UAT.
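To make the abstract's core idea concrete, the sketch below shows how an exit threshold could be selected online by a bandit over a discrete set of candidate thresholds, with an unsupervised reward that trades off confidence against exit depth. This is an illustrative epsilon-greedy sketch under assumed names (`ThresholdBandit`, `proxy_reward`) and an assumed reward form; the paper's actual UAT reward function and bandit algorithm may differ.

```python
import random

class ThresholdBandit:
    """Epsilon-greedy bandit over candidate exit-confidence thresholds.

    Illustrative only: UAT's actual policy and reward are defined in the paper.
    """

    def __init__(self, thresholds, epsilon=0.1):
        self.thresholds = thresholds
        self.epsilon = epsilon
        self.counts = [0] * len(thresholds)   # pulls per arm
        self.values = [0.0] * len(thresholds) # running mean reward per arm

    def select(self):
        """Return the index of the threshold to try for the next sample."""
        if random.random() < self.epsilon:
            return random.randrange(len(self.thresholds))  # explore
        return max(range(len(self.thresholds)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        """Incremental running-mean update; no ground-truth labels required."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def proxy_reward(confidence, exit_layer, num_layers, latency_weight=0.5):
    """Assumed unsupervised reward: reward confident early exits,
    penalize late exits in proportion to the layer at which the sample exits."""
    return confidence - latency_weight * (exit_layer / num_layers)
```

A deployment loop would call `select()` to pick a threshold for each incoming sample, exit at the first layer whose confidence exceeds it, compute `proxy_reward` from the observed confidence and exit depth, and feed it back via `update()`, so the threshold adapts as the input distribution drifts.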