SQUAD: Scalable Quorum Adaptive Decisions via ensemble of early exit neural networks

📅 2026-01-30
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses a limitation of conventional early-exit neural networks: they rely on a single confidence threshold that is often poorly calibrated, which makes it hard to balance accuracy against inference efficiency. The authors propose a quorum-based early-exit mechanism integrated with distributed ensemble learning, in which intermediate predictions are aggregated in order of increasing computational complexity and exit decisions are made dynamically. The study introduces a quorum-driven decision criterion, which the authors describe as the first of its kind, and designs QUEST, a neural architecture search method, to optimize hierarchical diversity among intermediate early-exit learners and thereby enhance their complementarity. Experiments show up to 5.95% higher accuracy than existing dynamic inference methods at comparable computational cost, and up to 70.60% lower inference latency than static ensembles while maintaining good accuracy.

📝 Abstract
Early-exit neural networks have become popular for reducing inference latency by allowing intermediate predictions when sufficient confidence is achieved. However, standard approaches typically rely on single-model confidence thresholds, which are frequently unreliable due to inherent calibration issues. To address this, we introduce SQUAD (Scalable Quorum Adaptive Decisions), the first inference scheme that integrates early-exit mechanisms with distributed ensemble learning, improving uncertainty estimation while reducing inference time. Unlike traditional methods that depend on individual confidence scores, SQUAD employs a quorum-based stopping criterion on early-exit learners: it collects intermediate predictions incrementally, in order of computational complexity, until a consensus is reached, and halts the computation at that exit if the consensus is statistically significant. To maximize the efficacy of this voting mechanism, we also introduce QUEST (Quorum Search Technique), a Neural Architecture Search method that selects early-exit learners with optimized hierarchical diversity, ensuring that learners are complementary at every intermediate layer. This consensus-driven approach yields statistically robust early exits, improving test accuracy by up to 5.95% compared to state-of-the-art dynamic solutions at a comparable computational cost, and reducing inference latency by up to 70.60% compared to static ensembles while maintaining good accuracy.
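
To make the quorum-based stopping idea concrete, here is a minimal Python sketch. It is not the authors' implementation, and the abstract does not say which statistical test SQUAD uses. The sketch assumes that votes are evaluated separately at each exit level, that the null hypothesis is uniform random voting over the classes, and that a one-sided binomial tail on the most-voted class is an adequate significance check (it ignores the fact that the leading class is chosen after seeing the votes). The names `binomial_tail`, `quorum_exit`, `votes_per_exit`, and `alpha` are hypothetical.

```python
from collections import Counter
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance that at least k of n
    voters pick one given class if every vote were random with prob. p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def quorum_exit(votes_per_exit, num_classes, alpha=0.05):
    """Return (exit_index, predicted_class) for the first exit whose
    leading class passes the (assumed) significance check.

    votes_per_exit: exits ordered from cheapest to most expensive; each
    entry is the list of class labels predicted by the learners there.
    """
    if not votes_per_exit:
        raise ValueError("at least one exit is required")
    chance = 1.0 / num_classes  # assumed null: uniform random voting
    for exit_index, votes in enumerate(votes_per_exit):
        label, count = Counter(votes).most_common(1)[0]
        if binomial_tail(count, len(votes), chance) < alpha:
            return exit_index, label  # significant consensus: stop here
    # No exit reached a significant consensus: use the last exit's plurality.
    return len(votes_per_exit) - 1, label


# Four hypothetical learners, three exit levels, ten classes.
votes = [[0, 1, 2, 3], [1, 1, 1, 2], [1, 1, 1, 1]]
print(quorum_exit(votes, num_classes=10, alpha=0.01))  # (1, 1)
```

In this toy run the first exit has no agreement, so computation continues. At the second exit three of four learners agree on class 1, which has a tail probability of 0.0037 under the assumed null, so inference halts there and the third exit is never evaluated.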

Problem

Research questions and friction points this paper is trying to address.

early-exit neural networks
inference latency
confidence calibration
uncertainty estimation
ensemble learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

early-exit neural networks
ensemble learning
quorum-based consensus
Neural Architecture Search
adaptive inference