SQUAD: Scalable Quorum Adaptive Decisions via ensemble of early exit neural networks

📅 2026-01-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of conventional early-exit neural networks: they rely on a single model's confidence threshold, which is often poorly calibrated and makes it hard to balance accuracy against inference efficiency. The authors propose a quorum-based early-exit mechanism integrated with distributed ensemble learning, in which intermediate predictions are collected in order of increasing computational complexity and computation halts at the first exit where the learners reach a statistically significant consensus. The study presents this as the first inference scheme to combine early exits with distributed ensembles, and introduces QUEST (Quorum Search Technique), a neural architecture search method that selects early-exit learners with optimized hierarchical diversity so that they complement one another at every intermediate layer. Experiments show up to 5.95% higher test accuracy than state-of-the-art dynamic inference methods at comparable computational cost, and up to 70.60% lower inference latency than static ensembles while maintaining good accuracy.
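To make "hierarchical diversity" more concrete, the sketch below computes a pairwise disagreement rate among the learners at each exit. This is a common ensemble diversity measure used here purely for illustration; the summary does not say which diversity objective QUEST optimizes, and the function names `pairwise_disagreement` and `diversity_per_exit` are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def pairwise_disagreement(preds: Sequence[Sequence[int]]) -> float:
    """Mean fraction of samples on which two learners predict different classes.

    preds[i][j] is the class predicted by learner i for sample j.
    Assumes every learner predicts on the same non-empty set of samples.
    """
    pairs = list(combinations(preds, 2))
    if not pairs:
        return 0.0  # a single learner has no one to disagree with
    rates = [
        sum(a != b for a, b in zip(p, q)) / len(p)
        for p, q in pairs
    ]
    return sum(rates) / len(rates)


def diversity_per_exit(preds_by_exit: Sequence[Sequence[Sequence[int]]]) -> List[float]:
    """One diversity score per exit, ordered from the cheapest exit to the deepest.

    Illustrative assumption: a search procedure could score candidate
    ensembles with a list like this; it is not the QUEST objective.
    """
    return [pairwise_disagreement(p) for p in preds_by_exit]
```

A higher score at an exit means the learners there make different mistakes more often, which is what lets a vote among them carry more information than any single prediction.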

📝 Abstract

Early-exit neural networks have become popular for reducing inference latency by allowing intermediate predictions when sufficient confidence is achieved. However, standard approaches typically rely on single-model confidence thresholds, which are frequently unreliable due to inherent calibration issues. To address this, we introduce SQUAD (Scalable Quorum Adaptive Decisions), the first inference scheme that integrates early-exit mechanisms with distributed ensemble learning, improving uncertainty estimation while reducing inference time. Unlike traditional methods that depend on individual confidence scores, SQUAD applies a quorum-based stopping criterion to early-exit learners: it collects intermediate predictions incrementally, in order of computational complexity, until a consensus is reached, and halts computation at that exit if the consensus is statistically significant. To maximize the efficacy of this voting mechanism, we also introduce QUEST (Quorum Search Technique), a Neural Architecture Search method that selects early-exit learners with optimized hierarchical diversity, ensuring that learners are complementary at every intermediate layer. This consensus-driven approach yields statistically robust early exits, improving test accuracy by up to 5.95% compared to state-of-the-art dynamic solutions at comparable computational cost and reducing inference latency by up to 70.60% compared to static ensembles while maintaining good accuracy.
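The quorum-based stopping rule described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: votes are assumed to accumulate across exits, and "statistically significant" is modeled as a one-sided binomial test of the leading class against uniform chance agreement, which ignores the multiple-comparison effect of picking the most-voted class. The abstract does not specify the test SQUAD uses, and `quorum_exit`, `binomial_tail`, and `alpha` are hypothetical names.

```python
from collections import Counter
from math import comb
from typing import Hashable, Iterable, Optional, Sequence, Tuple


def binomial_tail(k: int, n: int, p: float) -> float:
    """P[X >= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def quorum_exit(
    votes_per_exit: Iterable[Sequence[Hashable]],
    num_classes: int,
    alpha: float = 0.05,
) -> Tuple[Optional[Hashable], int]:
    """Return (predicted_class, exit_index) for one input.

    votes_per_exit yields, cheapest exit first, the class votes of the
    learners at that exit. Deeper exits are only evaluated (the iterable
    is only advanced) if no significant consensus has been found yet.
    """
    votes = []
    label, idx = None, -1
    for idx, new_votes in enumerate(votes_per_exit):
        votes.extend(new_votes)  # assumption: votes accumulate across exits
        label, count = Counter(votes).most_common(1)[0]
        # Assumed test: could `count` agreeing votes arise by chance?
        p_value = binomial_tail(count, len(votes), 1.0 / num_classes)
        if p_value <= alpha:
            return label, idx  # consensus is significant: exit early
    return label, idx  # no early exit: fall back to the final majority
```

Passing a generator as `votes_per_exit` keeps the deeper, more expensive learners from running at all once an earlier exit reaches a quorum, which is where the latency saving over a static ensemble would come from.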

Problem

Research questions and friction points this paper is trying to address.

early-exit neural networks

inference latency

confidence calibration

uncertainty estimation

ensemble learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

early-exit neural networks

ensemble learning

quorum-based consensus