🤖 AI Summary
Edge AI inference faces a fundamental trade-off among low latency, high accuracy, and strong fault tolerance in resource-constrained and failure-prone environments. To address this, we propose MEL, a multi-level ensemble learning framework that jointly trains a set of lightweight heterogeneous models. MEL integrates diversity-aware learning with per-model performance optimization, enabling collaborative representation refinement across the ensemble. A multi-objective loss function explicitly encourages inter-model diversity, while lightweight architectures and adaptive ensemble mechanisms together provide fault resilience and deployment flexibility. Experiments demonstrate that MEL achieves accuracy comparable to a baseline monolithic model while using only 40% of its parameter count. Under single-point model failure, MEL maintains 95.6% ensemble accuracy, substantially outperforming existing recovery approaches.
📝 Abstract
AI inference at the edge is becoming increasingly common for low-latency services. However, edge environments are power- and resource-constrained, and susceptible to failures. Conventional failure-resilience approaches, such as cloud failover or compressed backups, often compromise latency or accuracy, limiting their effectiveness for critical edge inference services. In this paper, we propose Multi-Level Ensemble Learning (MEL), a new framework for resilient edge inference that simultaneously trains multiple lightweight backup models capable of operating collaboratively, refining one another when multiple servers are available, and operating independently under failures while maintaining good accuracy. Specifically, we formulate our approach as a multi-objective optimization problem with a loss formulation that inherently encourages diversity among individual models to promote mutually refining representations, while ensuring each model maintains good standalone performance. Empirical evaluations across vision, language, and audio datasets show that MEL matches the performance of the original architectures while providing fault tolerance and deployment flexibility across edge platforms. Our results show that an ensemble sized at 40% of the original model achieves comparable performance and, when trained with MEL, preserves 95.6% of ensemble accuracy under failures.
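The abstract does not give the exact loss formulation, but the idea of combining per-model accuracy objectives with an explicit inter-model diversity term can be sketched as follows. This is a hypothetical illustration, not the paper's actual objective: it averages a standard cross-entropy loss over the ensemble members and adds a penalty on the pairwise similarity of their softmax outputs, so that members are pushed toward decorrelated errors while each remains individually accurate. The function name `mel_loss` and the `diversity_weight` hyperparameter are assumptions for this sketch.

```python
import torch
import torch.nn.functional as F


def mel_loss(logits_list, targets, diversity_weight=0.1):
    """Sketch of a diversity-aware multi-objective ensemble loss.

    logits_list: list of (batch, classes) logits, one per ensemble member.
    targets:     (batch,) ground-truth class indices.
    """
    # Per-model accuracy objective: each member is trained to be
    # a good standalone predictor (survives single-model failure).
    ce = sum(F.cross_entropy(logits, targets) for logits in logits_list)
    ce = ce / len(logits_list)

    # Diversity objective: penalize the average pairwise cosine
    # similarity of the members' predictive distributions, so the
    # members specialize rather than converge to identical outputs.
    probs = [F.softmax(logits, dim=-1) for logits in logits_list]
    sim, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            sim = sim + F.cosine_similarity(probs[i], probs[j], dim=-1).mean()
            pairs += 1
    diversity_penalty = sim / max(pairs, 1)

    return ce + diversity_weight * diversity_penalty
```

At inference time, the ensemble prediction would average the members' outputs when all servers are healthy, and fall back to whichever members survive a failure, which is what makes the standalone cross-entropy term essential.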