DyCE: Dynamically Configurable Exiting for deep learning compression and real-time scaling

📅 2024-03-04
🏛️ Future Generation Computer Systems
📈 Citations: 1
Influential: 0
🤖 AI Summary
To achieve both real-time inference and energy efficiency for deep learning models on edge devices, this paper proposes a hardware-aware, fine-grained, dynamically configurable early-exit mechanism. The mechanism adapts at runtime, selecting the most suitable exit layers according to instantaneous resource conditions (e.g., latency, power consumption), so that a model can be compressed and scaled in real time without redeployment. Unlike static models or monolithic dynamic inference approaches, the method uses a multi-exit network architecture that integrates gradient-sensitivity-driven exit placement, lightweight gating controllers, and an online resource-feedback scheduling algorithm. The summary presents this as the first dynamic configuration paradigm to jointly optimize inference latency, accuracy, and energy consumption. Evaluated on ImageNet, the approach is reported to achieve up to 3.2× speedup and 58% energy reduction with less than 0.8% accuracy degradation, while supporting millisecond-level exit-policy switching.
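The early-exit idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple confidence-threshold gate at each exit, and every name in it (`early_exit_predict`, `double`, `softmax`, the toy stages) is hypothetical. The backbone runs stage by stage, and inference stops at the first exit whose top class probability reaches that exit's threshold.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical types: a backbone stage maps features to features,
# and an exit head maps features to class probabilities.
Stage = Callable[[List[float]], List[float]]
ExitHead = Callable[[List[float]], List[float]]


def early_exit_predict(
    x: List[float],
    stages: Sequence[Stage],
    exit_heads: Sequence[ExitHead],
    thresholds: Sequence[float],
) -> Tuple[int, int]:
    """Return (predicted_class, exit_index).

    After each backbone stage, the attached exit head produces class
    probabilities. If the top probability reaches that exit's threshold,
    inference stops there. The final exit always answers.
    """
    assert len(stages) == len(exit_heads) == len(thresholds)
    features = x
    for i, (stage, head, thr) in enumerate(zip(stages, exit_heads, thresholds)):
        features = stage(features)
        probs = head(features)
        confidence = max(probs)
        is_last = i == len(stages) - 1
        if confidence >= thr or is_last:
            return probs.index(confidence), i
    raise ValueError("at least one stage is required")


# Toy stand-ins for a real network (assumptions for illustration only).
def double(v: List[float]) -> List[float]:
    return [2.0 * a for a in v]


def softmax(v: List[float]) -> List[float]:
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


stages = [double, double, double]
heads = [softmax, softmax, softmax]
sample = [0.2, 0.7]

# Lower thresholds exit earlier; higher thresholds run more of the backbone.
low = early_exit_predict(sample, stages, heads, [0.7, 0.7, 0.7])
high = early_exit_predict(sample, stages, heads, [0.99, 0.99, 0.99])
```

Changing only the `thresholds` argument moves the exit point, which is what allows the trade-off to be adjusted at runtime without touching the model weights.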

Problem

Research questions and friction points this paper is trying to address.

Dynamic model adaptation for varying sample complexity
Runtime performance-complexity trade-off adjustment without redeployment
Generalizable compression and scaling via exit networks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic exiting via small intermediate exit networks
Decouples design for easy adaptation to new models
Real-time performance-complexity trade-off adjustment
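One way to picture the runtime trade-off adjustment is a table of exit configurations prepared ahead of time, from which the system picks the best entry that fits the current compute budget. The sketch below assumes such a table exists; the `ExitConfig` fields, the `pick_config` helper, and all numbers are made-up placeholders, not values or APIs from the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ExitConfig:
    """One hypothetical operating point of a multi-exit model."""
    thresholds: Tuple[float, ...]  # one confidence threshold per exit
    rel_cost: float                # average compute relative to the full model
    est_accuracy: float            # accuracy estimated offline for this setting


def pick_config(configs: Sequence[ExitConfig], budget: float) -> ExitConfig:
    """Pick the most accurate configuration whose cost fits the budget.

    If nothing fits, fall back to the cheapest configuration.
    """
    feasible = [c for c in configs if c.rel_cost <= budget]
    if not feasible:
        return min(configs, key=lambda c: c.rel_cost)
    return max(feasible, key=lambda c: c.est_accuracy)


# Placeholder table (illustrative numbers only, not results from the paper).
# A threshold above 1.0 can never be reached, so that exit is never taken.
CONFIGS = [
    ExitConfig(thresholds=(0.5, 0.5, 0.0), rel_cost=0.40, est_accuracy=0.70),
    ExitConfig(thresholds=(0.8, 0.8, 0.0), rel_cost=0.65, est_accuracy=0.75),
    ExitConfig(thresholds=(1.1, 1.1, 0.0), rel_cost=1.00, est_accuracy=0.78),
]

# When the budget changes, only the active thresholds change.
relaxed = pick_config(CONFIGS, budget=1.0)
constrained = pick_config(CONFIGS, budget=0.7)
```

Because switching amounts to a table lookup followed by a threshold swap, no retraining or redeployment is involved in this sketch.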

Authors

Qingyuan Wang
School of Electrical & Electronic Engineering, University College Dublin, Belfield, Dublin 4, Ireland
B. Cardiff
School of Electrical & Electronic Engineering, University College Dublin, Belfield, Dublin 4, Ireland
Antoine Frappé
Univ. Lille, CNRS, Centrale Lille, Junia, Univ. Polytechnique Hauts-de-France, UMR 8520-IEMN, France
Benoît Larras
Univ. Lille, CNRS, Centrale Lille, Junia, Univ. Polytechnique Hauts-de-France, UMR 8520-IEMN, France
Deepu John
University College Dublin
Edge Computing, IoT, Wearable Sensing, Biomedical Circuits and Systems