🤖 AI Summary
Edge-device neural network training is severely constrained by limited memory, hindering efficient adaptation to downstream tasks. To address this, we propose MeDyate—a framework for memory-efficient transfer learning under stringent memory budgets. First, we introduce Layer Ranking (LaRa) to quantify per-layer parameter importance. Second, we design a dynamic channel resampling strategy guided by the temporal stability of channel importance distributions across epochs, enabling adaptive subnetwork updates. Third, we integrate subnetwork freezing and activation to minimize the runtime memory footprint. MeDyate achieves fine-tuning with only hundreds of kB of RAM. Evaluated across diverse tasks and architectures, it significantly outperforms existing static and dynamic methods, attaining state-of-the-art accuracy while maintaining ultra-low memory overhead (< 0.5 MB). MeDyate establishes a practical, memory-aware paradigm for on-device transfer learning.
📝 Abstract
On-device neural network training faces critical memory constraints that limit the adaptation of pre-trained models to downstream tasks. We present MeDyate, a theoretically grounded framework for memory-constrained dynamic subnetwork adaptation. Our approach introduces two key innovations: LaRa (Layer Ranking), an improved layer importance metric that enables principled layer pre-selection, and a dynamic channel sampling strategy that exploits the temporal stability of channel importance distributions during fine-tuning. MeDyate dynamically resamples channels between epochs according to importance-weighted probabilities, ensuring comprehensive parameter space exploration while respecting strict memory budgets. Extensive evaluation across a broad range of tasks and architectures demonstrates that MeDyate achieves state-of-the-art performance under extreme memory constraints, consistently outperforming existing static and dynamic approaches while maintaining high computational efficiency. Our method represents a significant step towards enabling efficient on-device learning by demonstrating effective fine-tuning with memory budgets as low as a few hundred kB of RAM.
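The core mechanism described above — resampling a small set of trainable channels between epochs with probabilities proportional to their importance — can be sketched as follows. This is an illustrative sketch only: the choice of the per-channel L2 weight norm as the importance score, the function names, and the fixed channel budget are assumptions for demonstration, not the paper's exact formulation.

```python
import numpy as np

def channel_importance(weight):
    # Illustrative importance score per output channel: the L2 norm of
    # that channel's weights (the paper's exact metric is not given here).
    return np.linalg.norm(weight.reshape(weight.shape[0], -1), axis=1)

def resample_channels(weight, budget, rng):
    # Importance-weighted sampling without replacement: channels with
    # larger scores are more likely to be selected for update this epoch,
    # while every channel retains a nonzero chance of being explored.
    scores = channel_importance(weight)
    probs = scores / scores.sum()
    return rng.choice(weight.shape[0], size=budget, replace=False, p=probs)

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32, 3, 3))  # conv layer: 64 output channels

for epoch in range(3):
    active = resample_channels(w, budget=8, rng=rng)
    # ... update only w[active] this epoch; all other channels stay
    # frozen, keeping the optimizer state and gradient memory tiny.
```

Because only `budget` channels carry gradients and optimizer state at any time, the runtime memory footprint stays bounded regardless of the layer's full width, which is the property the strict kB-level budgets rely on.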