One Period to Rule Them All: Identifying Critical Learning Periods in Deep Networks

📅 2025-06-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Prior work has established that critical learning periods exist in deep learning, but it does not say precisely when they occur. This paper introduces a systematic framework that combines generalization prediction with an analysis of training dynamics to identify the critical learning period, i.e., the time interval most decisive for final model generalization, and then deactivates high-overhead regularization techniques (e.g., data augmentation) once that period has passed. Integrated with data pruning and resource-aware scheduling, the approach adapts computational cost over the course of training. Evaluated on standard architectures, it achieves up to a 59.67% reduction in training time, 59.47% lower CO₂ emissions, and 60% cost savings, without compromising accuracy. The core contributions are: (1) a computationally grounded methodology for pinpointing critical learning periods, which the authors present as the first of its kind, and (2) a generalization-oriented paradigm for dynamic, time-aware adaptation of regularization strategies.
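To put the headline figure in perspective, a percentage reduction in training time can be converted into a speedup factor. The short sketch below assumes the 59.67% figure is measured relative to the baseline wall-clock time; the function name `speedup_from_reduction` is illustrative and does not come from the paper.

```python
def speedup_from_reduction(reduction_pct: float) -> float:
    """Convert a percentage reduction in time into a speedup factor.

    Assumes the reduction is relative to the baseline time, so the
    remaining time is (1 - reduction_pct / 100) of the baseline.
    """
    if not 0.0 <= reduction_pct < 100.0:
        raise ValueError("reduction_pct must be in [0, 100)")
    return 1.0 / (1.0 - reduction_pct / 100.0)


# Reported upper bound on training-time reduction.
speedup = speedup_from_reduction(59.67)
print(f"{speedup:.2f}x")  # prints "2.48x"
```

Under that assumption, the best reported case trains in roughly two-fifths of the baseline time.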

📝 Abstract

Critical learning periods are an important phenomenon in deep learning, in which early epochs play a decisive role in the success of many training recipes, such as data augmentation. Existing works confirm the existence of this phenomenon and provide useful insights. However, the literature lacks efforts to precisely identify when critical periods occur. In this work, we fill this gap by introducing a systematic approach for identifying critical periods during the training of deep neural networks, with a focus on eliminating computationally intensive regularization techniques and effectively applying cost-reduction mechanisms such as data pruning. Our method leverages generalization prediction mechanisms to pinpoint the critical phases in which training recipes yield the greatest benefit to the predictive ability of models. By halting resource-intensive recipes beyond these periods, we significantly accelerate training and reduce training time, energy consumption, and CO$_2$ emissions. Experiments on standard architectures and benchmarks confirm the effectiveness of our method. Specifically, we reduce the training time of popular architectures by up to 59.67%, leading to a 59.47% decrease in CO$_2$ emissions and a 60% reduction in financial costs, without compromising performance. Our work enhances the understanding of training dynamics and paves the way for more sustainable and efficient deep learning practices, particularly in resource-constrained environments. In an era defined by the race for foundation models, we believe our method offers a valuable framework. The repository is available at https://github.com/baunilhamarga/critical-periods
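The idea of halting a costly recipe once the critical period ends can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the generalization predictor, so the code assumes a hypothetical per-epoch proxy score (higher is better) and a simple plateau rule. The names `find_critical_period_end`, `augmentation_schedule`, `min_gain`, and `patience` are all illustrative.

```python
from typing import List, Sequence


def find_critical_period_end(
    scores: Sequence[float], min_gain: float = 0.01, patience: int = 2
) -> int:
    """Return the first epoch of a plateau in a generalization proxy.

    Assumption (not from the paper): the critical period ends at the first
    epoch of a run of `patience` consecutive epochs whose improvement over
    the previous epoch is below `min_gain`. Returns len(scores) if no such
    run exists.
    """
    stalled = 0
    for epoch in range(1, len(scores)):
        gain = scores[epoch] - scores[epoch - 1]
        stalled = stalled + 1 if gain < min_gain else 0
        if stalled >= patience:
            return epoch - patience + 1
    return len(scores)


def augmentation_schedule(num_epochs: int, critical_end: int) -> List[bool]:
    """Per-epoch flags: True while the costly recipe should stay enabled."""
    return [epoch < critical_end for epoch in range(num_epochs)]


# Hypothetical proxy scores, one per epoch (illustrative values only).
proxy_scores = [0.30, 0.45, 0.56, 0.62, 0.625, 0.628, 0.63, 0.631]
critical_end = find_critical_period_end(proxy_scores)
schedule = augmentation_schedule(len(proxy_scores), critical_end)
print(critical_end)  # prints 4
print(schedule)
```

In a real training loop, the boolean flag for each epoch would select between an augmented and a plain data pipeline. The released repository linked above is the place to look for the actual identification criterion.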

Problem

Research questions and friction points this paper is trying to address.

Identify critical learning periods in deep networks

Reduce computational costs and training time

Minimize CO₂ emissions and energy consumption

Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic approach to identify critical learning periods

Uses generalization prediction to locate the phases where training recipes benefit the model most

Reduces training time by up to 59.67% and CO₂ emissions by up to 59.47% by halting costly recipes after the critical period

🔎 Similar Papers

BOWL: A Deceptively Simple Open World Learner