One Period to Rule Them All: Identifying Critical Learning Periods in Deep Networks

📅 2025-06-19
🤖 AI Summary
Prior work has established that critical learning periods exist in deep learning, but it has not precisely located when they occur. This paper introduces a systematic approach that uses generalization prediction to identify the critical learning period, i.e., the training interval in which recipes such as data augmentation contribute most to final generalization, and then stops applying these computationally intensive regularization techniques. The approach is also used to apply cost-reduction mechanisms such as data pruning more effectively. On standard architectures and benchmarks, it reduces training time by up to 59.67%, CO₂ emissions by 59.47%, and financial cost by 60%, without compromising performance. The core contributions are: (1) a systematic method for pinpointing critical learning periods, and (2) a way to halt resource-intensive training recipes once those periods have passed.

📝 Abstract
Critical learning periods are an important phenomenon in deep learning, in which early epochs play a decisive role in the success of many training recipes, such as data augmentation. Existing works confirm the existence of this phenomenon and provide useful insights. However, the literature lacks efforts to precisely identify when critical periods occur. In this work, we fill this gap by introducing a systematic approach for identifying critical periods during the training of deep neural networks, focusing on eliminating computationally intensive regularization techniques and effectively applying mechanisms for reducing computational costs, such as data pruning. Our method leverages generalization prediction mechanisms to pinpoint critical phases where training recipes yield maximum benefits to the predictive ability of models. By halting resource-intensive recipes beyond these periods, we significantly accelerate the learning phase and achieve reductions in training time, energy consumption, and CO₂ emissions. Experiments on standard architectures and benchmarks confirm the effectiveness of our method. Specifically, we reduce the training time of popular architectures by up to 59.67%, leading to a 59.47% decrease in CO₂ emissions and a 60% reduction in financial costs, without compromising performance. Our work enhances the understanding of training dynamics and paves the way for more sustainable and efficient deep learning practices, particularly in resource-constrained environments. In the era of the race for foundation models, we believe our method is a valuable framework. The repository is available at https://github.com/baunilhamarga/critical-periods
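The idea of halting an expensive recipe once the critical period has passed can be illustrated with a minimal Python sketch. The abstract does not specify the generalization prediction mechanism, so the sketch substitutes a simple plateau rule on a generic per-epoch proxy score; that rule, and the names `find_critical_epoch`, `train`, `proxy_score`, `run_epoch`, `patience`, and `min_delta`, are assumptions made for illustration and are not taken from the paper or its repository.

```python
from typing import Callable, List, Optional


def find_critical_epoch(
    scores: List[float], patience: int = 3, min_delta: float = 1e-3
) -> Optional[int]:
    """Return the epoch at which the proxy score has failed to improve by
    more than `min_delta` for `patience` consecutive epochs, or None.

    Assumption: a plateau in the proxy stands in for the end of the
    critical period. The paper's actual criterion may differ.
    """
    best = float("-inf")
    stale = 0
    for epoch, score in enumerate(scores):
        if score > best + min_delta:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


def train(
    num_epochs: int,
    proxy_score: Callable[[int], float],
    run_epoch: Callable[[int, bool], None],
    patience: int = 3,
    min_delta: float = 1e-3,
) -> List[bool]:
    """Run `num_epochs` epochs and return, per epoch, whether the expensive
    recipe (e.g. data augmentation) was active during that epoch."""
    augment = True
    history: List[float] = []
    used_augmentation: List[bool] = []
    for epoch in range(num_epochs):
        run_epoch(epoch, augment)  # hypothetical training step
        used_augmentation.append(augment)
        if augment:
            history.append(proxy_score(epoch))
            # Once the plateau rule fires, stop paying for the recipe.
            if find_critical_epoch(history, patience, min_delta) is not None:
                augment = False
    return used_augmentation
```

In this sketch the proxy is evaluated only while the recipe is still active, so the recipe stays off for the rest of training once it has been disabled.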
Problem

Research questions and friction points this paper is trying to address.

Identify critical learning periods in deep networks
Reduce computational costs and training time
Minimize CO₂ emissions and energy consumption
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic approach to identify critical learning periods
Leverages generalization prediction for maximum training benefits
Reduces training time, energy consumption, and CO₂ emissions significantly
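To put the reported figure in perspective, a reduction of up to 59.67% in training time corresponds to a speedup of roughly 2.48x in the best case. The short Python sketch below works through that arithmetic; the 100-hour baseline is a hypothetical value chosen for illustration and is not a measurement from the paper.

```python
def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the original cost left after a percentage reduction."""
    return 1.0 - reduction_pct / 100.0


def speedup(reduction_pct: float) -> float:
    """Speedup factor implied by a percentage reduction in time."""
    return 1.0 / remaining_fraction(reduction_pct)


baseline_hours = 100.0  # hypothetical baseline, not from the paper
reduced_hours = baseline_hours * remaining_fraction(59.67)  # about 40.33 hours
best_case_speedup = speedup(59.67)  # about 2.48x
```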
Vinicius Yuiti Fukase
Escola Politécnica, Universidade de São Paulo, Brazil
Heitor Gama
Escola Politécnica, Universidade de São Paulo, Brazil
Barbara Bueno
Escola Politécnica, Universidade de São Paulo, Brazil
Lucas Libanio
Escola Politécnica, Universidade de São Paulo, Brazil
Anna Helena Reali Costa
Full Professor of Computer Engineering, Universidade de São Paulo
Artificial Intelligence, Machine Learning, Reinforcement Learning, Intelligent Robotics
Artur Jordao
Universidade de São Paulo (USP)
Machine Learning, Partial Least Squares, Pattern Recognition