🤖 AI Summary
This paper identifies a “multiple descent” phenomenon—characterized by recurrent oscillations in test loss—during LSTM training, revealing its origin in periodic phase transitions between ordered and chaotic dynamical regimes. Method: Leveraging asymptotic stability analysis, Lyapunov spectrum estimation, and training trajectory modeling, the authors rigorously characterize these transitions and establish their causal link to multiple descent. Contribution/Results: The study is the first to attribute multiple descent to a sequence of order–chaos phase transitions and demonstrates that optimal generalization occurs precisely at the first order-to-chaos critical transition—the onset of the widest “edge of chaos.” Based on this insight, the authors propose a novel early-stopping criterion grounded in dynamical phase-transition detection. Empirical validation across multiple time-series benchmarks confirms strict synchrony between multiple descent and phase transitions; locating the first transition yields a 12.7% improvement in generalization performance over conventional early-stopping strategies.
📝 Abstract
We observe a novel 'multiple-descent' phenomenon during the training process of LSTM, in which the test loss goes through long cycles of up and down trend multiple times after the model is overtrained. By carrying out asymptotic stability analysis of the models, we found that the cycles in test loss are closely associated with the phase transition process between order and chaos, and the local optimal epochs are consistently at the critical transition point between the two phases. More importantly, the global optimal epoch occurs at the first transition from order to chaos, where the 'width' of the 'edge of chaos' is the widest, allowing the best exploration of better weight configurations for learning.