Probabilistic learning rate scheduler with provable convergence

📅 2024-07-10

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Conventional learning rate schedulers accelerate convergence empirically but lack theoretical convergence guarantees, because their rates rise and fall during training while classical analysis assumes constant or monotonically decreasing rates. Method: We propose the Probabilistic Learning Rate Scheduler (PLRS), the first framework to establish rigorous global convergence for non-monotonic scheduling. PLRS models step-size selection as a stochastic process and employs Lyapunov stability analysis to design a probabilistic sampling mechanism, which removes the need for monotonic decay assumptions. Contribution/Results: PLRS lifts the restriction of convergence analysis to constant or monotonically decreasing learning rates, narrowing the gap between theory and practice. Experiments across diverse datasets and neural architectures show that PLRS achieves training speed and accuracy comparable to state-of-the-art schedulers while retaining provable convergence guarantees.

📝 Abstract

Learning rate schedulers have shown great success in speeding up the convergence of learning algorithms in practice. However, their convergence to a minimum has not been proven theoretically. This difficulty mainly arises from the fact that, while traditional convergence analysis prescribes monotonically decreasing (or constant) learning rates, schedulers opt for rates that often increase and decrease over the training epochs. In this work, we aim to bridge the gap by proposing a probabilistic learning rate scheduler (PLRS) that does not conform to the monotonically decreasing condition yet comes with provable convergence guarantees. In addition to providing detailed convergence proofs, we also show experimental results in which the proposed PLRS performs competitively with other state-of-the-art learning rate schedulers across a variety of datasets and architectures.
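The abstract does not say how PLRS draws its learning rates, so the following is only an illustrative sketch of the general idea of a non-monotonic, randomly sampled step size, not the authors' algorithm. It assumes uniform sampling from a fixed interval and a toy objective f(x) = x²; the function names and the bounds `lr_min` and `lr_max` are hypothetical.

```python
import random


def sample_learning_rate(lr_min, lr_max, rng):
    """Draw one step size uniformly from [lr_min, lr_max].

    Uniform sampling is an assumption made for illustration only.
    """
    return rng.uniform(lr_min, lr_max)


def minimize_quadratic(x0, steps, lr_min, lr_max, seed=0):
    """Gradient descent on f(x) = x**2 with a freshly sampled rate per step."""
    rng = random.Random(seed)
    x = x0
    rates = []
    for _ in range(steps):
        lr = sample_learning_rate(lr_min, lr_max, rng)
        grad = 2.0 * x          # derivative of x**2
        x -= lr * grad          # equivalent to x *= (1 - 2 * lr)
        rates.append(lr)
    return x, rates


x_final, rates = minimize_quadratic(x0=5.0, steps=200, lr_min=0.01, lr_max=0.2)
print(f"final |x| = {abs(x_final):.3e}")
```

In this toy setting every sampled rate keeps the factor (1 - 2·lr) strictly between 0 and 1, so the iterate shrinks at every step even though the sequence of rates goes up and down. The convergence argument for general non-convex objectives is the subject of the paper's proofs and is not reproduced here.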

Problem

Research questions and friction points this paper is trying to address.

Bridging theoretical convergence guarantees with practical learning rate schedulers

Proposing a probabilistic learning rate scheduler (PLRS) with non-monotonic rates

Demonstrating that PLRS performs competitively with state-of-the-art schedulers across a variety of datasets and architectures

Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic learning rate scheduler (PLRS)

Non-monotonic learning rate adjustment

Provable convergence guarantees
