🤖 AI Summary
This work addresses the problem of dynamically adjusting the learning rate during training to balance performance gains against effort, instability, and resource consumption. Formulating this as an optimal control problem, the authors derive a closed-loop learning-rate scheduling policy that depends only on current and expected future performance, maximizing cumulative performance net of learning cost. The study presents a unified framework spanning self-regulated learning, effort allocation, and episodic memory, showing how cognitive biases such as over- and underconfidence shape learning persistence, and proposes a biologically plausible, memory-based mechanism that approximates the optimal policy. Theoretical analysis and numerical simulations indicate that the strategy generalizes across tasks and architectures, reproduces numerically optimized schedules, and attains near-optimal learning behavior.
📝 Abstract
Learning how to learn efficiently is a fundamental challenge for biological agents and a growing concern for artificial ones. To learn effectively, an agent must regulate its learning speed, balancing the benefits of rapid improvement against the costs of effort, instability, or resource use. We introduce a normative framework that formalizes this problem as an optimal control process in which the agent maximizes cumulative performance while incurring a cost of learning. From this objective, we derive a closed-form solution for the optimal learning rate, which has the form of a closed-loop controller that depends only on the agent's current and expected future performance. Under mild assumptions, this solution generalizes across tasks and architectures and reproduces numerically optimized schedules in simulations. In simple learning models, we can mathematically analyze how agent and task parameters shape learning-rate scheduling as an open-loop control solution. Because the optimal policy depends on expectations of future performance, the framework predicts how overconfidence or underconfidence influences engagement and persistence, linking the control of learning speed to theories of self-regulated learning. We further show how a simple episodic memory mechanism can approximate the required performance expectations by recalling similar past learning experiences, providing a biologically plausible route to near-optimal behaviour. Together, these results provide a normative and biologically plausible account of learning speed control, linking self-regulated learning, effort allocation, and episodic memory estimation within a unified and tractable mathematical framework.
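The core formulation — treating learning-rate scheduling as an optimal control problem that trades cumulative performance against a cost of learning — can be illustrated with a toy numerical sketch. This is our own minimal example, not the paper's model or its closed-form solution: we assume saturating performance dynamics `P_{t+1} = P_t + a_t (1 - P_t)`, a quadratic effort cost, and recover an open-loop schedule by coordinate-wise grid search.

```python
# Toy illustration (assumed model, not the paper's): learning-rate
# scheduling as a finite-horizon optimal control problem.
# Assumed dynamics: P_{t+1} = P_t + a_t * (1 - P_t)
# Assumed per-step objective: P_t - c * a_t**2 (performance minus effort cost)

T = 20                              # horizon (steps)
C = 0.5                             # weight of the learning-effort cost
GRID = [i / 50 for i in range(51)]  # candidate learning rates in [0, 1]

def rollout(schedule, p0=0.0, c=C):
    """Cumulative performance minus learning cost along a schedule."""
    p, total = p0, 0.0
    for a in schedule:
        total += p - c * a * a      # collect reward, pay effort cost
        p += a * (1.0 - p)          # performance saturates toward 1
    return total

def optimize(horizon=T, sweeps=30):
    """Coordinate-wise grid search over the open-loop schedule."""
    sched = [0.1] * horizon
    for _ in range(sweeps):
        for t in range(horizon):
            sched[t] = max(
                GRID,
                key=lambda a: rollout(sched[:t] + [a] + sched[t + 1:]),
            )
    return sched

sched = optimize()
baseline = [0.1] * T
print("optimized objective:", round(rollout(sched), 3))
print("constant-rate baseline:", round(rollout(baseline), 3))
```

In this toy setting the optimized schedule front-loads effort: early improvements raise performance for all remaining steps, while late steps taper toward zero because there is little horizon left to amortize the effort cost. This mirrors, in a crude open-loop form, the paper's point that the optimal rate depends on how much future performance a learning investment can still buy.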