🤖 AI Summary
Fractional Gradient Descent (FGD) suffers from unstable convergence in non-convex optimization and from the difficulty of adaptively scheduling its hyperparameters, such as the fractional order and step size; in particular, it lacks a dynamic mechanism for history-dependent hyperparameter adjustment. This work introduces meta-learning to Caputo-type Fractional Gradient Descent (CFGD) for the first time, proposing a differentiable meta-controller that dynamically tunes CFGD's key hyperparameters during training. By integrating fractional calculus with meta-learning, the method enables history-aware, adaptive hyperparameter tuning. Experiments on multiple benchmark tasks demonstrate that the proposed approach significantly outperforms hand-tuned CFGD in both convergence speed and robustness and, in some scenarios, matches the performance of a fully black-box meta-learned optimizer. To our knowledge, this is the first differentiable, trainable, and practical meta-learning framework tailored to fractional-order optimization.
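To make the CFGD update concrete, the sketch below uses a widely used first-order approximation of the Caputo fractional derivative: for order α ∈ (0, 1), the update scales the ordinary gradient by |x − c|^(1−α) / Γ(2−α), where c is a reference point (here taken as the previous iterate). This is an illustrative assumption, not necessarily the paper's exact formulation; the function name `cfgd_step` and the toy quadratic objective are hypothetical.

```python
import math
import numpy as np

def cfgd_step(x, x_ref, grad, lr=0.1, alpha=0.9):
    """One Caputo-FGD step via a common first-order approximation.

    For alpha in (0, 1), the Caputo fractional derivative of f at x with
    lower terminal c is approximated elementwise by
        f'(x) * |x - c|^(1 - alpha) / Gamma(2 - alpha),
    which recovers plain gradient descent as alpha -> 1.
    Here x_ref plays the role of the terminal c (e.g. the previous iterate).
    """
    scale = np.abs(x - x_ref) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
    return x - lr * scale * grad

# Minimize f(x) = 0.5 * ||x||^2 (so grad f(x) = x) from x0 = [2, -3].
x_prev = np.array([2.5, -3.5])
x = np.array([2.0, -3.0])
for _ in range(200):
    x_new = cfgd_step(x, x_prev, grad=x, lr=0.1, alpha=0.9)
    x_prev, x = x, x_new
print(np.linalg.norm(x))  # the norm shrinks toward 0
```

Note how the history term |x − x_ref|^(1−α) automatically damps the step as iterates cluster, which is one intuition for why scheduling α and the step size jointly matters.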
📝 Abstract
Fractional Gradient Descent (FGD) offers a novel and promising way to accelerate optimization by incorporating fractional calculus into machine learning. Although FGD has shown encouraging initial results across various optimization tasks, it faces significant challenges with convergence behavior and hyperparameter selection. Moreover, the impact of its hyperparameters is not fully understood, and scheduling them is particularly difficult in non-convex settings such as neural network training. To address these issues, we propose a novel approach called Learning to Optimize Caputo Fractional Gradient Descent (L2O-CFGD), which meta-learns how to dynamically tune the hyperparameters of Caputo FGD (CFGD). Our method's meta-learned schedule outperforms CFGD with static hyperparameters found through an extensive search and, in some tasks, achieves performance comparable to a fully black-box meta-learned optimizer. L2O-CFGD can thus serve as a powerful tool for researchers to identify high-performing hyperparameters and gain insight into how to leverage the history-dependence of the fractional derivative in optimization.
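The "learning to optimize" idea described above can be sketched as follows: a small differentiable controller emits per-step hyperparameters (α_t, η_t), and its weights are trained by unrolling the inner CFGD trajectory and backpropagating the final inner loss through every step. Everything here is a minimal illustration under assumed choices, not the paper's architecture: the MLP controller, its input features, the toy quadratic objective, and the first-order Caputo approximation are all hypothetical.

```python
import torch

torch.manual_seed(0)

# Hypothetical meta-controller: maps simple step statistics to CFGD
# hyperparameters (alpha_t, lr_t). Trained by unrolled differentiation.
controller = torch.nn.Sequential(
    torch.nn.Linear(2, 16), torch.nn.Tanh(), torch.nn.Linear(16, 2)
)
meta_opt = torch.optim.Adam(controller.parameters(), lr=1e-2)

def unrolled_loss(n_steps=20):
    """Run n_steps of controller-driven CFGD on f(x) = 0.5 * ||x||^2
    and return the final inner loss (differentiable w.r.t. controller)."""
    x_prev = torch.tensor([2.5, -3.5])
    x = torch.tensor([2.0, -3.0])
    for _ in range(n_steps):
        grad = x  # analytic gradient of the toy quadratic
        feats = torch.stack([x.norm(), (x - x_prev).norm()]).detach()
        out = controller(feats)
        alpha = torch.sigmoid(out[0])     # keep fractional order in (0, 1)
        lr = 0.2 * torch.sigmoid(out[1])  # bounded step size
        # First-order Caputo-style scaling (eps avoids pow(0) gradients).
        scale = ((x - x_prev).abs() + 1e-8).pow(1 - alpha) \
                / torch.exp(torch.lgamma(2 - alpha))
        x_prev, x = x, x - lr * scale * grad
    return 0.5 * (x ** 2).sum()

init_loss = unrolled_loss().item()  # before meta-training
for _ in range(100):                # outer (meta) loop
    meta_opt.zero_grad()
    loss = unrolled_loss()
    loss.backward()                 # backprop through the whole unroll
    meta_opt.step()
final_loss = unrolled_loss().item()
print(init_loss, final_loss)
```

The key design point this sketch illustrates: because the hyperparameters enter the update differentiably, the schedule can be trained with ordinary gradient descent on the unrolled trajectory, rather than treated as a black-box search problem.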