🤖 AI Summary
This work addresses slow convergence and susceptibility to entrapment at stationary points in both convex and smooth nonconvex optimization. We propose HOME-3, a higher-order momentum estimator built from the cubic power of first-order gradients. To our knowledge, this is the first systematic incorporation of third-power gradient terms into momentum construction, coupled with a gradient-weighted update mechanism that enhances directional discrimination. Theoretically, we use a Lyapunov function analysis to establish tightened convergence bounds and extend the framework to nonsmooth nonconvex settings. Empirically, HOME-3 consistently outperforms mainstream optimizers, including Adam and SGD with momentum, across convex, smooth nonconvex, and nonsmooth nonconvex tasks (e.g., deep neural network training), achieving up to 2.1× faster convergence, more stable generalization, and better saddle-point escape.
📝 Abstract
Momentum-based gradient methods are essential for optimizing advanced machine learning models: they accelerate convergence and help optimizers escape stationary points. While most state-of-the-art momentum techniques rely on low-order gradient powers, such as the squared first-order gradient, gradients raised to powers greater than two remain largely unexplored. In this work, we introduce the concept of high-order momentum, in which momentum is constructed from higher-power gradients, taking the third power of the first-order gradient as a representative case. We provide both theoretical and empirical support for this approach. Theoretically, we show that incorporating third-power gradients tightens the convergence bounds of gradient-based optimizers for both convex and smooth nonconvex problems. Empirically, we validate these findings through extensive experiments on convex, smooth nonconvex, and nonsmooth nonconvex optimization tasks. In all cases, high-order momentum consistently outperforms conventional low-order momentum methods.
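To make the idea of third-power momentum concrete, the sketch below shows one plausible Adam-style update in which the first-moment accumulator tracks the cubed gradient rather than the gradient itself. This is an illustrative assumption, not the paper's exact HOME-3 algorithm: the update rule, cube-root rescaling, and hyperparameter names (`beta1`, `beta2`, `eps`) are hypothetical choices made for the example.

```python
import numpy as np

def third_power_momentum_step(theta, grad, m, v,
                              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One illustrative step of a third-power-momentum update.

    NOTE: a hypothetical sketch of the abstract's idea (momentum built
    from the cubed gradient), NOT the paper's exact HOME-3 algorithm.
    """
    # Third-power momentum: cubing preserves the gradient's sign while
    # amplifying large components relative to small ones.
    m = beta1 * m + (1 - beta1) * grad ** 3
    # Conventional second-moment term, as in Adam.
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Cube root maps the accumulator back to the gradient's scale and sign.
    m_hat = np.cbrt(m)
    theta = theta - lr * m_hat / (np.sqrt(v) + eps)
    return theta, m, v
```

On a simple quadratic, iterating this step drives the parameter toward the minimizer much like Adam, since the cube-then-cube-root pair keeps the step direction aligned with the averaged gradient sign.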