🤖 AI Summary
This work studies the first-order oracle complexity of finding approximate stochastic stationary points in unconstrained stochastic optimization under heavy-tailed noise and weak average smoothness, conditions strictly weaker than the standard bounded-variance and mean-squared smoothness assumptions. The authors propose parameter-free normalized stochastic first-order methods that combine Polyak momentum, multi-extrapolated momentum, and recursive momentum, with fully adaptive step sizes and momentum coefficients that require no prior knowledge of Lipschitz constants or noise bounds. To the authors' knowledge, this is the first analysis establishing provable first-order oracle complexity upper bounds for momentum-based normalized methods under such weak regularity conditions; the guarantees match or improve upon the best-known bounds. Numerical experiments confirm the methods' robustness and efficiency under heavy-tailed stochastic noise.
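To make the update structure concrete, below is a minimal Python sketch of a normalized stochastic step combined with recursive (STORM-style) momentum. It is a generic illustration under stated assumptions, not the paper's parameter-free method: the step-size schedule, the momentum schedule, and the user-supplied `stoch_grad`/`sample` oracles are placeholders introduced only for this sketch.

```python
import numpy as np

def normalized_recursive_momentum(stoch_grad, sample, x0, n_iters,
                                  gamma0=0.1, beta0=0.5):
    """Generic normalized step with recursive (STORM-style) momentum.

    `stoch_grad(x, xi)` returns a stochastic gradient at `x` for sample `xi`,
    and `sample()` draws a fresh sample.  The schedules for `eta` and `beta`
    below are simple placeholders, not the adaptive, parameter-free rules
    analyzed in the paper.
    """
    x = np.asarray(x0, dtype=float)
    d = stoch_grad(x, sample())                    # initial direction estimate
    for t in range(1, n_iters + 1):
        eta = gamma0 / np.sqrt(t)                  # placeholder step size
        beta = min(1.0, beta0 / t ** (2.0 / 3.0))  # placeholder momentum weight
        x_next = x - eta * d / (np.linalg.norm(d) + 1e-12)  # normalized step
        xi = sample()                              # one sample, reused at both points
        # Recursive momentum: keep the old direction, corrected by a gradient difference.
        d = stoch_grad(x_next, xi) + (1.0 - beta) * (d - stoch_grad(x, xi))
        x = x_next
    return x
```

Evaluating `stoch_grad` at both `x` and `x_next` with the same sample `xi` is what lets the recursive-momentum correction behave like a variance-reduced update; the normalization by `‖d‖` is what keeps the step length controlled even when the noise is heavy-tailed.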
📝 Abstract
In this paper, we propose practical normalized stochastic first-order methods with Polyak momentum, multi-extrapolated momentum, and recursive momentum for solving unconstrained optimization problems. These methods employ dynamically updated algorithmic parameters and do not require explicit knowledge of problem-dependent quantities such as the Lipschitz constant or noise bound. We establish first-order oracle complexity results for finding approximate stochastic stationary points under heavy-tailed noise and weak average smoothness conditions, both of which are weaker than the commonly used bounded-variance and mean-squared smoothness assumptions. Our complexity bounds either improve upon or match the best-known results in the literature. Numerical experiments are presented to demonstrate the practical effectiveness of the proposed methods.
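For intuition about the heavy-tailed noise model referenced above, the following small sketch builds a stochastic gradient oracle whose noise has finite p-th moments only for p below 2, so its variance is infinite. The quadratic objective, the Student-t noise, and the choice df = 1.5 are assumptions made purely for illustration and are not taken from the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

def heavy_tailed_grad(x, df=1.5):
    """Stochastic gradient of f(x) = 0.5 * ||A @ x - b||**2 corrupted by
    Student-t noise with df < 2: its p-th moment is finite for p < df,
    but its variance is infinite (objective and df are illustrative)."""
    return A.T @ (A @ x - b) + rng.standard_t(df, size=x.shape)

# The empirical second moment of the noise typically keeps growing with the
# sample size, which is what breaks classical bounded-variance analyses.
x = np.zeros(5)
exact = A.T @ (A @ x - b)
for n in (10**3, 10**5):
    noise = np.stack([heavy_tailed_grad(x) - exact for _ in range(n)])
    print(n, np.mean(np.sum(noise**2, axis=1)))
```

An oracle of this kind can be plugged into the normalized-momentum sketch above (with `xi` standing for the random seed of one draw) to reproduce, qualitatively, the heavy-tailed setting the complexity results are stated for.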