Adam-SHANG: A Convergent Adam-Type Method for Stochastic Smooth Convex Optimization

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of convergence guarantees for Adam-type methods in stochastic smooth convex optimization. It proposes Adam-SHANG, a Lyapunov-guided method that couples momentum, adaptive preconditioning, and a curvature-aware correction through a lagged-preconditioner update, which the authors describe as more stable. The authors prove convergence in expectation under an admissible stepsize condition that a conservative spectral bound always satisfies, without requiring global monotonicity of the second-moment sequence. As a less conservative practical rule, they introduce a computable trace-ratio stepsize motivated by a local coordinatewise alignment condition. Experiments validate the predicted stochastic decay and show training performance competitive with Adam and AdamW on deep learning tasks; the same structural update is also tested beyond the convex setting with simplified parameters.

📝 Abstract

We propose Adam-SHANG, a Lyapunov-guided Adam-type method that couples momentum, adaptive preconditioning, and a curvature-aware correction through a more stable lagged-preconditioner update. For stochastic smooth convex optimization, we prove convergence in expectation under an admissible stepsize condition that can always be satisfied by a conservative spectral bound, without imposing global monotonicity on the second-moment sequence. To obtain a less conservative practical rule, we introduce a computable trace-ratio stepsize, motivated by a local coordinatewise alignment condition. The same structural update is also tested beyond the convex setting with simplified parameters. Experiments validate the predicted stochastic decay and show competitive training performance against Adam and AdamW on deep learning tasks.
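
To make the phrase "lagged-preconditioner update" concrete, here is a minimal sketch of one possible reading: an Adam-style step that scales the momentum by the second-moment estimate from the previous iteration and only then refreshes that estimate. This is not the Adam-SHANG update, which the abstract does not spell out. The curvature-aware correction and the trace-ratio stepsize are omitted, and the function names, hyperparameter values, and the noisy quadratic test problem are all assumptions made for illustration.

```python
import numpy as np


def lagged_adam_step(x, m, v_prev, grad, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style step preconditioned by the previous second-moment estimate.

    Hypothetical illustration only; not the update rule from the paper.
    """
    m = beta1 * m + (1.0 - beta1) * grad            # momentum (first-moment) update
    x = x - lr * m / (np.sqrt(v_prev) + eps)        # step uses the lagged estimate v_prev
    v = beta2 * v_prev + (1.0 - beta2) * grad ** 2  # second moment refreshed afterwards
    return x, m, v


def run_demo(steps=500, seed=0):
    """Minimize f(x) = 0.5 * sum(scales * x**2) from noisy gradients (assumed toy problem)."""
    rng = np.random.default_rng(seed)
    scales = np.array([1.0, 10.0])
    x = np.array([1.0, 1.0])
    m = np.zeros_like(x)
    v = np.ones_like(x)  # start at one so the first lagged step is well scaled
    for _ in range(steps):
        grad = scales * x + 0.01 * rng.standard_normal(x.shape)
        x, m, v = lagged_adam_step(x, m, v, grad)
    return x, 0.5 * np.sum(scales * x ** 2)


if __name__ == "__main__":
    x_final, loss_final = run_demo()
    print("final iterate:", x_final)
    print("final loss:", loss_final)
```

Because the step reads `v_prev` before the current gradient enters it, the preconditioner applied at each step does not depend on the current stochastic gradient. Whether this matches the role the lag plays in the paper's analysis is not something the abstract confirms.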

Problem

Research questions and friction points this paper is trying to address.

stochastic optimization

convex optimization

Adam-type methods

convergence

adaptive preconditioning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lyapunov-guided optimization

adaptive preconditioning

lagged-preconditioner update