Constrained Stochastic Spectral Preconditioning Converges for Nonconvex Objectives

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies convergence in constrained nonconvex optimization under heavy-tailed noise. It proposes a family of proximal preconditioned stochastic gradient algorithms that extend the Muon and Scion optimizers to handle a wide range of convex and nonconvex constraints. The main contributions are a proximal mechanism for constraint handling in spectral gradient methods, a convergence analysis tailored to the geometry of these methods, and a variance-reduced variant. The proposed methods are shown to converge under heavy-tailed noise, and the variance-reduced variant achieves faster convergence under standard noise assumptions. The analysis also models the polynomial iterations used in Muon as a nonlinear preconditioner rather than as the ideal matrix sign, so it reflects practical implementations more faithfully.
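To make the idea of constraint handling through a proximal step concrete, here is a minimal generic sketch, not the paper's algorithm. It assumes the simplest case, where the proximal operator of a set's indicator function reduces to a Euclidean projection, and shows one convex set (a box) and one nonconvex set (k-sparse vectors). The function names `project_box`, `project_top_k`, and `proximal_step` are hypothetical, and the paper's actual update is preconditioned and may use a different geometry.

```python
def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n, a convex set."""
    return [min(max(v, lo), hi) for v in x]


def project_top_k(x, k):
    """Keep the k largest-magnitude entries and zero the rest.

    This is a projection onto the set of k-sparse vectors, which is nonconvex.
    """
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


def proximal_step(x, direction, step, project):
    """Generic illustrative update: move along -direction, then project.

    `direction` stands in for whatever (preconditioned) search direction
    the optimizer produces; here it is just a plain list of floats.
    """
    moved = [xi - step * di for xi, di in zip(x, direction)]
    return project(moved)


if __name__ == "__main__":
    x = [1.0, 1.0]
    d = [1.0, -1.0]
    print(proximal_step(x, d, 0.5, lambda z: project_box(z, 0.0, 1.0)))
    print(project_top_k([0.1, -3.0, 2.0], 2))
```

Passing the projection as a function keeps the update rule separate from the constraint, so the same step can be reused with either a convex or a nonconvex set.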

📝 Abstract

In this work, we develop proximal preconditioned gradient methods, with a focus on spectral gradient methods, providing a proximal extension to the Muon and Scion optimizers. We introduce a family of stochastic algorithms that can handle a wide variety of convex and nonconvex constraints and study their convergence under heavy-tailed noise through a novel analysis tailored to the geometry of the proposed methods. We further propose a variance-reduced version, which achieves faster convergence under standard noise assumptions. Finally, we show that the polynomial iterations used in Muon are more accurately captured by a nonlinear preconditioner than by the ideal matrix sign, leading to a convergence analysis that more faithfully reflects practical implementations.
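The gap between a polynomial iteration and the ideal matrix sign can be seen on singular values alone. For a matrix with singular value decomposition X = U diag(s) Vᵀ, an odd polynomial update of the form X ← aX + b(XXᵀ)X + c(XXᵀ)²X acts on each singular value as s ← as + bs³ + cs⁵, whereas the ideal matrix sign maps every positive singular value to exactly 1. The sketch below iterates that scalar map. The coefficients and the five-step count are assumptions taken from the public Muon reference implementation, not from this abstract, and the helper names are hypothetical.

```python
# Quintic coefficients and step count as used in the public Muon reference
# implementation (an assumption; the abstract does not state them).
A, B, C = 3.4445, -4.7750, 2.0315
STEPS = 5


def quintic(s):
    """One polynomial step applied to a single singular value."""
    return A * s + B * s**3 + C * s**5


def iterate(s, steps=STEPS):
    """Apply the polynomial step repeatedly to a singular value."""
    for _ in range(steps):
        s = quintic(s)
    return s


def ideal_sign(s):
    """What the exact matrix sign would do to a singular value."""
    if s > 0:
        return 1.0
    if s < 0:
        return -1.0
    return 0.0


if __name__ == "__main__":
    # Singular values of a normalized matrix lie in [0, 1].
    for s in (0.05, 0.25, 0.5, 1.0):
        print(f"s={s:.2f}  polynomial={iterate(s):.4f}  ideal={ideal_sign(s):.1f}")
```

With these coefficients the iterated values end up in a band around 1 rather than at 1 exactly, which is the kind of discrepancy a nonlinear-preconditioner view is meant to account for.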

Problem

Research questions and friction points this paper is trying to address.

nonconvex optimization

stochastic gradient methods

spectral preconditioning

heavy-tailed noise

constrained optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

proximal preconditioning

spectral gradient methods

nonconvex constraints