🤖 AI Summary
Existing stochastic difference-of-convex (DC) optimization methods lack convergence guarantees under mini-batch settings, typically requiring either large batch sizes or strong noise assumptions.
Method: This paper proposes a novel momentum-based stochastic DC optimization framework, designed under standard smoothness and bounded gradient variance assumptions. It introduces momentum-driven gradient updates to mitigate variance amplification inherent in small-batch sampling.
Contribution/Results: The paper establishes the first provably convergent algorithm for stochastic DC optimization that accommodates arbitrary batch sizes, without relying on large batches or restrictive noise conditions. The analysis shows that momentum suppresses the gradient variance induced by small mini-batches, ensuring convergence to a stationary point, and empirical evaluations confirm substantial improvements in both convergence speed and stability over state-of-the-art baselines. The result is a general, robust, and theoretically grounded momentum framework for mini-batch stochastic DC optimization.
📝 Abstract
Stochastic difference-of-convex (DC) optimization is prevalent in numerous machine learning applications, yet its convergence properties under small batch sizes remain poorly understood. Existing methods typically require large batches or strong noise assumptions, which limits their practical use. In this work, we show that momentum enables convergence for any batch size under standard smoothness and bounded gradient variance assumptions on the concave part. We prove that without momentum, convergence may fail regardless of stepsize, highlighting its necessity. Our momentum-based algorithm achieves provable convergence and demonstrates strong empirical performance.