Zeroth-Order Nonconvex Nonsmooth Optimization with Heavy-Tailed Noise

📅 2026-05-23

🤖 AI Summary

This work addresses nonconvex nonsmooth optimization problems with Lipschitz continuous objective functions, where the algorithm can only access function evaluations corrupted by heavy-tailed noise. The authors propose a stochastic zeroth-order algorithm that combines a clipped two-point gradient estimator with the online-to-nonconvex conversion framework, so that it stays robust to heavy-tailed noise while relying solely on function value queries. The method finds a $(\delta, \varepsilon)$-Goldstein stationary point with a query complexity of $\mathcal{O}(d^{\frac{p}{2(p-1)}}\delta^{-1}\varepsilon^{-\frac{2p-1}{p-1}})$, where $d$ is the problem dimension and $p\in(1,2]$ is the order of bounded moments. The dependence on $d$ matches the best-known results for stochastic zeroth-order optimization of convex nonsmooth problems, and the dependence on $\delta$ and $\varepsilon$ is consistent with the best-known stochastic first-order algorithms for nonconvex nonsmooth problems. Numerical experiments are reported to demonstrate the method's effectiveness.
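To make the rate concrete, the exponents can be evaluated at two values of $p$. This is plain arithmetic on the stated bound, not an additional result from the paper:

$$
p = 2:\quad \frac{p}{2(p-1)} = 1,\quad \frac{2p-1}{p-1} = 3 \quad\Longrightarrow\quad \mathcal{O}\left(d\,\delta^{-1}\varepsilon^{-3}\right)
$$

$$
p = \tfrac{3}{2}:\quad \frac{p}{2(p-1)} = \frac{3}{2},\quad \frac{2p-1}{p-1} = 4 \quad\Longrightarrow\quad \mathcal{O}\left(d^{3/2}\,\delta^{-1}\varepsilon^{-4}\right)
$$

Both exponents grow without bound as $p \to 1$, so heavier tails make the bound worse in both $d$ and $\varepsilon$, while the dependence on $\delta$ stays at $\delta^{-1}$.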

📝 Abstract

This paper considers the nonconvex nonsmooth problem in which the objective function is Lipschitz continuous. We focus on the stochastic setting where the algorithm can access stochastic function value evaluations with heavy-tailed noise, which is prevalent in many popular machine learning applications. We propose a stochastic zeroth-order algorithm that refines the framework of online-to-nonconvex conversion by clipping the two-point gradient estimator. The theoretical analysis shows that our algorithm can find a $(\delta, \varepsilon)$-Goldstein stationary point with zeroth-order oracle complexity of $\mathcal{O}(d^{\frac{p}{2(p-1)}}\delta^{-1}\varepsilon^{-\frac{2p-1}{p-1}})$, where $d$ is the problem dimension and $p\in(1,2]$ is the order of bounded moments. Note that our dependence on dimension $d$ matches the best-known results of stochastic zeroth-order optimization for finding the sub-optimal solution of a stochastic convex nonsmooth problem. In addition, our dependence on accuracy parameters $\delta$ and $\varepsilon$ is consistent with that of the best-known stochastic first-order algorithms for stochastic nonconvex nonsmooth problems. Finally, we conduct numerical experiments to demonstrate the effectiveness of the proposed method.
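The clipped two-point gradient estimator mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact estimator, so everything concrete here is an assumption: the direction is drawn uniformly from the unit sphere, the finite difference is scaled by `d / (2 * delta)`, both queries share one random sample, and clipping rescales the estimate to norm at most `tau`. The names `clipped_two_point_estimate`, `noisy_l1`, and `tau` are hypothetical, and the online-to-nonconvex conversion step is not shown.

```python
import numpy as np


def sample_unit_sphere(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    w = rng.standard_normal(d)
    return w / np.linalg.norm(w)


def clip(g, tau):
    """Rescale g so that its Euclidean norm is at most tau."""
    norm = np.linalg.norm(g)
    if norm <= tau:
        return g
    return g * (tau / norm)


def clipped_two_point_estimate(noisy_f, x, delta, tau, rng):
    """Clipped two-point zeroth-order gradient estimate (illustrative sketch).

    noisy_f(x, seed) is a hypothetical stochastic oracle; passing the same
    seed to both queries means they share one random sample.
    """
    d = x.shape[0]
    w = sample_unit_sphere(d, rng)
    seed = int(rng.integers(0, 2**31 - 1))
    diff = noisy_f(x + delta * w, seed) - noisy_f(x - delta * w, seed)
    g = (d / (2.0 * delta)) * diff * w
    return clip(g, tau)


def noisy_l1(x, seed):
    """Hypothetical test oracle: an l1 norm scaled by heavy-tailed noise.

    Student-t noise with 1.5 degrees of freedom has zero mean and infinite
    variance, so the expected value of the oracle is the l1 norm of x.
    """
    noise_rng = np.random.default_rng(seed)
    xi = noise_rng.standard_t(df=1.5)
    return float((1.0 + xi) * np.sum(np.abs(x)))


rng = np.random.default_rng(0)
x0 = np.ones(5)
g0 = clipped_two_point_estimate(noisy_l1, x0, delta=0.1, tau=10.0, rng=rng)
```

Without clipping, a single extreme noise draw can produce an arbitrarily large estimate; with clipping, `g0` always has norm at most `tau`, at the cost of some bias that the choice of `tau` has to balance.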

Problem

Research questions and friction points this paper is trying to address.

zeroth-order optimization

nonconvex nonsmooth optimization

heavy-tailed noise

stochastic optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

zeroth-order optimization

heavy-tailed noise

nonconvex nonsmooth optimization

gradient estimator clipping