AI Summary
This work analyzes the linear convergence rate of the (1+1)-Evolution Strategy (ES) on locally $L$-strongly convex functions with $U$-Lipschitz continuous gradients. Unlike existing derivative-free optimization theory, which relies on prior knowledge of problem parameters (e.g., $L$, $U$), this paper establishes the first tight exponential convergence bounds *without any parameter information*: an upper bound of $\exp(-\Omega(L/(d\cdot U)))$ and a matching lower bound of $\exp(-1/d)$. Methodologically, the authors integrate a stochastic adaptive step-size mechanism, probabilistic convergence analysis, and asymptotic reasoning as the dimension $d \to \infty$, thereby explicitly characterizing how the convergence rate depends on both the dimension $d$ and the ratio $L/U$. This result removes the reliance of classical black-box optimization theory on strong prior assumptions, providing the first parameter-free, dimension-explicit theoretical guarantee for the (1+1)-ES.
Abstract
Evolution strategy (ES) is one of the promising classes of algorithms for black-box continuous optimization. Despite its broad success in applications, theoretical analysis of its convergence speed has been limited to convex quadratic functions and their monotone transformations. In this study, an upper bound and a lower bound on the rate of linear convergence of the (1+1)-ES on locally $L$-strongly convex functions with $U$-Lipschitz continuous gradient are derived as $\exp(-\Omega_{d \to \infty}(L/(d\cdot U)))$ and $\exp(-1/d)$, respectively. Notably, no prior knowledge of the mathematical properties of the objective function, such as the Lipschitz constant, is given to the algorithm, whereas existing analyses of derivative-free optimization algorithms require it.
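For concreteness, the (1+1)-ES with success-based step-size adaptation discussed above can be sketched as follows. This is a minimal illustration: the specific adaptation constants (a 1/5-success-style rule with increase factor $e^{0.8/d}$ and decrease factor $e^{-0.2/d}$) and the test function are assumptions for demonstration, not the exact setting analyzed in the paper.

```python
import numpy as np

def one_plus_one_es(f, x0, sigma0=1.0, iters=3000, seed=0):
    """Minimal (1+1)-ES sketch with success-based step-size adaptation.

    Note: the algorithm uses no knowledge of L, U, or any other
    problem parameter; it only compares function values.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    fx = f(x)
    sigma = sigma0
    d = len(x)
    for _ in range(iters):
        # sample one candidate from an isotropic Gaussian around x
        y = x + sigma * rng.standard_normal(d)
        fy = f(y)
        if fy <= fx:
            # success: accept the candidate and enlarge the step size
            x, fx = y, fy
            sigma *= np.exp(0.8 / d)
        else:
            # failure: keep x and shrink the step size
            sigma *= np.exp(-0.2 / d)
    return x, fx

# usage: an ill-conditioned strongly convex quadratic in d = 2
f = lambda z: 0.5 * (z[0] ** 2 + 10.0 * z[1] ** 2)
x, fx = one_plus_one_es(f, [3.0, -2.0])
```

The increase/decrease factors are chosen so that the step size is stationary at a success probability of 0.2, which is the classic 1/5-success heuristic; the paper's analysis covers this style of parameter-free adaptation.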