🤖 AI Summary
To address the challenge of gradient-based methods getting trapped in local minima during global optimization of non-convex functions, this paper proposes a gradient-informed swarm particle optimization framework. It constructs a smooth approximation of the objective function via a Soft-min energy functional, integrates stochastic gradient flow with a time-dependent annealing mechanism (governed by a parameter β), and enhances exploration through Brownian motion. Theoretically, under strong convexity assumptions, the authors prove that at least one particle converges to the global optimum with high probability, and that the barrier-crossing rate surpasses that of simulated annealing. Leveraging stochastic differential equation modeling and hitting-time analysis in the small-noise limit, they uncover the synergistic interplay between gradient information and stochasticity in optimization. Experiments on benchmark problems, including double-well potentials and the Ackley function, demonstrate superior local-minimum escape capability and faster convergence, empirically validating the theoretical findings.
📝 Abstract
Global optimization, particularly for non-convex functions with multiple local minima, poses significant challenges for traditional gradient-based methods. While metaheuristic approaches offer empirical effectiveness, they often lack theoretical convergence guarantees and may disregard available gradient information. This paper introduces a novel gradient-based swarm particle optimization method designed to efficiently escape local minima and locate global optima. Our approach leverages a "Soft-min Energy" interacting function, $J_\beta(\mathbf{x})$, which provides a smooth, differentiable approximation of the minimum function value within a particle swarm. We define a stochastic gradient flow in the particle space, incorporating a Brownian motion term for exploration and a time-dependent parameter $\beta$ to control smoothness, akin to temperature annealing. We theoretically demonstrate that for strongly convex functions, our dynamics converge to a stationary point at which at least one particle reaches the global minimum, while the other particles exhibit exploratory behavior. Furthermore, we show that our method facilitates faster transitions between local minima by reducing effective potential barriers relative to Simulated Annealing. More specifically, we estimate the hitting times of unexplored potential wells for our model in the small-noise regime and show that they compare favorably with those of the overdamped Langevin dynamics. Numerical experiments on benchmark functions, including double wells and the Ackley function, validate our theoretical findings and demonstrate superior performance to the well-known Simulated Annealing method in terms of escaping local minima and achieving faster convergence.
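The abstract does not spell out the functional form of the Soft-min Energy, so the sketch below assumes the standard log-sum-exp soft-min, $J_\beta(\mathbf{x}) = -\tfrac{1}{\beta}\log\sum_i e^{-\beta f(x_i)}$, whose gradient with respect to particle $i$ is a softmax weight times $\nabla f(x_i)$. The double-well objective, the linear annealing schedule, and all parameter values are illustrative choices, not the paper's; the SDE is discretized with plain Euler–Maruyama.

```python
import math
import random

# Hypothetical 1-D double-well benchmark (not the paper's exact test function):
# wells near x = +/-1, tilted so the global minimum sits in the left well.
def f(x):
    return (x**2 - 1)**2 + 0.2 * x

def grad_f(x, h=1e-5):
    # Central finite differences keep the sketch objective-agnostic.
    return (f(x + h) - f(x - h)) / (2 * h)

def softmin_weights(xs, beta):
    # Assumed soft-min energy J_beta = -(1/beta) log sum_i exp(-beta f(x_i)).
    # Its gradient w.r.t. particle i is softmax(-beta f)_i * grad f(x_i),
    # so each particle's drift is its gradient scaled by this weight.
    vals = [f(x) for x in xs]
    m = min(vals)  # shift before exponentiating, for numerical stability
    ws = [math.exp(-beta * (v - m)) for v in vals]
    s = sum(ws)
    return [w / s for w in ws]

def run(n_particles=8, steps=4000, dt=5e-3, sigma=0.5,
        beta0=0.5, beta_rate=2e-3, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n_particles)]
    best = min(xs, key=f)                # best position seen so far
    for t in range(steps):
        beta = beta0 + beta_rate * t     # annealing: J_beta -> min_i f(x_i)
        ws = softmin_weights(xs, beta)
        kick = sigma * math.sqrt(dt)     # Brownian increment scale
        xs = [x - ws[i] * grad_f(x) * dt + kick * rng.gauss(0.0, 1.0)
              for i, x in enumerate(xs)]
        cand = min(xs, key=f)
        if f(cand) < f(best):
            best = cand
    return best, xs

best, xs = run()
```

Note how the softmax weighting reproduces the behavior claimed in the abstract: as $\beta$ grows, nearly all the gradient drift concentrates on the current best particle, while the remaining particles feel almost pure Brownian motion and keep exploring other wells.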