🤖 AI Summary
To address the poor scalability of conventional methods and the high estimation error of parameterized gradient approximation in high-dimensional, nonlinear black-box optimization, this paper proposes OHGL, a framework that combines Optimistic Gradient Learning (OGL) with a second-order (Hessian-based) Taylor correction and optimizes without access to true gradients or the function's analytic structure. Its key contribution is pairing an optimistic bias toward lower regions of the function landscape with a second-order correction, which makes gradient estimation more robust and accurate in high dimensions. OHGL unifies explicit gradient learning, first-order Taylor approximation, second-order correction, and neural parameterization. On the synthetic COCO benchmark suite, it achieves state-of-the-art performance. It is also applied to high-dimensional real-world machine learning tasks, including adversarial training and code generation, where it produces stronger candidate solutions.
📝 Abstract
Black-box algorithms are designed to optimize functions without relying on their underlying analytical structure or gradient information, making them essential when gradients are inaccessible or difficult to compute. Traditional methods for solving black-box optimization (BBO) problems predominantly rely on non-parametric models and struggle to scale to large input spaces. Conversely, parametric methods that model the function with neural estimators and obtain gradient signals via backpropagation may suffer from significant gradient errors. A recent alternative, Explicit Gradient Learning (EGL), which directly learns the gradient using a first-order Taylor approximation, has demonstrated superior performance over both parametric and non-parametric methods. In this work, we propose two novel gradient learning variants to address the robustness challenges posed by high-dimensional, complex, and highly non-linear problems. Optimistic Gradient Learning (OGL) introduces a bias toward lower regions in the function landscape, while Higher-order Gradient Learning (HGL) incorporates second-order Taylor corrections to improve gradient accuracy. We combine these approaches into the unified OHGL algorithm, achieving state-of-the-art (SOTA) performance on the synthetic COCO suite. Additionally, we demonstrate OHGL's applicability to high-dimensional real-world machine learning (ML) tasks such as adversarial training and code generation. Our results highlight OHGL's ability to generate stronger candidates, offering a valuable tool for ML researchers and practitioners tackling high-dimensional, non-linear optimization challenges.
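The difference between a first-order and a second-order Taylor fit can be illustrated with a small sketch. This is not the paper's algorithm: it swaps the neural gradient model for a plain local least-squares fit, leaves out the optimistic bias, and uses a made-up quadratic test function. The helper `fit_gradient` and its parameters (`eps`, `n_samples`, `second_order`) are hypothetical names chosen for illustration. The idea shown is only that fitting f(x + d) - f(x) ≈ g·d from black-box samples leaves curvature as unexplained error, while adding quadratic terms absorbs it.

```python
import numpy as np

def fit_gradient(f, x, eps=0.5, n_samples=200, second_order=False, seed=0):
    """Estimate the gradient of a black-box f at x from function values only.

    Hypothetical helper: a least-squares stand-in for a learned gradient model.
    """
    rng = np.random.default_rng(seed)
    dim = x.size
    # Random perturbations d around x and the observed changes f(x + d) - f(x).
    deltas = rng.uniform(-eps, eps, size=(n_samples, dim))
    diffs = np.array([f(x + d) - f(x) for d in deltas])

    features = deltas  # first-order model: diff ≈ g · d
    if second_order:
        # Add d_i * d_j (i <= j) so the fit can absorb the 0.5 * d^T H d term.
        rows, cols = np.triu_indices(dim)
        features = np.hstack([deltas, deltas[:, rows] * deltas[:, cols]])

    coef, *_ = np.linalg.lstsq(features, diffs, rcond=None)
    return coef[:dim]  # the first `dim` coefficients are the gradient estimate

# Assumed test function: a convex quadratic with a known gradient A x + b.
rng = np.random.default_rng(1)
M = rng.normal(size=(5, 5))
A = M @ M.T
b = rng.normal(size=5)
f = lambda z: 0.5 * z @ A @ z + b @ z

x = rng.normal(size=5)
true_grad = A @ x + b

g1 = fit_gradient(f, x, second_order=False)
g2 = fit_gradient(f, x, second_order=True)
print("first-order error: ", np.linalg.norm(g1 - true_grad))
print("second-order error:", np.linalg.norm(g2 - true_grad))
```

On a quadratic, the second-order model is exact, so its gradient error is limited only by floating-point precision, while the first-order fit treats the curvature term as noise. For general non-linear functions neither fit is exact, and the gap depends on the perturbation radius `eps` and on the local curvature.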