🤖 AI Summary
Trajectory and policy optimization in robotics often suffer from non-differentiable objective functions and susceptibility to local optima. To address these challenges, this paper proposes a unified zeroth-order optimization framework centered on random search. The framework places multiple classical trajectory optimization methods under a common mathematical formalism and derives novel reinforcement learning algorithms that integrate finite-difference estimation, perturbation analysis, and other zeroth-order techniques, balancing robustness with differentiable approximations. We present the first systematic taxonomy of zeroth-order optimization paradigms in robotic control. Empirically, our algorithms match or exceed the performance of state-of-the-art first-order methods on standard benchmarks, while adapting better to nonsmooth dynamics and sparse-reward settings. This work establishes a theoretically coherent pathway for model-free robotic learning that is lightweight to implement.
📝 Abstract
Zeroth-order optimization techniques are becoming increasingly popular in robotics due to their ability to handle non-differentiable functions and escape local minima. These advantages make them particularly useful for trajectory optimization and policy optimization. In this work, we present a mathematical tutorial on random search. It offers a simple and unifying perspective for understanding a wide range of algorithms commonly used in robotics. Leveraging this viewpoint, we classify many trajectory optimization methods under a common framework and derive novel, competitive reinforcement learning (RL) algorithms.
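To make the idea of random search concrete, the following is a minimal sketch of one common variant: Gaussian perturbations combined with a two-point finite-difference estimate of the descent direction. It is not taken from the paper. The function names (`random_search`, `nonsmooth`), the hyperparameter values, and the toy objective are all assumptions chosen for illustration. The objective is deliberately non-differentiable at its minimum, and the optimizer only ever calls `f`, never a gradient.

```python
import random


def random_search(f, x0, iters=500, n_samples=8, sigma=0.1, step=0.05, seed=0):
    """Minimize f using function evaluations only (illustrative sketch).

    All hyperparameters are hypothetical defaults, not values from the paper.
    """
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_f = list(x), f(x)
    for _ in range(iters):
        grad = [0.0] * len(x)
        for _ in range(n_samples):
            # Sample a Gaussian perturbation direction.
            eps = [rng.gauss(0.0, 1.0) for _ in x]
            x_plus = [xi + sigma * e for xi, e in zip(x, eps)]
            x_minus = [xi - sigma * e for xi, e in zip(x, eps)]
            # Two-point finite difference of f along eps.
            diff = (f(x_plus) - f(x_minus)) / (2.0 * sigma)
            # Accumulate the averaged zeroth-order gradient estimate.
            for i, e in enumerate(eps):
                grad[i] += diff * e / n_samples
        # Take a descent step along the estimated gradient.
        x = [xi - step * g for xi, g in zip(x, grad)]
        fx = f(x)
        # Keep the best iterate seen so far, since the estimate is noisy.
        if fx < best_f:
            best_x, best_f = list(x), fx
    return best_x, best_f


def nonsmooth(x):
    # Toy objective with a kink at its minimizer (1, -2).
    return abs(x[0] - 1.0) + abs(x[1] + 2.0)


best_x, best_f = random_search(nonsmooth, [0.0, 0.0])
print(best_x, best_f)
```

Tracking the best iterate is a design choice for this sketch: because each step uses a noisy estimate, the final iterate is not guaranteed to be the best one visited.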