🤖 AI Summary
To address the slow convergence and suboptimal solutions of local optimization methods in real-time control, which stem from reliance on a single, fixed initial solution, this paper proposes a learning-based framework for predicting multiple initial solutions. Methodologically, it formulates diverse initial-solution generation as a supervised learning task for the first time and incorporates meta-learning to enhance cross-task generalization. Two complementary execution strategies are introduced: (i) adaptive selection of a single optimizer and (ii) parallel execution of multiple optimizers; by including the default initialization among the predicted candidates, both strategies guarantee a final solution no worse than that obtained from the default initialization alone. The framework is compatible with various optimal-control optimizers, including DDP, MPPI, and iLQR. Evaluated on cart-pole, reacher, and autonomous-driving benchmarks, it significantly improves both convergence speed and solution quality under strict time constraints, while scaling efficiently with the number of initial solutions.
📝 Abstract
Sequentially solving similar optimization problems under strict runtime constraints is essential for many applications, such as robot control, autonomous driving, and portfolio management. The performance of local optimization methods in these settings is sensitive to the initial solution: poor initialization can lead to slow convergence or suboptimal solutions. To address this challenge, we propose learning to predict *multiple* diverse initial solutions given parameters that define the problem instance. We introduce two strategies for utilizing multiple initial solutions: (i) a single-optimizer approach, where the most promising initial solution is chosen using a selection function, and (ii) a multiple-optimizers approach, where several optimizers, potentially run in parallel, are each initialized with a different solution, with the best solution chosen afterward. Notably, by including a default initialization among the predicted ones, the cost of the final output is guaranteed to be equal to or lower than with the default initialization. We validate our method on three optimal control benchmark tasks: cart-pole, reacher, and autonomous driving, using different optimizers: DDP, MPPI, and iLQR. We find significant and consistent improvements with our method across all evaluation settings and demonstrate that it scales efficiently with the number of initial solutions required. The code is available at MISO (https://github.com/EladSharony/miso).
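The multiple-optimizers strategy and its guarantee can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `optimize` and `cost` stand in for a real solver (e.g. DDP, iLQR, or MPPI) and its objective, `predicted_inits` for the learned predictor's outputs, and the toy one-step quadratic "solver" below is purely for demonstration.

```python
def solve_with_multi_init(optimize, cost, predicted_inits, default_init):
    """Run a local optimizer from several initial solutions and keep the best.

    Appending `default_init` to the candidate set is what guarantees the
    returned solution's cost is never higher than that of the
    default-initialized run alone.
    """
    candidates = list(predicted_inits) + [default_init]
    # Each run is independent, so in practice these could execute in parallel.
    solutions = [optimize(x0) for x0 in candidates]
    return min(solutions, key=cost)

# Toy demo: minimize a 1-D quadratic; one exact Newton step lands at the optimum.
cost = lambda x: (x - 3.0) ** 2
optimize = lambda x0: x0 - 0.5 * 2.0 * (x0 - 3.0)
best = solve_with_multi_init(optimize, cost,
                             predicted_inits=[10.0, -4.0], default_init=0.0)
```

The single-optimizer variant differs only in ordering: a selection function ranks the candidates first, and `optimize` is called once on the chosen initialization, trading the parallel compute for a single solve under the runtime budget.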