🤖 AI Summary
This work addresses the limitations of offline model-based optimization, which typically relies on regression-based surrogate modeling and suffers from distributional mismatch between the training data and near-optimal designs. From a learnability perspective, the authors reformulate the optimization problem as a ranking task that distinguishes high-performing from suboptimal designs, proposing a ranking-centric, optimization-oriented risk framework. The framework identifies distributional mismatch as the primary source of error and theoretically establishes the advantage of ranking over conventional regression approaches. By combining an optimization-aware ranking risk, a unified learning theory, and a distribution-aware algorithm, the method significantly outperforms twenty existing baselines across diverse tasks, validating the efficacy of the ranking perspective and exposing a fundamental limitation of offline optimization: a regime in which over-optimistic extrapolation is unavoidable.
📝 Abstract
Offline model-based optimization (MBO) seeks to discover high-performing designs using only a fixed dataset of past evaluations. Most existing methods rely on learning a surrogate model via regression and implicitly assume that good predictive accuracy leads to good optimization performance. In this work, we challenge this assumption and study offline MBO from a learnability perspective. We argue that offline optimization is fundamentally a problem of ranking high-quality designs rather than one of accurate value prediction. Specifically, we introduce an optimization-oriented risk based on ranking near-optimal designs against suboptimal ones, and develop a unified theoretical framework that connects surrogate learning to final optimization performance. We prove the theoretical advantages of ranking over regression, and identify distributional mismatch between the training data and near-optimal designs as the dominant source of error. Motivated by this, we design a distribution-aware ranking method to reduce this mismatch. Empirical results across various tasks show that our approach outperforms twenty existing methods, validating our theoretical findings. Additionally, both theoretical and empirical results reveal intrinsic limitations of offline MBO, showing a regime in which no offline method can avoid over-optimistic extrapolation.
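To make the ranking-versus-regression distinction concrete, here is a minimal sketch (not the paper's actual algorithm; the loss form and function names are illustrative assumptions) of a pairwise ranking risk. It shows why a surrogate can have large regression error yet be perfect for optimization: a constant offset ruins mean-squared error but leaves the ranking of designs, and hence the argmax, untouched, while a surrogate that flips the order of the top designs is penalized even if its values look accurate.

```python
import numpy as np

def pairwise_ranking_loss(scores, values, margin=0.0):
    """Illustrative logistic pairwise ranking loss (an assumption,
    not the paper's exact risk): for every pair (i, j) where the true
    value of design i exceeds that of design j, penalize the surrogate
    unless its score for i exceeds its score for j by `margin`."""
    losses = []
    n = len(values)
    for i in range(n):
        for j in range(n):
            if values[i] > values[j]:
                diff = scores[i] - scores[j] - margin
                losses.append(np.log1p(np.exp(-diff)))  # soft 0/1 ranking error
    return float(np.mean(losses)) if losses else 0.0

# True values of three designs, best last.
values = [1.0, 2.0, 3.0]
# Surrogate A: huge regression error (constant +10 shift), perfect ranking.
shifted = [11.0, 12.0, 13.0]
# Surrogate B: small regression error, but swaps the two best designs.
flipped = [1.0, 3.0, 2.0]

# The ranking risk prefers A, even though MSE strongly prefers B.
print(pairwise_ranking_loss(shifted, values) < pairwise_ranking_loss(flipped, values))
```

Under this toy loss, surrogate A attains a lower ranking risk despite its far larger regression error, matching the abstract's claim that optimization quality tracks ranking, not prediction accuracy.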