🤖 AI Summary
Expensive multi-objective optimization (EMO) suffers from high evaluation costs, low sample efficiency, and poor generalization across diverse problems. Existing approaches either require retraining Gaussian process (GP) surrogates for each new problem or rely on large-scale historical datasets to pretrain deep models—both limiting adaptability to emerging tasks. This paper introduces FoMEMO, a foundation model paradigm for EMO: a deep neural network pretrained on hundreds of millions of synthetic GP-generated optimization trajectories. FoMEMO enables zero-shot transfer and preference-conditioned in-context learning—requiring neither fine-tuning nor model reconstruction—to rapidly and efficiently approximate Pareto fronts for arbitrary new problems under user-specified preferences. Evaluated across diverse synthetic benchmarks and real-world applications, FoMEMO exhibits strong generality and competitive performance compared to existing methods, with improved sample efficiency.
📝 Abstract
Expensive multi-objective optimization is a prevalent and crucial concern in many real-world scenarios, where sample efficiency is vital because only a limited number of evaluations are available to recover the true Pareto front for decision making. Existing works either rebuild Gaussian process surrogates from scratch for each objective of each new problem encountered, or rely on extensive past domain experiments to pre-train deep learning models, making them hard to generalize and impractical for the variety of emerging applications in the real world. To address this issue, we propose a new paradigm named FoMEMO (Foundation Models for Expensive Multi-objective Optimization), which establishes a foundation model conditioned on any domain trajectory and user preference, and facilitates fast in-context optimization based on the predicted preference-wise aggregation posteriors. Rather than accessing extensive domain experiments in the real world, we demonstrate that pre-training the foundation model on a diverse set of hundreds of millions of synthetic data points can lead to superior adaptability to unknown problems, without requiring any subsequent model training or updates during the optimization process. We evaluate our method across a variety of synthetic benchmarks and real-world applications, and demonstrate its superior generality and competitive performance compared to existing methods.
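The "preference-wise aggregation" mentioned above can be illustrated with a standard scalarization such as the weighted Tchebycheff function, which collapses a vector of objective values into a single scalar under a user preference. This is a minimal sketch of that general idea; the function name and exact formulation are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def tchebycheff_aggregate(objectives, preference, ideal):
    """Weighted Tchebycheff scalarization (illustrative, not FoMEMO's exact form).

    Collapses a vector of objective values into one scalar under a
    preference weighting; smaller is better for minimization problems.
    """
    objectives = np.asarray(objectives, dtype=float)
    preference = np.asarray(preference, dtype=float)
    ideal = np.asarray(ideal, dtype=float)
    # Each objective's weighted distance from the ideal point; the worst
    # (largest) one determines the aggregated value.
    return float(np.max(preference * (objectives - ideal)))

# Two candidate solutions on a 2-objective problem with ideal point (0, 0);
# the preference (0.8, 0.2) emphasizes the first objective.
a = tchebycheff_aggregate([0.2, 0.9], [0.8, 0.2], [0.0, 0.0])  # max(0.16, 0.18) = 0.18
b = tchebycheff_aggregate([0.5, 0.4], [0.8, 0.2], [0.0, 0.0])  # max(0.40, 0.08) = 0.40
# Under this preference, candidate a is preferred (0.18 < 0.40).
```

Varying the preference vector traces out different trade-off points, which is how preference-conditioned methods can approximate a Pareto front from scalarized values.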