🤖 AI Summary
Surrogate modeling for expensive black-box multi-task multi-objective optimization (MTMOO) typically generalizes poorly and requires cumbersome per-task construction. Method: This paper proposes Q-MetaSur, an offline language-driven surrogate model that reformulates objective approximation in MTMOO as a text-based sequence-to-sequence generation task and employs a large language model (LLM) as a unified surrogate. It incorporates implicit Q-learning from offline reinforcement learning, establishing a two-stage training paradigm: supervised fine-tuning followed by RL-based refinement. Contribution/Results: Q-MetaSur constitutes the first plug-and-play, task-agnostic meta-surrogate learning framework. Evaluated on the CEC2019 benchmark, it significantly improves objective approximation accuracy, accelerates convergence of the underlying evolutionary algorithms, and yields superior Pareto fronts, outperforming state-of-the-art surrogate models across all metrics.
📝 Abstract
Data-driven evolutionary algorithms have shown impressive results in addressing expensive optimization problems through robust surrogate modeling. Though promising, existing surrogate modeling schemes rely on repeated, tedious per-objective approximation and therefore struggle with complex optimization problems comprising many sub-objectives. To address this gap, we propose Q-MetaSur, a plug-and-play surrogate modeling scheme capable of providing unified and generalized surrogate learning. Specifically, we consider multi-task multi-objective optimization (MTMOO) in an offline setting. Several key designs are proposed: 1) we transform objective approximation into sequence-to-sequence modeling, where an MTMOO problem is represented through textual tokenization; 2) to operate under such auto-regressive modeling, we introduce a Large Language Model-based surrogate that first encodes an MTMOO instance and then decodes objective values of unseen decision variables; 3) to ensure training stability, we propose a two-stage offline training strategy that combines supervised tuning with RL fine-tuning, first exploiting the offline dataset to fit existing knowledge and then leveraging RL to enhance the model's generalization. Extensive empirical results on the CEC2019 benchmark demonstrate that Q-MetaSur not only outperforms representative surrogate baselines in objective approximation accuracy, but also helps underlying evolutionary algorithms achieve both the desired optimization convergence and improved Pareto optimality.
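To make the sequence-to-sequence reformulation concrete, the sketch below illustrates one plausible way an MTMOO query could be serialized into text for an LLM surrogate and how a generated continuation could be parsed back into objective values. This is a minimal illustration only, not the paper's actual implementation: the prompt format, special tokens, and the function names `encode_query` / `decode_objectives` are all assumptions.

```python
# Illustrative sketch (hypothetical, not Q-MetaSur's actual tokenization):
# serialize a decision vector of one task into a text prompt, and parse the
# surrogate's generated continuation back into objective values.

def encode_query(task_id, x, precision=4):
    """Serialize a decision vector `x` of task `task_id` into a prompt string."""
    vars_txt = " ".join(f"{v:.{precision}f}" for v in x)
    return f"<task {task_id}> <x> {vars_txt} <obj>"

def decode_objectives(generated):
    """Parse the surrogate's generated continuation into objective values."""
    return [float(tok) for tok in generated.strip().split()]

prompt = encode_query(3, [0.25, 0.5, 0.75])
# A trained LLM surrogate would autoregressively continue `prompt`;
# here a stand-in continuation is hard-coded for illustration.
objectives = decode_objectives("1.2500 0.8750")
```

Under this framing, a single model can serve many tasks and objective counts, since both the instance description and the number of decoded objectives live in the text sequence rather than in a fixed network head.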