🤖 AI Summary
This work addresses the limited generalization of existing large language model (LLM)-based approaches to automatic heuristic discovery, which rely on static evaluation and are prone to overfitting under distributional shift. To overcome this, we propose the Algorithm Space Response Oracles (ASRO) framework, which integrates game theory with LLM-driven heuristic search by modeling heuristic discovery as a program-level co-evolution between a solver and an instance generator. Within a zero-sum game formulation, ASRO employs the LLM as a best-response oracle and dynamically constructs adversarial training curricula through mixed-strategy iterations over evolving strategy pools for both players. Evaluated across multiple combinatorial optimization tasks, ASRO substantially outperforms static-training baselines, demonstrating superior generalization and robustness both in-distribution and out-of-distribution.
📝 Abstract
Large language models (LLMs) have enabled rapid progress in automatic heuristic discovery (AHD), yet most existing methods are limited by static evaluation against fixed instance distributions, leading to potential overfitting and poor generalization under distributional shift. We propose Algorithm Space Response Oracles (ASRO), a game-theoretic framework that reframes heuristic discovery as a program-level co-evolution between a solver and an instance generator. ASRO models their interaction as a two-player zero-sum game, maintains growing strategy pools on both sides, and iteratively expands them via LLM-based best-response oracles against mixed opponent meta-strategies, thereby replacing static evaluation with an adaptive, self-generated curriculum. Across multiple combinatorial optimization domains, ASRO consistently outperforms static-training AHD baselines built on the same program search mechanisms, achieving substantially improved generalization and robustness on diverse and out-of-distribution instances.
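The co-evolution loop described above follows the double-oracle / PSRO pattern: each player keeps a growing pool of strategies, a meta-strategy mixes over the opponent's pool, and an oracle adds a best response to that mixture each iteration. The sketch below illustrates this loop on a toy scalar "algorithm space"; the payoff function, the uniform meta-strategy, and the grid-search oracles (standing in for the paper's LLM-based program search) are all illustrative assumptions, not ASRO's actual implementation.

```python
# Minimal ASRO-style double-oracle sketch. The scalar payoff, uniform
# meta-strategies, and grid-search "oracles" are toy stand-ins for the
# paper's LLM-driven, program-level best responses.

def payoff(s, g):
    # Toy zero-sum payoff: solver parameter s scores higher the closer it
    # tracks the generator's instance-hardness parameter g.
    return -abs(s - g)

def expected_payoff(s, gen_pool, gen_meta):
    # Solver's expected payoff against the generator's mixed meta-strategy.
    return sum(w * payoff(s, g) for g, w in zip(gen_pool, gen_meta))

def solver_oracle(gen_pool, gen_meta, candidates):
    # Stand-in for the LLM best-response oracle: pick the candidate heuristic
    # that maximizes expected payoff against the generator mixture.
    return max(candidates, key=lambda c: expected_payoff(c, gen_pool, gen_meta))

def generator_oracle(sol_pool, sol_meta, candidates):
    # Generator's best response: the instance distribution that minimizes the
    # solver mixture's expected payoff, i.e., the hardest current curriculum.
    return min(candidates,
               key=lambda c: sum(w * payoff(s, c)
                                 for s, w in zip(sol_pool, sol_meta)))

def asro_loop(iters=5):
    sol_pool, gen_pool = [0.0], [1.0]          # initial one-strategy pools
    candidates = [i / 10 for i in range(11)]   # toy discretized algorithm space
    for _ in range(iters):
        # Uniform meta-strategies over the current pools; ASRO could plug in
        # any zero-sum meta-strategy solver (e.g., a Nash solver) here.
        sol_meta = [1 / len(sol_pool)] * len(sol_pool)
        gen_meta = [1 / len(gen_pool)] * len(gen_pool)
        sol_pool.append(solver_oracle(gen_pool, gen_meta, candidates))
        gen_pool.append(generator_oracle(sol_pool, sol_meta, candidates))
    return sol_pool, gen_pool
```

In this toy setting, each added generator strategy shifts the adversarial curriculum away from regions the solver pool already covers, which is the mechanism by which static evaluation is replaced by an adaptive one.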