UniCO: Towards a Unified Model for Combinatorial Optimization Problems

📅 2025-05-07
🤖 AI Summary
Problem: Existing learning-based combinatorial optimization (CO) solvers are specialized to individual problems, so a single architecture and parameter set that covers diverse CO problems has remained elusive. Method: UniCO is a unified CO solver built on a single Transformer architecture with a shared parameter set. It formulates each problem-solving process as a Markov decision process (MDP), tokenizes the resulting trajectories, and introduces a CO-prefix that aggregates static problem features to shorten the token sequence. A two-stage self-supervised framework handles the heterogeneity of state and action tokens: a dynamics prediction model is trained first and then serves as the pre-trained model for policy generation. Contribution/Results: In experiments on ten CO problems, UniCO generalizes to new, unseen problems with minimal fine-tuning, in some cases in a few-shot or zero-shot setting. The authors position it as a complement to neural CO methods that optimize performance on individual problems.

📝 Abstract
Combinatorial Optimization (CO) encompasses a wide range of problems that arise in many real-world scenarios. While significant progress has been made in developing learning-based methods for specialized CO problems, a unified model with a single architecture and parameter set for diverse CO problems remains elusive. Such a model would offer substantial advantages in terms of efficiency and convenience. In this paper, we introduce UniCO, a unified model for solving various CO problems. Inspired by the success of next-token prediction, we frame each problem-solving process as a Markov Decision Process (MDP), tokenize the corresponding sequential trajectory data, and train the model using a transformer backbone. To reduce token length in the trajectory data, we propose a CO-prefix design that aggregates static problem features. To address the heterogeneity of state and action tokens within the MDP, we employ a two-stage self-supervised learning approach. In this approach, a dynamic prediction model is first trained and then serves as a pre-trained model for subsequent policy generation. Experiments across 10 CO problems showcase the versatility of UniCO, emphasizing its ability to generalize to new, unseen problems with minimal fine-tuning, achieving even few-shot or zero-shot performance. Our framework offers a valuable complement to existing neural CO methods that focus on optimizing performance for individual problems.
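To make the CO-prefix idea concrete, here is a minimal sketch of trajectory tokenization for a toy TSP instance. It is not the authors' implementation: the vocabulary layout, the `BINS` resolution, and the function names (`discretize`, `co_prefix`, `tokenize_with_prefix`, `tokenize_without_prefix`) are all assumptions made for illustration. It only shows why moving static features (here, node coordinates) into a one-time prefix shortens the sequence compared with repeating them at every MDP step.

```python
from typing import List, Sequence, Tuple

BINS = 100  # hypothetical resolution for discretizing coordinates in [0, 1]

def discretize(value: float, bins: int = BINS) -> int:
    """Map a value in [0, 1] to an integer token in [0, bins - 1]."""
    return min(int(value * bins), bins - 1)

def co_prefix(coords: Sequence[Tuple[float, float]]) -> List[int]:
    """Static problem features (node coordinates), emitted once per instance."""
    tokens: List[int] = []
    for x, y in coords:
        tokens.extend([discretize(x), discretize(y)])
    return tokens

def tokenize_with_prefix(coords, tour) -> List[int]:
    """Prefix once, then one (state, action) token pair per MDP step."""
    n = len(coords)
    state_offset = BINS        # assumed id range for state tokens
    action_offset = BINS + n   # assumed id range for action tokens
    tokens = co_prefix(coords)
    for current, nxt in zip(tour, tour[1:]):
        tokens.append(state_offset + current)  # dynamic state: current node
        tokens.append(action_offset + nxt)     # action: next node to visit
    return tokens

def tokenize_without_prefix(coords, tour) -> List[int]:
    """Baseline for comparison: static features repeated in every step."""
    n = len(coords)
    state_offset = BINS
    action_offset = BINS + n
    tokens: List[int] = []
    for current, nxt in zip(tour, tour[1:]):
        tokens.extend(co_prefix(coords))
        tokens.append(state_offset + current)
        tokens.append(action_offset + nxt)
    return tokens

coords = [(0.1, 0.2), (0.8, 0.3), (0.5, 0.9), (0.2, 0.7)]
tour = [0, 1, 2, 3, 0]  # a closed tour over four nodes
with_prefix = tokenize_with_prefix(coords, tour)
without_prefix = tokenize_without_prefix(coords, tour)
print(len(with_prefix), len(without_prefix))  # prints "16 40"
```

With n nodes and T steps, this toy layout needs 2n + 2T tokens with the prefix versus T(2n + 2) without it, so the saving grows with both instance size and trajectory length.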
Problem

Research questions and friction points this paper is trying to address.

Develop unified model for diverse combinatorial optimization problems
Address heterogeneity in state and action tokens via self-supervised learning
Achieve generalization to unseen problems with minimal fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses transformer backbone for unified CO model
Implements CO-prefix to reduce token length
Employs two-stage self-supervised learning approach
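
The two-stage approach can be pictured as changing which tokens act as prediction targets over the same tokenized trajectory. The sketch below is an assumption about how such a split could be expressed, not a description of UniCO's training code: the token kinds and the `target_mask` helper are hypothetical. Stage 1 (dynamics prediction) scores only state tokens; stage 2 (policy generation) starts from the stage 1 model and scores only action tokens.

```python
from typing import List

# Hypothetical token kinds for one tokenized trajectory.
PREFIX, STATE, ACTION = "prefix", "state", "action"

def target_mask(kinds: List[str], stage: int) -> List[bool]:
    """Mark the positions whose tokens count toward the loss in a stage.

    Stage 1 (dynamics prediction): state tokens are the targets.
    Stage 2 (policy generation): action tokens are the targets.
    """
    if stage not in (1, 2):
        raise ValueError("stage must be 1 or 2")
    wanted = STATE if stage == 1 else ACTION
    return [kind == wanted for kind in kinds]

# A two-token prefix followed by two (state, action) steps.
kinds = [PREFIX, PREFIX, STATE, ACTION, STATE, ACTION]
stage1_mask = target_mask(kinds, stage=1)
stage2_mask = target_mask(kinds, stage=2)
print(stage1_mask)  # [False, False, True, False, True, False]
print(stage2_mask)  # [False, False, False, True, False, True]
```

In this picture the prefix tokens are context only and never targets, and the two masks are disjoint, which is one simple way to keep state prediction and action prediction from competing in a single objective.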