🤖 AI Summary
This work addresses two limitations of learning-based graph combinatorial optimization: poor generalization across diverse graph instances, and reinforcement learning approaches that scale poorly as action spaces grow. The authors propose projection agents, a framework that embeds observations and actions in a shared representation space. Using graph neural networks, the method builds a continuous latent action space, predicts a desired latent action in a single forward pass, and decodes it into a valid discrete action by nearest-neighbor search. The approach extends to super-linear action spaces with multiple interdependent variables. Across multiple benchmarks, experiments show up to 16.2× faster inference and up to 40% better generalization than existing solutions. The authors also release LaGCO-RL, an open-source Python library that automates latent action-space construction.
📝 Abstract
Graph combinatorial optimization (GCO) has attracted growing interest, as many NP-hard problems naturally admit graph formulations, yet their combinatorial explosion renders exact methods computationally intractable. Recent advances in Reinforcement Learning (RL) combined with Graph Neural Networks (GNNs) have significantly improved learning-based GCO solvers. However, existing approaches face limitations in both generalization across diverse graph instances and computational scalability as action spaces grow. To address both challenges, we introduce projection agents, a novel RL-GCO approach that operates directly in a continuous GNN-based action embedding space, predicting a desired latent action in a single forward pass and subsequently decoding it into a valid discrete action. Additionally, we enable fair comparison across RL methods through a shared embedding space for both observations and actions. Across diverse benchmarks, and using only simple nearest-neighbor decoding, our approach achieves up to 16.2× faster inference and up to 40% better generalization than existing solutions, while opening the door to strong RL performance in super-linear decision spaces with multiple interdependent variables. Finally, we release LaGCO-RL, a Python library that automates latent action-space construction and supports existing RL-GCO solutions, promoting reproducibility and adaptation to new GCO benchmarks.
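The decoding step described in the abstract, mapping a predicted latent action to a valid discrete action, can be illustrated with a minimal sketch. This is not the LaGCO-RL API. The function name `decode_nearest_action`, the use of squared Euclidean distance, the boolean validity mask, and the hand-written 2-D embeddings are all assumptions made for illustration. In the paper the embeddings would come from a GNN rather than a fixed array.

```python
import numpy as np


def decode_nearest_action(latent_action, action_embeddings, valid_mask):
    """Return the index of the valid action closest to latent_action.

    Hypothetical helper: squared Euclidean distance and a boolean
    validity mask are illustrative assumptions, not the paper's method.
    """
    # Squared distance from the predicted latent action to every candidate.
    dists = np.sum((action_embeddings - latent_action) ** 2, axis=1)
    # Invalid actions get infinite distance so they can never be selected.
    dists = np.where(valid_mask, dists, np.inf)
    if not np.isfinite(dists).any():
        raise ValueError("no valid action available")
    return int(np.argmin(dists))


# Toy embeddings for four discrete actions (stand-ins for GNN outputs).
action_embeddings = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
])
valid_mask = np.array([True, False, True, True])  # action 1 is infeasible
latent_action = np.array([0.8, 0.3])  # stand-in for the policy's output

chosen = decode_nearest_action(latent_action, action_embeddings, valid_mask)
unmasked = decode_nearest_action(latent_action, action_embeddings,
                                 np.ones(4, dtype=bool))
print(chosen, unmasked)
```

In this toy case the closest embedding overall belongs to action 1, but it is masked out as invalid, so decoding falls back to the nearest valid candidate. The policy network emits one vector regardless of how many discrete actions exist, and the number of actions affects only this lookup.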