🤖 AI Summary
Neural combinatorial optimization suffers from insufficient exploration of the solution space and overreliance on hand-crafted rules to enforce diversity. Method: This paper proposes PolyNet, a framework that models diversity implicitly, without explicit diversity constraints. PolyNet learns multiple complementary policies through a single shared decoder, trained end-to-end with reinforcement learning. Contribution/Results: By eliminating explicit diversity mechanisms, PolyNet significantly enhances search diversity while preserving solution quality. Evaluated on four canonical combinatorial optimization tasks, including TSP and VRP, PolyNet consistently outperforms baselines that explicitly enforce diversity, narrowing the performance gap between neural solvers and manually designed algorithms. The framework points toward scalable, high-quality, and adaptive neural combinatorial optimization.
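The core idea, a single shared decoder whose behavior is conditioned on one of K strategy vectors so that diverse solutions emerge without handcrafted diversity rules, can be sketched as follows. This is a minimal toy illustration, not PolyNet's actual architecture: the greedy scoring function, the strategy vectors `Z`, and all names are assumptions made for exposition.

```python
import numpy as np

# Toy TSP instance: one shared "decoder" (weights W) produces K different
# construction policies by adding a per-strategy conditioning vector z_k.
# Keeping the best of the K tours mimics exploration via complementary
# strategies. All of this is an illustrative sketch, not the paper's method.
rng = np.random.default_rng(0)
coords = rng.random((10, 2))       # 10 random cities in the unit square
K = 4                              # number of complementary strategies
W = rng.normal(size=2)             # shared decoder weights
Z = rng.normal(size=(K, 2))        # per-strategy conditioning vectors

def decode_tour(z):
    """Greedily build a tour, scoring candidates with shared W plus z."""
    tour = [0]
    while len(tour) < len(coords):
        cur = coords[tour[-1]]
        unvisited = [j for j in range(len(coords)) if j not in tour]
        # Score each unvisited city by its offset projected onto (W + z);
        # different z values steer the same decoder toward different tours.
        scores = [(coords[j] - cur) @ (W + z) for j in unvisited]
        tour.append(unvisited[int(np.argmax(scores))])
    return tour

def tour_length(tour):
    pts = coords[tour + [tour[0]]]  # close the loop back to the start
    return float(np.linalg.norm(np.diff(pts, axis=0), axis=1).sum())

tours = [decode_tour(z) for z in Z]
lengths = [tour_length(t) for t in tours]
print("best of K tours:", min(lengths))
```

In this sketch the K policies share every decoder parameter and differ only in their conditioning vector, so diversity comes from the conditioning itself rather than from explicit rules; training would then reward each instance's best-performing strategy.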
📝 Abstract
Reinforcement learning-based methods for constructing solutions to combinatorial optimization problems are rapidly approaching the performance of human-designed algorithms. To further narrow the gap, learning-based approaches must efficiently explore the solution space during search. Recent approaches artificially increase exploration by enforcing diverse solution generation through handcrafted rules; however, these rules can impair solution quality and are difficult to design for more complex problems. In this paper, we introduce PolyNet, an approach for improving exploration of the solution space by learning complementary solution strategies. In contrast to other works, PolyNet uses only a single decoder and a training schema that does not enforce diverse solution generation through handcrafted rules. We evaluate PolyNet on four combinatorial optimization problems and observe that the implicit diversity mechanism allows PolyNet to find better solutions than approaches that explicitly enforce diverse solution generation.