🤖 AI Summary
Neural TSP solvers suffer from poor generalization across problem scales and high training costs. To address these challenges, we propose Rescaling Graph Convolutional Network (RsGCN), which introduces: (i) adjacent-node rescaling, which gives every node a subgraph with the same number of neighbors regardless of problem scale and so stabilizes message aggregation; (ii) subgraph edge rescaling, which brings edge lengths to the same magnitude to maintain numerical consistency; (iii) a mixed-scale training dataset and a bidirectional loss for efficient training; and (iv) Re2Opt, a post-search algorithm whose reconstruction process, based on adaptive weights, helps avoid local optima. RsGCN generalizes to 10K-node instances without fine-tuning after only 3 training epochs on instances with up to 100 nodes. It achieves state-of-the-art performance on uniformly distributed instances at 9 scales from 20 to 10K nodes and on 78 real-world TSPLIB instances, while requiring the fewest learnable parameters and training epochs among the neural solvers compared.
📝 Abstract
Neural traveling salesman problem (TSP) solvers face two critical challenges: poor generalization across problem scales and high training costs. To address these challenges, we propose a new Rescaling Graph Convolutional Network (RsGCN). RsGCN targets the scale-dependent features (i.e., features that vary with problem scale) of nodes and edges, which make GCNs sensitive to problem scale. Its Rescaling Mechanism enhances generalization by (1) rescaling adjacent nodes, constructing for each node a subgraph with the same number of adjacent nodes across TSPs of different scales, which stabilizes graph message aggregation; and (2) rescaling subgraph edges, adjusting their lengths to the same magnitude, which maintains numerical consistency. In addition, RsGCN uses an efficient training strategy with a mixed-scale dataset and a bidirectional loss. To fully exploit the heatmaps generated by RsGCN, we design an efficient post-search algorithm termed Re2Opt, which incorporates a reconstruction process based on adaptive weights to help avoid local optima. Combining RsGCN and Re2Opt, our solver achieves remarkable generalization at low training cost: with only 3 epochs of training on the mixed-scale dataset containing instances with up to 100 nodes, it generalizes to 10K-node instances without any fine-tuning. Extensive experiments demonstrate state-of-the-art performance across uniformly distributed instances at 9 scales from 20 to 10K nodes and on 78 real-world instances from TSPLIB, while requiring the fewest learnable parameters and training epochs among neural competitors.
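The two rescaling steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how neighbors are selected or how edge lengths are normalized, so the sketch assumes k-nearest-neighbor selection and division by the longest edge in each node's subgraph. The function name `rescale_neighborhoods` and the parameter `k` are hypothetical.

```python
import math


def rescale_neighborhoods(coords, k):
    """Build a fixed-size, length-normalized neighborhood for every node.

    Assumptions (not taken from the paper): neighbors are the k nearest
    nodes by Euclidean distance, and each subgraph's edge lengths are
    divided by the longest edge in that subgraph.
    """
    n = len(coords)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of nodes")

    neighbors, lengths = [], []
    for i, (xi, yi) in enumerate(coords):
        # Step 1: rescale adjacent nodes. Keep exactly k neighbors per node,
        # so the neighborhood size does not depend on the instance size.
        nearest = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )[:k]

        # Step 2: rescale subgraph edges. Normalize by the longest kept edge
        # so lengths fall in [0, 1]; fall back to 1.0 for coincident points.
        longest = nearest[-1][0] or 1.0
        neighbors.append([j for _, j in nearest])
        lengths.append([d / longest for d, _ in nearest])
    return neighbors, lengths
```

Calling `rescale_neighborhoods(points, 5)` on a 20-node instance and on a 10K-node instance returns, in both cases, five neighbors per node with edge lengths in [0, 1]. That illustrates the scale-independent input the abstract attributes to the Rescaling Mechanism. The brute-force neighbor search is quadratic in the number of nodes and is used only to keep the sketch short.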