🤖 AI Summary
Graph matching is typically formulated as the Quadratic Assignment Problem (QAP), which is NP-hard. Existing projection-based relaxation methods suffer from excessive expansion of the feasible set, leading to numerical scale sensitivity and geometric inconsistency. To address this, we propose a novel Frobenius-regularized relaxation framework: a tunable Frobenius-norm regularization term suppresses convex-hull expansion, and the projection step is reformulated as a regularized linear assignment problem solved via Scaling Doubly Stochastic Normalization (SDSN). We further design a mixed-precision solving architecture that combines CPU-GPU collaboration with adaptive FP16/FP32/FP64 arithmetic to balance efficiency and accuracy. Theoretical analysis guarantees both geometric consistency and numerical robustness. Experiments demonstrate that our method outperforms state-of-the-art algorithms across all benchmarks on CPU; on GPU, it achieves up to 370× speedup with negligible loss in solution accuracy.
📝 Abstract
Graph matching, typically formulated as a Quadratic Assignment Problem (QAP), seeks to establish node correspondences between two graphs. To address the NP-hardness of QAP, some existing methods adopt projection-based relaxations that embed the problem into the convex hull of the discrete domain. However, these relaxations inevitably enlarge the feasible set, introducing two sources of error: numerical scale sensitivity and geometric misalignment between the relaxed and original domains. To alleviate these errors, we propose a novel relaxation framework by reformulating the projection step as a Frobenius-regularized Linear Assignment (FRA) problem, where a tunable regularization term mitigates feasible-region inflation. This formulation enables normalization-based operations to preserve numerical scale invariance without compromising accuracy. To efficiently solve FRA, we propose the Scaling Doubly Stochastic Normalization (SDSN) algorithm. Building on its favorable computational properties, we develop a theoretically grounded mixed-precision architecture to achieve substantial acceleration. Comprehensive CPU-based benchmarks demonstrate that the resulting method, FRAM, consistently outperforms all baseline methods under identical precision settings. When combined with the GPU-based mixed-precision architecture, FRAM achieves up to 370× speedup over its CPU-FP64 counterpart, with negligible loss in solution accuracy.
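The abstract's key idea is that the projection onto the doubly stochastic polytope can be handled by normalization-based operations rather than an exact (and scale-sensitive) projection. The paper's SDSN algorithm and its Frobenius-regularization schedule are not reproduced here; as a rough intuition only, a generic Sinkhorn-style alternating row/column normalization of a score matrix can be sketched as follows (function name, iteration count, and tolerance are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def doubly_stochastic_normalize(M, iters=500, eps=1e-12):
    """Illustrative Sinkhorn-style scaling: alternately rescale rows and
    columns of a nonnegative square matrix so both sums approach 1.
    This stands in for the projection step the abstract describes; the
    paper's SDSN adds a tunable Frobenius regularization not shown here."""
    X = np.maximum(np.asarray(M, dtype=np.float64), eps)  # keep entries positive
    for _ in range(iters):
        X = X / X.sum(axis=1, keepdims=True)  # rows sum to 1
        X = X / X.sum(axis=0, keepdims=True)  # columns sum to 1
    return X

# Hypothetical usage: normalize a random 5x5 node-affinity matrix.
rng = np.random.default_rng(0)
X = doubly_stochastic_normalize(rng.random((5, 5)))
```

Because each step is a row- or column-wise rescaling, the iteration is invariant to a global rescaling of the input matrix, which is the kind of numerical scale robustness the abstract attributes to normalization-based operations.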