AI Summary
Existing top-k diversified subgraph matching methods prioritize vertex coverage but often yield locally clustered results with insufficient topological diversity. This paper formalizes the Distance-Diversified Top-k Subgraph Matching (DTkSM) problem: selecting k isomorphic matches whose pairwise topological distances are maximized, so that the results cover the global structure of the graph. To solve it, we propose a partitioned adjacency graph modeling framework, the first to incorporate pairwise topological distance as a core diversification criterion. Our approach integrates embedding-driven partition filtering, densest-subgraph-first candidate selection, and distance-diversity optimization, accelerated by efficient subgraph isomorphism checking and partition indexing. Evaluated on 12 real-world datasets, our method achieves up to four orders of magnitude speedup over state-of-the-art baselines; 95% of its outputs attain at least 80% of the optimal distance diversity, significantly improving both topological and coverage diversity.
Abstract
Subgraph matching is a core task in graph analytics, widely used in domains such as biology, finance, and social networks. Existing top-k diversified methods typically focus on maximizing vertex coverage, but often return results from the same region of the graph, limiting topological diversity. We propose the Distance-Diversified Top-k Subgraph Matching (DTkSM) problem, which selects k isomorphic matches with maximal pairwise topological distances to better capture global graph structure. To address its computational challenges, we introduce the Partition-based Distance Diversity (PDD) framework, which partitions the graph and retrieves diverse matches from distant regions. To enhance efficiency, we develop two optimizations: embedding-driven partition filtering and densest-based partition selection over a Partition Adjacency Graph. Experiments on 12 real-world datasets show that our approach achieves up to four orders of magnitude speedup over baselines, with 95% of results reaching at least 80% of the optimal distance diversity and 100% coverage diversity.
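To make the distance-diversity objective concrete, the following minimal Python sketch selects k matches that are far apart from a precomputed list of matches. It is not the PDD framework described above: it uses a simple greedy farthest-point heuristic with no partitioning or embedding-based filtering. It also assumes that the distance between two matches is the smallest hop distance between any of their vertices, which the abstract does not specify. The names `bfs_distances`, `match_distance`, and `greedy_diverse_top_k` are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Hop distance from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def match_distance(adj, m1, m2):
    """Assumed definition: smallest hop distance between a vertex of m1
    and a vertex of m2 (infinite if the matches are disconnected)."""
    best = float("inf")
    for u in m1:
        dist = bfs_distances(adj, u)
        for v in m2:
            best = min(best, dist.get(v, float("inf")))
    return best


def greedy_diverse_top_k(adj, matches, k):
    """Greedy heuristic (not the paper's method): seed with the farthest
    pair, then repeatedly add the match whose minimum distance to the
    already chosen matches is largest."""
    n = len(matches)
    if k <= 0 or n == 0:
        return []
    if k == 1 or n == 1:
        return [matches[0]]
    # Precompute the symmetric pairwise distance matrix.
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = match_distance(adj, matches[i], matches[j])
    first, second = max(combinations(range(n), 2), key=lambda p: d[p[0]][p[1]])
    chosen = [first, second]
    while len(chosen) < min(k, n):
        rest = [i for i in range(n) if i not in chosen]
        chosen.append(max(rest, key=lambda i: min(d[i][c] for c in chosen)))
    return [matches[i] for i in chosen]


# Toy example: a path graph 0-1-...-9 where the query is a single edge.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
matches = [(0, 1), (1, 2), (4, 5), (8, 9)]
print(greedy_diverse_top_k(adj, matches, 3))  # [(0, 1), (8, 9), (4, 5)]
```

In the toy example the overlapping matches (0, 1) and (1, 2) are never returned together. This is the locally clustered behavior that coverage-only methods can exhibit and that DTkSM is meant to avoid.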