S3AND: Efficient Subgraph Similarity Search Under Aggregated Neighbor Difference Semantics (Technical Report)

📅 2025-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses subgraph similarity search over large-scale graphs under joint keyword and structural constraints. We propose S³AND, a novel paradigm that (i) introduces the first subgraph similarity measure grounded in *aggregated neighbor difference semantics*, balancing semantic expressiveness and computational efficiency; (ii) designs a dual pruning strategy—keyword-set pruning and lower-bound pruning derived from the new semantics; and (iii) constructs a lightweight, dedicated index to accelerate matching. Extensive experiments on real-world and synthetic graphs demonstrate that S³AND reduces query latency by 2–3 orders of magnitude over state-of-the-art baselines while maintaining >98% recall, enabling real-time response on graphs with up to ten million edges. Our core contributions are: (1) a novel semantics-aware similarity measure; (2) efficient, semantics-driven pruning mechanisms; and (3) an index architecture co-designed for the proposed semantics.

📝 Abstract

Over the past decades, \textit{subgraph similarity search} over large-scale data graphs has become increasingly important in many real-world applications, such as social network analysis, bioinformatics network analytics, and knowledge graph discovery. Whereas previous work on subgraph similarity search relied on graph similarity metrics such as graph isomorphism and graph edit distance, in this paper we propose a novel problem, namely \textit{subgraph similarity search under aggregated neighbor difference semantics} (S$^3$AND), which identifies subgraphs $g$ in a data graph $G$ that are similar to a given query graph $q$ by considering both keywords and graph structures (under new keyword/structural matching semantics). To tackle the S$^3$AND problem efficiently, we design two effective pruning methods, \textit{keyword set} and \textit{aggregated neighbor difference lower bound pruning}, which rule out false alarms of candidate vertices/subgraphs to reduce the S$^3$AND search space. Furthermore, we construct an effective indexing mechanism to facilitate our proposed efficient S$^3$AND query answering algorithm. Through extensive experiments, we demonstrate the effectiveness and efficiency of our S$^3$AND approach over both real and synthetic graphs under various parameter settings.
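
The abstract does not define the keyword matching semantics, so the following is only a minimal sketch of what a keyword set pruning step could look like. It assumes that a data vertex can match a query vertex only if its keyword set contains every query keyword. The function name `keyword_set_prune`, the dictionary layout, and the containment rule are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable, Set


def keyword_set_prune(
    data_keywords: Dict[Hashable, Set[str]],
    query_keywords: Set[str],
) -> Set[Hashable]:
    """Return the data vertices that survive keyword set pruning.

    ASSUMPTION (not from the paper): a data vertex is a valid candidate
    for a query vertex only if its keyword set is a superset of the
    query vertex's keyword set. Vertices failing this test are pruned.
    """
    return {
        vertex
        for vertex, keywords in data_keywords.items()
        if query_keywords <= keywords  # subset test on Python sets
    }


# Toy data graph: vertex id -> keyword set (illustrative values only).
data_keywords = {
    "v1": {"graph", "search"},
    "v2": {"graph"},
    "v3": {"graph", "search", "index"},
}

# Keyword set of one hypothetical query vertex.
candidates = keyword_set_prune(data_keywords, {"graph", "search"})
```

Under this assumed rule, `v2` is discarded before any structural comparison is attempted, which is the general purpose of such a filter: shrinking the candidate set cheaply so that the more expensive structural checks run on fewer vertices.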

Problem

Research questions and friction points this paper is trying to address.

Efficient subgraph similarity search in large graphs

Novel aggregated neighbor difference semantics

Pruning methods to reduce search space

Innovation

Methods, ideas, or system contributions that make the work stand out.

Keyword and structural matching semantics

Pruning methods reduce search space

Effective indexing for query answering

🔎 Similar Papers

Efficient Exact Subgraph Matching via GNN-based Path Dominance Embedding