CEMR: An Effective Subgraph Matching Algorithm with Redundant Extension Elimination

📅 2026-03-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Subgraph matching is NP-hard, and existing approaches lose efficiency to extensive redundant computation. This work proposes CEMR, an algorithm that adds redundant extension elimination to a depth-first search framework to improve matching efficiency. Its core techniques are common extension merging, built on a black-white vertex encoding, and common extension reusing, built on common extension buffers, which together reduce repeated computation. Two pruning strategies additionally discard unpromising search branches early. Experiments on real-world graph datasets and diverse query workloads show that CEMR outperforms state-of-the-art subgraph matching algorithms.

📝 Abstract
Subgraph matching is a fundamental problem in graph analysis with a wide range of applications. However, due to its inherent NP-hardness, enumerating subgraph matches efficiently on large real-world graphs remains highly challenging. Most existing works adopt a depth-first search (DFS) backtracking strategy, where a partial embedding is gradually extended in a DFS manner along a branch of the search tree until either a full embedding is found or no further extension is possible. A major limitation of this paradigm is the significant amount of duplicate computation that occurs during enumeration, which increases the overall runtime. To overcome this limitation, we propose a novel subgraph matching algorithm, CEMR. It incorporates two techniques to reduce duplicate extensions: common extension merging, which leverages a black-white vertex encoding, and common extension reusing, which employs common extension buffers. In addition, we design two pruning techniques to discard unpromising search branches. Extensive experiments on real-world datasets and diverse query workloads demonstrate that CEMR outperforms state-of-the-art subgraph matching methods.
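For orientation, the generic DFS backtracking paradigm that the abstract describes can be sketched as follows. This is a minimal illustration of the baseline only, not CEMR itself: it contains no common extension merging, no extension buffers, and no pruning. The function name `enumerate_matches`, the adjacency-set graph representation, and the fixed matching order are all assumptions made for this example.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets (assumed representation)


def enumerate_matches(
    query: Graph,
    data: Graph,
    q_labels: Dict[int, Hashable],
    d_labels: Dict[int, Hashable],
) -> List[Dict[int, int]]:
    """Enumerate all injective, label- and edge-preserving embeddings of query in data."""
    # Hypothetical matching order: ascending vertex id. Real systems pick this carefully.
    order = sorted(query)
    results: List[Dict[int, int]] = []

    def extend(embedding: Dict[int, int]) -> None:
        if len(embedding) == len(order):
            results.append(dict(embedding))  # full embedding found
            return
        u = order[len(embedding)]  # next query vertex to match
        # The extension set: data vertices adjacent to the images of all
        # already-matched query neighbours of u.
        matched_nbrs = [embedding[w] for w in query[u] if w in embedding]
        if matched_nbrs:
            candidates = set.intersection(*(data[v] for v in matched_nbrs))
        else:
            candidates = set(data)
        used = set(embedding.values())
        for v in sorted(candidates):
            if v in used or d_labels[v] != q_labels[u]:
                continue  # enforce injectivity and label equality
            embedding[u] = v
            extend(embedding)  # go one level deeper along this branch
            del embedding[u]   # backtrack

    extend({})
    return results
```

Note that the extension set depends only on the images of the matched neighbours of `u`, so distinct partial embeddings that agree on those images recompute the same intersection. That is one source of the duplicate computation the abstract refers to.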
Problem

Research questions and friction points this paper is trying to address.

subgraph matching
NP-hardness
duplicate computation
graph analysis
enumeration
Innovation

Methods, ideas, or system contributions that make the work stand out.

subgraph matching
redundant extension elimination
common extension merging
common extension reusing
pruning techniques
Linglin Yang
Peking University
Xunbin Su
Peking University
Lei Zou
Professor at Peking University
graph database, RDF data management, Knowledge Graph
Xiangyang Gou
University of New South Wales
Yinnian Lin
Peking University