CEMR: An Effective Subgraph Matching Algorithm with Redundant Extension Elimination

📅 2026-03-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Subgraph matching is NP-hard, and existing approaches lose much of their runtime to redundant computation. This work proposes CEMR, an algorithm that eliminates redundant extensions within a depth-first search framework. Its two core techniques are common extension merging, which relies on a black-white vertex encoding, and common extension reusing, which relies on common extension buffers; both reduce repeated computation. Two pruning strategies additionally discard unpromising search branches early. Experiments on real-world graph datasets and diverse query workloads show that CEMR outperforms state-of-the-art subgraph matching algorithms.

📝 Abstract

Subgraph matching is a fundamental problem in graph analysis with a wide range of applications. However, due to its inherent NP-hardness, enumerating subgraph matches efficiently on large real-world graphs remains highly challenging. Most existing works adopt a depth-first search (DFS) backtracking strategy, in which a partial embedding is gradually extended along a branch of the search tree until either a full embedding is found or no further extension is possible. A major limitation of this paradigm is the significant amount of duplicate computation that occurs during enumeration, which increases the overall runtime. To overcome this limitation, we propose a novel subgraph matching algorithm, CEMR. It incorporates two techniques to reduce duplicate extensions: common extension merging, which leverages a black-white vertex encoding, and common extension reusing, which employs common extension buffers. In addition, we design two pruning techniques to discard unpromising search branches. Extensive experiments on real-world datasets and diverse query workloads demonstrate that CEMR outperforms state-of-the-art subgraph matching methods.
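
To make the DFS backtracking paradigm described in the abstract concrete, here is a minimal Python sketch of the generic baseline, not of CEMR itself. It assumes unlabeled, undirected graphs stored as adjacency sets and a fixed matching order (sorted query vertex ids); the names `enumerate_embeddings` and `extend` are hypothetical and do not come from the paper. The candidate set computed at each step is the kind of extension work that can be repeated across different partial embeddings.

```python
from typing import Dict, List, Set

# Assumption: an undirected, unlabeled graph as vertex -> set of neighbors.
Graph = Dict[int, Set[int]]


def enumerate_embeddings(query: Graph, data: Graph) -> List[Dict[int, int]]:
    """Enumerate injective mappings of query vertices to data vertices
    that preserve every query edge (plain DFS backtracking baseline)."""
    order = sorted(query)  # hypothetical fixed matching order
    results: List[Dict[int, int]] = []

    def extend(partial: Dict[int, int], depth: int) -> None:
        if depth == len(order):
            results.append(dict(partial))  # full embedding found
            return
        u = order[depth]
        used = set(partial.values())
        # Data vertices already assigned to query neighbors of u.
        matched = [partial[w] for w in query[u] if w in partial]
        if matched:
            # Extensions of u: common neighbors of all matched neighbors.
            candidates = set.intersection(*(data[v] for v in matched))
        else:
            candidates = set(data)
        for v in sorted(candidates - used):
            partial[u] = v
            extend(partial, depth + 1)
            del partial[u]  # backtrack

    extend({}, 0)
    return results


if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    # A triangle on vertices 0, 1, 2 with a pendant vertex 3 attached to 2.
    data_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    for embedding in enumerate_embeddings(triangle, data_graph):
        print(embedding)
```

In this sketch, two partial embeddings that assign the same data vertices to the matched neighbors of `u` recompute the same `candidates` set independently; that repetition is the duplicate computation the abstract identifies as the limitation of the paradigm.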

Problem

Research questions and friction points this paper is trying to address.

subgraph matching

NP-hardness

duplicate computation

graph analysis

enumeration

Innovation

Methods, ideas, or system contributions that make the work stand out.

subgraph matching

redundant extension elimination

common extension merging