DTC: Real-Time and Accurate Distributed Triangle Counting in Fully Dynamic Graph Streams

📅 2025-08-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Real-time triangle counting in fully dynamic graph streams, where edges are frequently inserted and deleted and the graph size is not known in advance, remains challenging. Method: This paper proposes the Distributed Triangle Counting (DTC) algorithm family, a set of single-pass distributed streaming algorithms for global and local triangle counting. DTC-AR estimates triangle counts without prior knowledge of graph size, while DTC-FD combines Random Pairing with compensation by future edge insertions to produce unbiased estimates under both insertions and deletions across multiple machines. Contribution/Results: Experiments show that DTC-AR is up to 2029.4× more accurate than baseline methods while keeping the best reported trade-off between accuracy and storage space, and that DTC-FD reduces estimation error by up to 32.5× while scaling linearly with graph stream size.

📝 Abstract
Triangle counting is a fundamental problem in graph mining, essential for analyzing graph streams with arbitrary edge orders. However, exact counting becomes impractical due to the massive size of real-world graph streams. To address this, approximate algorithms have been developed, but existing distributed streaming algorithms lack adaptability and struggle with edge deletions. In this article, we propose DTC, a novel family of single-pass distributed streaming algorithms for global and local triangle counting in fully dynamic graph streams. Our DTC-AR algorithm accurately estimates triangle counts without prior knowledge of graph size, leveraging multi-machine resources. Additionally, we introduce DTC-FD, an algorithm tailored for fully dynamic graph streams, incorporating edge insertions and deletions. Using Random Pairing and future edge insertion compensation, DTC-FD achieves unbiased and accurate approximations across multiple machines. Experimental results demonstrate significant improvements over baselines. DTC-AR achieves up to $2029.4\times$ and $27.1\times$ more accuracy, while maintaining the best trade-off between accuracy and storage space. DTC-FD reduces estimation errors by up to $32.5\times$ and $19.3\times$, scaling linearly with graph stream size. These findings highlight the effectiveness of our proposed algorithms in tackling triangle counting in real-world scenarios. The source code and datasets are released and available at https://github.com/wayne4s/srds-dtc.git.
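
The abstract names Random Pairing as the sampling technique behind DTC-FD. As a rough illustration only, the sketch below implements the classic single-machine Random Pairing sampler, in which each new insertion is first used to compensate an earlier, not yet compensated deletion. It is not the authors' implementation: the class name `RandomPairingSampler` and its fields are hypothetical, and the distributed coordination and the triangle estimator of DTC-FD are not shown.

```python
import random


class RandomPairingSampler:
    """Bounded-size uniform sample over a stream with insertions and deletions.

    Illustrative sketch of classic Random Pairing; all names are assumptions.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.sample = []      # sampled items (for example, edges as tuples)
        self.population = 0   # number of items currently alive in the stream
        self.bad = 0          # uncompensated deletions that removed a sampled item
        self.good = 0         # uncompensated deletions that missed the sample
        self.rng = random.Random(seed)

    def insert(self, item):
        self.population += 1
        pending = self.bad + self.good
        if pending == 0:
            # No deletion to compensate: plain reservoir sampling.
            if len(self.sample) < self.capacity:
                self.sample.append(item)
            elif self.rng.random() < self.capacity / self.population:
                victim = self.rng.randrange(len(self.sample))
                self.sample[victim] = item
        elif self.rng.random() < self.bad / pending:
            # Pair the insertion with a deletion that hit the sample.
            self.sample.append(item)
            self.bad -= 1
        else:
            # Pair the insertion with a deletion that missed the sample.
            self.good -= 1

    def delete(self, item):
        self.population -= 1
        if item in self.sample:
            self.sample.remove(item)
            self.bad += 1
        else:
            self.good += 1


if __name__ == "__main__":
    sampler = RandomPairingSampler(capacity=3, seed=42)
    for edge in [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]:
        sampler.insert(edge)
    sampler.delete((1, 2))
    sampler.insert((1, 4))
    print(sorted(sampler.sample), sampler.bad, sampler.good)
```

The sample never grows beyond `capacity`, and the two counters return to zero once every past deletion has been paired with a later insertion. The linear-time membership test on a list keeps the sketch short; a production sampler would likely use an indexed structure instead.
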
Problem

Research questions and friction points this paper is trying to address.

Accurately counting triangles in massive dynamic graph streams
Handling edge insertions and deletions in distributed systems
Overcoming limitations of existing approximate triangle counting algorithms
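
To make the quantities in these questions concrete, the following is a minimal exact baseline that maintains global and local (per-node) triangle counts under edge insertions and deletions. It stores the whole graph, which is exactly what becomes impractical on massive streams and what approximate methods such as DTC try to avoid. The class name `ExactTriangleCounter` and its methods are hypothetical and do not come from the paper.

```python
from collections import defaultdict


class ExactTriangleCounter:
    """Exact global and local triangle counts for a fully dynamic edge stream."""

    def __init__(self):
        self.adj = defaultdict(set)          # node -> set of neighbours
        self.global_count = 0                # triangles in the current graph
        self.local_count = defaultdict(int)  # node -> triangles containing it

    def _apply(self, u, v, sign):
        # Every common neighbour w of u and v closes one triangle (u, v, w).
        for w in self.adj[u] & self.adj[v]:
            self.global_count += sign
            for node in (u, v, w):
                self.local_count[node] += sign

    def insert(self, u, v):
        if u == v or v in self.adj[u]:
            return  # ignore self-loops and duplicate edges
        self._apply(u, v, +1)
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete(self, u, v):
        if v not in self.adj[u]:
            return  # ignore deletions of absent edges
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        self._apply(u, v, -1)


if __name__ == "__main__":
    counter = ExactTriangleCounter()
    for u, v in [(1, 2), (2, 3), (1, 3), (1, 4), (2, 4), (3, 4)]:
        counter.insert(u, v)
    print(counter.global_count, counter.local_count[1])
    counter.delete(1, 2)
    print(counter.global_count, counter.local_count[1])
```

Each update costs time proportional to the smaller neighbourhood intersection, and memory grows with the number of live edges, so this baseline serves only as ground truth for measuring the error of an approximate estimator.
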
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed streaming algorithm for dynamic graphs
Accurate triangle counting without graph size knowledge
Unbiased approximation with edge insertion compensation
Wei Xuan
Institute of Computing Technology, Chinese Academy of Sciences, China
Yan Liang
Northwestern Polytechnical University
Information fusion, State Estimation, Target tracking
Huawei Cao
Institute of Computing Technology, Chinese Academy of Sciences, China; Zhongguancun Laboratory, China
Ning Lin
Princeton University
Hurricanes, Storm Surge, Climate Adaptation, Coastal Resilience, Risk Analysis
Xiaochun Ye
Institute of Computing Technology, Chinese Academy of Sciences, China
Dongrui Fan
Institute of Computing Technology, Chinese Academy of Sciences
Computer Architecture, Processor Design, Many-core Design