New Parallel and Streaming Algorithms for Directed Densest Subgraph

📅 2025-09-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies approximate densest subgraph computation on large directed graphs in the Massively Parallel Computation (MPC) and semi-streaming models. In MPC, it gives a (2+ε)-approximation in Õ(√log n) rounds with sublinear memory per machine, improving on Bahmani et al. (WAW 2014) and Mitrović & Pan (ICML 2024). In semi-streaming, it gives an O(log n)-approximation in a single pass, whereas prior work implies only an Ω̃(n^{1/6})-approximation for a single pass. This is the first deterministic single-pass semi-streaming algorithm for the densest subgraph problem, for both undirected and directed graphs. The same approach is an insertion-only dynamic algorithm with O(log² n) worst-case update time and sublinear memory. Empirically, the single-pass algorithm's approximation on temporal datasets matches that of the O(log n)-pass (2+ε)-approximation algorithm of Bahmani et al. (VLDB 2012), and the MPC algorithm needs fewer rounds than prior work.

📝 Abstract
Finding dense subgraphs is a fundamental problem with applications to community detection, clustering, and data mining. Our work focuses on finding approximate densest subgraphs in directed graphs in computational models for processing massive data. We consider two such models: Massively Parallel Computation (MPC) and semi-streaming. We show how to find a $(2+\varepsilon)$-approximation in $\tilde{O}(\sqrt{\log n})$ MPC rounds with sublinear memory per machine. This improves the state-of-the-art results by Bahmani et al. (WAW 2014) and Mitrović & Pan (ICML 2024). Moreover, we show how to find an $O(\log n)$-approximation in a single pass in semi-streaming. This is in stark contrast to prior work, which implies $\tilde{\Omega}(n^{1/6})$-approximation for a single pass; a better approximation is known only for randomized streams (Mitrović & Pan). This is the first deterministic single-pass semi-streaming algorithm for the densest subgraph problem, both for undirected and directed graphs. Our semi-streaming approach is also an insertion-only dynamic algorithm, attaining the first directed densest subgraph algorithm with $O(\log^2 n)$ worst-case update time while using sub-linear memory. We empirically evaluate our approaches in two ways. First, we illustrate that our single-pass semi-streaming algorithm performs much better than the theoretical guarantee. Specifically, its approximation on temporal datasets matches the $(2+\varepsilon)$-approximation of an $O(\log n)$-pass algorithm by Bahmani et al. (VLDB 2012). Second, we demonstrate that our MPC algorithm requires fewer rounds than prior work.
Problem

Research questions and friction points this paper is trying to address.

Finding approximate densest subgraphs in directed graphs
Developing algorithms for Massively Parallel Computation and semi-streaming models
Improving approximation ratios and efficiency for massive data processing
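
To make the objective concrete, here is a minimal sketch of the quantity being approximated. The page does not define it, so this assumes the standard definition of directed density: for vertex sets S and T, the number of edges from S to T divided by sqrt(|S| * |T|). The function names and the toy graph are illustrative, and the exhaustive search is only a reference for tiny inputs, not one of the paper's algorithms.

```python
from itertools import combinations
from math import sqrt


def directed_density(edges, S, T):
    """Return |E(S, T)| / sqrt(|S| * |T|) for a directed edge list."""
    S, T = set(S), set(T)
    if not S or not T:
        return 0.0
    crossing = sum(1 for u, v in edges if u in S and v in T)
    return crossing / sqrt(len(S) * len(T))


def nonempty_subsets(vertices):
    """Yield every non-empty subset of the vertices as a tuple."""
    for k in range(1, len(vertices) + 1):
        yield from combinations(vertices, k)


def brute_force_densest(vertices, edges):
    """Exhaustive search over all (S, T) pairs; exponential, toy graphs only."""
    best = (0.0, (), ())
    for S in nonempty_subsets(vertices):
        for T in nonempty_subsets(vertices):
            d = directed_density(edges, S, T)
            if d > best[0]:
                best = (d, S, T)
    return best


# Toy graph (assumed for illustration): 0 and 1 both point to 2 and 3; 4 points to 0.
edges = [(0, 2), (0, 3), (1, 2), (1, 3), (4, 0)]
density, S, T = brute_force_densest(range(5), edges)
```

An algorithm is an α-approximation if the pair it returns has density at least 1/α times this optimum.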
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel algorithm for directed densest subgraph approximation
Single-pass semi-streaming algorithm with logarithmic approximation
Insertion-only dynamic algorithm with sublinear memory and O(log² n) worst-case update time
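
For context on the multi-pass baseline named in the abstract, the sketch below shows threshold peeling in the spirit of Bahmani et al. (VLDB 2012). It is not the algorithm proposed in this paper. The threshold rule, the size-ratio parameter `c`, and `eps` are assumptions made for illustration, and the loop is written sequentially rather than as streaming passes or MPC rounds.

```python
from math import sqrt


def peel_for_ratio(vertices, edges, c, eps=0.1):
    """Peel S and T toward size ratio c and return the densest pair seen."""
    S, T = set(vertices), set(vertices)
    best = (0.0, set(), set())
    while S and T:
        # Edges still going from S to T; one iteration here plays the role of one pass.
        live = [(u, v) for u, v in edges if u in S and v in T]
        density = len(live) / sqrt(len(S) * len(T))
        if density > best[0]:
            best = (density, set(S), set(T))
        if len(S) / len(T) >= c:
            # Drop sources whose out-degree into T is at most (1 + eps) times the average.
            threshold = (1 + eps) * len(live) / len(S)
            out_deg = {u: 0 for u in S}
            for u, _ in live:
                out_deg[u] += 1
            S -= {u for u in S if out_deg[u] <= threshold}
        else:
            # Drop targets whose in-degree from S is at most (1 + eps) times the average.
            threshold = (1 + eps) * len(live) / len(T)
            in_deg = {v: 0 for v in T}
            for _, v in live:
                in_deg[v] += 1
            T -= {v for v in T if in_deg[v] <= threshold}
    return best


# Toy graph (assumed for illustration): 0 and 1 both point to 2 and 3; 4 points to 0.
toy_edges = [(0, 2), (0, 3), (1, 2), (1, 3), (4, 0)]
peeled = peel_for_ratio(range(5), toy_edges, c=1.0)
```

Each iteration removes at least one vertex, because the minimum degree never exceeds the average. In practice one would run this for several candidate values of `c` and keep the best result.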
Slobodan Mitrović
UC Davis
Distributed graph algorithms. Streaming graph algorithms. Combinatorial optimization. Fast graph algorithms.
Theodore Pan
University of California, Davis
Mahdi Qaempanah
Sharif University of Technology
Mohammad Amin Raeisi
Yale University