🤖 AI Summary
To address the theoretical inefficiency, limited parallelism, and performance bottlenecks of k-core decomposition on large-scale graphs, this paper proposes a simple, general-purpose parallel framework that is both work-efficient in theory and highly parallel in practice. Key contributions include: (1) a sampling scheme that reduces contention among high-degree vertices; (2) vertical granularity control (VGC), which lowers scheduling overhead for low-degree vertices; and (3) a hierarchical bucket structure enabling efficient processing of graphs with high coreness values. Evaluated on a 96-core server across 25 real-world and synthetic graphs, the framework outperforms state-of-the-art methods—including ParK, PKC, and Julienne—on 23 of the 25 datasets, achieving speedups of up to 315× over ParK. The framework thus bridges the gap between theoretical work-efficiency and empirical scalability in parallel k-core decomposition.
📝 Abstract
This paper proposes efficient solutions for $k$-core decomposition with high parallelism. The problem of $k$-core decomposition is fundamental in graph analysis and has applications across various domains. However, existing algorithms face significant challenges in achieving work-efficiency in theory and/or high parallelism in practice, and suffer from various performance bottlenecks. We present a simple, work-efficient parallel framework for $k$-core decomposition that is easy to implement and adaptable to various strategies for improving work-efficiency. We introduce two techniques to enhance parallelism: a sampling scheme to reduce contention on high-degree vertices, and vertical granularity control (VGC) to mitigate scheduling overhead for low-degree vertices. Furthermore, we design a hierarchical bucket structure to optimize performance for graphs with high coreness values. We evaluate our algorithm on a diverse set of real-world and synthetic graphs. Compared to state-of-the-art parallel algorithms, including ParK, PKC, and Julienne, our approach demonstrates superior performance on 23 out of 25 graphs when tested on a 96-core machine. Our algorithm shows speedups of up to 315$\times$ over ParK, 33.4$\times$ over PKC, and 52.5$\times$ over Julienne.
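For context, the baseline that all of these parallel algorithms refine is the classic sequential bucket-based "peeling" procedure: repeatedly remove a minimum-degree vertex, assign it the current peeling level $k$ as its coreness, and decrement its neighbors' degrees. Below is a minimal sequential sketch of that procedure (not the paper's parallel framework); the function name and graph representation are illustrative.

```python
from collections import defaultdict

def core_decomposition(adj):
    """Sequential bucket-based peeling (illustrative sketch).

    adj: dict mapping each vertex to the set of its neighbors
         in an undirected graph.
    Returns a dict mapping each vertex to its coreness.
    """
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    # Bucket vertices by current degree; peeling visits buckets
    # in nondecreasing order of k.
    buckets = defaultdict(set)
    for v, d in deg.items():
        buckets[d].add(v)
    core = {}
    k = 0
    for _ in range(len(adj)):
        # Advance k to the next non-empty bucket.
        while not buckets[k]:
            k += 1
        v = buckets[k].pop()
        core[v] = k
        # Removing v lowers each unprocessed neighbor's degree,
        # but never below the current level k.
        for u in adj[v]:
            if u not in core and deg[u] > k:
                buckets[deg[u]].remove(u)
                deg[u] -= 1
                buckets[deg[u]].add(u)
    return core
```

For example, on a triangle {1, 2, 3} with a pendant vertex 4 attached to 3, the pendant has coreness 1 and the triangle vertices have coreness 2. The parallel framework in the paper instead peels all minimum-coreness vertices in rounds, which is where contention on high-degree vertices and scheduling overhead on low-degree ones arise.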