ACGraph: An Efficient Asynchronous Out-of-Core Graph Processing Framework

📅 2025-11-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large-scale graph data often exceeds main-memory capacity, while existing out-of-core graph processing systems suffer from inefficient I/O (i.e., high read and work amplification) and severe synchronization stalls due to rigidly synchronized iterations, leading to underutilized SSDs. To address these challenges, this paper proposes ACGraph, a novel asynchronous out-of-core graph processing framework designed for SSDs. Its core contributions are: (1) a workload-aware dynamic block-level priority scheduler coupled with an online asynchronous worklist, significantly reducing redundant disk accesses; and (2) deep pipelining of computation and asynchronous I/O, enhanced by a hybrid storage format (optimized for low-degree vertex access) and an active-block in-memory reuse mechanism, thereby sustaining high SSD throughput. Evaluated on BFS, WCC, and PageRank, ACGraph achieves an average 2.3× speedup and 41% higher I/O efficiency over state-of-the-art systems.

📝 Abstract

Graphs are a ubiquitous data structure in diverse domains such as machine learning, social networks, and data mining. As real-world graphs continue to grow beyond the memory capacity of single machines, out-of-core graph processing systems have emerged as a viable solution. Yet, existing systems that rely on strictly synchronous, iteration-by-iteration execution incur significant overheads. In particular, their scheduling mechanisms lead to I/O inefficiencies, stemming from read and work amplification, and induce costly synchronization stalls hindering sustained disk utilization. To overcome these limitations, we present ACGraph, a novel asynchronous graph processing system optimized for SSD-based environments with constrained memory resources. ACGraph employs a dynamic, block-centric priority scheduler that adjusts in real time based on workload, along with an online asynchronous worklist that minimizes redundant disk accesses by efficiently reusing active blocks in memory. Moreover, ACGraph unifies asynchronous I/O with computation in a pipelined execution model that maintains sustained I/O activation, and leverages a highly optimized hybrid storage format to expedite access to low-degree vertices. We implement popular graph algorithms, such as Breadth-First Search (BFS), Weakly Connected Components (WCC), personalized PageRank (PPR), PageRank (PR), and k-core on ACGraph and demonstrate that ACGraph substantially outperforms state-of-the-art out-of-core graph processing systems in both runtime and I/O efficiency.
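The abstract describes a block-centric priority scheduler paired with an asynchronous worklist that reuses active blocks while they are in memory. The sketch below is only an illustration of that general idea, not ACGraph's implementation. It assumes a hypothetical partitioning of vertices into fixed-size blocks by vertex id, a priority equal to the number of active vertices in a block, and an in-memory dictionary standing in for the on-disk graph. The function name `async_block_bfs` and the `block_loads` counter are invented for the example.

```python
import heapq
from collections import defaultdict


def async_block_bfs(adj, source, block_size):
    """Label-correcting BFS driven by a block-level priority worklist.

    adj: dict mapping each integer vertex id to a list of neighbour ids.
    block_size: vertices per block (assumed partitioning by vertex id).
    Returns (dist, block_loads), where block_loads counts simulated block reads.
    """
    inf = float("inf")
    dist = {v: inf for v in adj}
    dist[source] = 0

    def block_of(v):
        return v // block_size

    active = defaultdict(set)              # block id -> active vertices in that block
    active[block_of(source)].add(source)
    heap = [(-1, block_of(source))]        # max-heap on active count, via negation
    block_loads = 0

    while heap:
        _, b = heapq.heappop(heap)
        if not active[b]:
            continue                       # stale entry: block was already drained
        block_loads += 1                   # stand-in for reading block b from disk

        # Keep working on the block while it is "in memory": vertices that are
        # re-activated inside the same block are handled without another load.
        while active[b]:
            v = active[b].pop()
            for u in adj[v]:
                if dist[v] + 1 < dist[u]:  # no global iteration barrier needed
                    dist[u] = dist[v] + 1
                    ub = block_of(u)
                    if u not in active[ub]:
                        active[ub].add(u)
                        if ub != b:
                            heapq.heappush(heap, (-len(active[ub]), ub))

    return dist, block_loads


if __name__ == "__main__":
    graph = {0: [1, 4], 1: [2], 2: [3], 3: [], 4: [5], 5: [3]}
    distances, loads = async_block_bfs(graph, source=0, block_size=2)
    print(distances, loads)
```

Because distances are corrected whenever a shorter path appears, the result matches ordinary BFS hop counts even though blocks are processed in priority order rather than level by level.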

Problem

Research questions and friction points this paper is trying to address.

Improving I/O efficiency in out-of-core graph processing

Reducing synchronization overheads in large-scale graph computations

Optimizing memory usage for SSD-based graph systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Asynchronous execution model for out-of-core graph processing

Dynamic block-centric scheduler adapting to workload changes

Pipelined I/O and computation with hybrid storage format
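The pipelining of I/O and computation listed above can be pictured with a small producer/consumer sketch. This is a simplified illustration under stated assumptions, not the paper's design: the summary does not say which asynchronous I/O interface ACGraph uses, so a background thread and a bounded queue stand in for it, a dictionary named `disk` stands in for the SSD, and summing a block stands in for the per-block graph computation. The name `pipelined_sum` and the `depth` parameter are hypothetical.

```python
import queue
import threading


def pipelined_sum(disk, block_ids, depth=2):
    """Overlap simulated block reads with computation using a bounded queue.

    disk: dict mapping block id -> list of numbers (stand-in for SSD blocks).
    block_ids: order in which blocks should be fetched.
    depth: maximum number of fetched blocks buffered in memory at once.
    """
    ready = queue.Queue(maxsize=depth)
    done = object()                      # sentinel marking the end of the stream

    def loader():
        try:
            for b in block_ids:
                data = disk[b]           # stand-in for an asynchronous read
                ready.put((b, data))     # blocks when the buffer is full
        finally:
            ready.put(done)              # always signal completion to the consumer

    worker = threading.Thread(target=loader, daemon=True)
    worker.start()

    results = {}
    while True:
        item = ready.get()
        if item is done:
            break
        b, data = item
        results[b] = sum(data)           # stand-in for per-block computation
    worker.join()
    return results


if __name__ == "__main__":
    fake_disk = {0: [1, 2], 1: [3], 2: []}
    print(pipelined_sum(fake_disk, [2, 0, 1]))
```

The bounded queue gives back-pressure: the loader runs ahead of the compute loop by at most `depth` blocks, which keeps memory use fixed while the next read is issued before the current block has been fully processed.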

🔎 Similar Papers

FedGraph: A Research Library and Benchmark for Federated Graph Learning

2024-10-08, arXiv.org, Citations: 0
