Recorder: Comprehensive Parallel I/O Tracing and Analysis

📅 2025-01-08

🤖 AI Summary
HPC I/O traces that record every call with all of its parameters grow linearly with application scale. This paper presents Recorder, a parallel I/O tracing tool that traces calls across multiple I/O layers and stores all function parameters for each captured call. To limit trace growth, Recorder uses a pattern-recognition-based compression algorithm that identifies recurring I/O patterns both within a single process and across processes; for applications with typical parallel I/O patterns, the trace size stays constant regardless of execution scale. In evaluations on I/O benchmarks and real-world applications, Recorder stores more information than its predecessor while using approximately 12x less storage space, and a comparison with the profiling tool Darshan shows that it captures detailed I/O information without incurring substantial overhead.

📝 Abstract
This paper presents Recorder, a parallel I/O tracing tool designed to capture comprehensive I/O information on HPC applications. Recorder traces I/O calls across various I/O layers, storing all function parameters for each captured call. The volume of stored information scales linearly with the application's execution scale. To address this, we present a sophisticated pattern-recognition-based compression algorithm. This algorithm identifies and compresses recurring I/O patterns both within individual processes and across multiple processes, significantly reducing space and time overheads. We evaluate the proposed compression algorithm using I/O benchmarks and real-world applications, demonstrating that Recorder can store more information while requiring approximately 12x less storage space compared to its predecessor. Notably, for applications with typical parallel I/O patterns, Recorder achieves a constant trace size regardless of execution scale. Additionally, a comparison with the profiling tool Darshan shows that Recorder captures detailed I/O information without incurring substantial overhead. The richer data collected by Recorder enables new insights and facilitates more in-depth I/O studies, offering valuable contributions to the I/O research community.
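The idea of compressing recurring patterns within and across processes can be illustrated with a minimal Python sketch. This is not Recorder's actual algorithm, which the abstract does not specify. It assumes a simplified trace record of (function, offset, size), and the helper names `compress_process`, `compress_across_ranks`, and `trace_for_rank` are hypothetical. Within a rank, calls whose offsets advance by a constant stride collapse into one pattern; across ranks, patterns that differ only by a start offset linear in the rank collapse into a single entry.

```python
def compress_process(calls):
    """Collapse runs of calls with the same function and size whose
    offsets advance by a constant stride (assumed record layout).

    calls: list of (func, offset, size)
    returns: list of (func, start_offset, size, stride, count)
    """
    patterns = []
    for func, offset, size in calls:
        if patterns:
            pfunc, start, psize, stride, count = patterns[-1]
            if pfunc == func and psize == size:
                if count == 1:
                    # The second call of a run defines the stride.
                    patterns[-1] = (func, start, size, offset - start, 2)
                    continue
                if offset == start + stride * count:
                    patterns[-1] = (func, start, size, stride, count + 1)
                    continue
        patterns.append((func, offset, size, 0, 1))
    return patterns


def compress_across_ranks(per_rank):
    """Merge per-rank patterns when every rank has the same pattern shape
    and each start offset is a linear function a * rank + b.

    per_rank: dict mapping rank -> list of patterns from compress_process
    returns: list of (func, a, b, size, stride, count), or None if the
             ranks do not share one pattern shape.
    """
    ranks = sorted(per_rank)
    first = per_rank[ranks[0]]
    if any(len(per_rank[r]) != len(first) for r in ranks):
        return None
    merged = []
    for i, (func, _, size, stride, count) in enumerate(first):
        starts = []
        for r in ranks:
            f, s, sz, st, c = per_rank[r][i]
            if (f, sz, st, c) != (func, size, stride, count):
                return None
            starts.append(s)
        # Fit start = a * rank + b from the first two ranks, then verify.
        a = (starts[1] - starts[0]) // (ranks[1] - ranks[0]) if len(ranks) > 1 else 0
        b = starts[0] - a * ranks[0]
        if any(s != a * r + b for r, s in zip(ranks, starts)):
            return None
        merged.append((func, a, b, size, stride, count))
    return merged


def trace_for_rank(rank, blocks=8, size=1024):
    """Hypothetical workload: each rank writes `blocks` contiguous blocks
    into its own region of a shared file."""
    base = rank * blocks * size
    return [("pwrite", base + i * size, size) for i in range(blocks)]


per_rank = {r: compress_process(trace_for_rank(r)) for r in range(4)}
merged = compress_across_ranks(per_rank)
print(merged)
```

In this toy workload the merged result is a single entry no matter how many ranks participate, which mirrors the constant-trace-size behavior the abstract reports for typical parallel I/O patterns.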
Problem

Research questions and friction points this paper is trying to address.

High-Performance Computing
Data Flow Analysis
Scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Recorder
Smart Compression
High-Performance Computing Data I/O