🤖 AI Summary
To address the storage, transmission, and analysis bottlenecks arising from exascale scientific data volumes, this paper proposes HPDR, a lightweight data reduction framework that balances GPU acceleration with cross-platform portability. The framework integrates heterogeneous-processor scheduling, zero-copy memory access, adaptive lossy compression, and MPI+GPU co-optimized I/O. It reduces memory-transfer overhead to just 2.3% and achieves near-linear multi-GPU scalability (96% of the theoretical speedup). On the Frontier supercomputer, it delivers an end-to-end throughput of 103 TB/s, up to 3.5× higher throughput than state-of-the-art solutions, and up to 4× faster parallel I/O at scale. Its core contribution lies in unifying high throughput, low overhead, and cross-architecture portability in a single framework, establishing an efficient, scalable compression and reduction infrastructure for exascale data processing.
📝 Abstract
The rapid growth of scientific data is outpacing advancements in computing, creating challenges in storage, transfer, and analysis, particularly at the exascale. While data reduction techniques such as lossless and lossy compression help mitigate these issues, their computational overhead introduces new bottlenecks. GPU-accelerated approaches improve performance but face challenges in portability, memory transfer, and scalability on multi-GPU systems. To address these challenges, we propose HPDR, a high-performance, portable data reduction framework. HPDR supports diverse processor architectures, reducing memory-transfer overhead to 2.3% and achieving up to 3.5× faster throughput than existing solutions. It attains 96% of the theoretical speedup in multi-GPU settings. Evaluations on the Frontier supercomputer demonstrate 103 TB/s throughput and up to 4× acceleration in parallel I/O performance at scale. HPDR offers a scalable, efficient solution for managing massive data volumes in exascale computing environments.