HP-MDR: High-performance and Portable Data Refactoring and Progressive Retrieval with Advanced GPUs

📅 2025-05-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Scientific applications generate massive datasets, yet existing progressive compression methods are predominantly CPU-oriented and fail to exploit today's GPU-based heterogeneous computing systems. This paper introduces HP-MDR, a high-performance, portable GPU-accelerated framework for data refactoring and progressive retrieval. Its core contributions are: (1) systematic GPU optimization of bitplane encoding and adaptive lossless compression, the two key stages of progressive methods; (2) a multi-stage pipelined architecture that tightly integrates the refactoring and progressive retrieval workflows for large-scale data; and (3) high-performance data retrieval with guaranteed error control for common Quantities of Interest. Experiments on five real-world datasets show that HP-MDR achieves up to 6.6× the throughput of state-of-the-art methods for data refactoring and progressive retrieval, 10.4× the throughput for recomposing required data representations under Quantity-of-Interest error control, and 4.2× the corresponding end-to-end retrieval performance.

📝 Abstract
Scientific applications produce vast amounts of data, posing grand challenges in the underlying data management and analytic tasks. Progressive compression is a promising way to address this problem, as it allows for on-demand data retrieval with significantly reduced data movement cost. However, most existing progressive methods are designed for CPUs, leaving a gap for them to unleash the power of today's heterogeneous computing systems with GPUs. In this work, we propose HP-MDR, a high-performance and portable data refactoring and progressive retrieval framework for GPUs. Our contributions are four-fold: (1) We carefully optimize bitplane encoding and lossless encoding, two key stages in progressive methods, to achieve high performance on GPUs; (2) We propose pipeline optimizations and incorporate them into the data refactoring and progressive retrieval workflows to further enhance performance on large-scale data; (3) We leverage our framework to enable high-performance data retrieval with guaranteed error control for common Quantities of Interest; (4) We evaluate HP-MDR and compare it with the state of the art using five real-world datasets. Experimental results demonstrate that HP-MDR delivers up to 6.6x the throughput of state-of-the-art solutions in data refactoring and progressive retrieval tasks. It also achieves 10.4x throughput for recomposing required data representations under Quantity-of-Interest error control and 4.2x performance for the corresponding end-to-end data retrieval.
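The bitplane encoding that the abstract names as a key stage underlies the whole progressive idea: each value is split into bitplanes (most significant first), so a reader can fetch only the leading planes and reconstruct a coarse approximation, then refine it on demand. A minimal pure-Python sketch of that idea, assuming non-negative 8-bit integers (the helper names here are illustrative, not HP-MDR's API):

```python
# Hypothetical sketch of bitplane encoding for progressive retrieval.
# Each plane holds one bit position from every value; retrieving only
# the leading (most significant) planes yields a coarse reconstruction.

def encode_bitplanes(values, num_bits=8):
    """Split non-negative integers into bitplanes, MSB plane first."""
    return [[(v >> b) & 1 for v in values]
            for b in range(num_bits - 1, -1, -1)]

def decode_bitplanes(planes, num_bits=8):
    """Reconstruct from the first k retrieved planes (k <= num_bits)."""
    out = [0] * len(planes[0])
    for i, plane in enumerate(planes):
        b = num_bits - 1 - i  # bit position this plane carries
        for j, bit in enumerate(plane):
            out[j] |= bit << b
    return out

data = [200, 13, 97, 255]
planes = encode_bitplanes(data)
coarse = decode_bitplanes(planes[:4])  # only the 4 leading planes
exact = decode_bitplanes(planes)       # all planes: lossless
```

Fetching half the planes here moves half the data and bounds the error per value by 2^4 - 1; real progressive frameworks apply this per transform coefficient and pair it with lossless coding of each plane.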
Problem

Research questions and friction points this paper is trying to address.

Addresses high-performance data refactoring for GPUs
Optimizes progressive retrieval with error control
Enhances throughput in scientific data management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizes bitplane and lossless encoding for GPUs
Proposes pipeline optimization for large data processing
Enables high-performance retrieval with error control
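The pipeline optimization listed above amounts to overlapping the stages of the workflow across data chunks: while chunk i is being losslessly compressed, chunk i+1 can already be bitplane-encoded. HP-MDR does this with GPU execution; the sketch below is a hedged CPU analogue using threads in place of GPU streams, with a placeholder identity function standing in for the real bitplane encoder:

```python
# Hypothetical CPU analogue of a two-stage pipelined refactoring loop:
# stage 2 (lossless compression) of chunk i runs asynchronously while
# stage 1 (bitplane encoding) of chunk i+1 proceeds on the main thread.
import zlib
from concurrent.futures import ThreadPoolExecutor

def bitplane_stage(chunk: bytes) -> bytes:
    return chunk  # placeholder for a real bitplane encoder

def lossless_stage(chunk: bytes) -> bytes:
    return zlib.compress(chunk)

def pipelined_refactor(chunks):
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None
        for chunk in chunks:
            encoded = bitplane_stage(chunk)      # stage 1 on chunk i
            if pending is not None:
                results.append(pending.result()) # collect chunk i-1
            pending = pool.submit(lossless_stage, encoded)  # stage 2, async
        if pending is not None:
            results.append(pending.result())
    return results
```

With more than two stages (encoding, compression, device-to-host transfer), the same pattern generalizes to a deeper pipeline, which is where the paper's throughput gains on large data come from.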
Yanliang Li
University of Oregon, OR, USA
Wenbo Li
The Chinese University of Hong Kong
Computer Vision, Deep Learning
Qian Gong
Oak Ridge National Lab, Fermilab, Duke University
Lossy Compression, GPU & Parallel Computing, Network Traffic Analysis, Deep Learning, X-ray Physics Simulation
Qing Liu
New Jersey Institute of Technology, NJ, USA
N. Podhorszki
Oak Ridge National Laboratory, TN, USA
S. Klasky
Oak Ridge National Laboratory, TN, USA
Xin Liang
University of Kentucky, KY, USA
Jieyang Chen
University of Oregon
HPC, GPU, ML System Optimization, Scientific Data