🤖 AI Summary
Scientific applications generate massive datasets, yet existing progressive compression methods are predominantly CPU-oriented and do not exploit the GPUs in today's heterogeneous computing systems. This paper introduces HP-MDR, a high-performance, portable GPU framework for data refactoring and progressive retrieval. Its core contributions are: (1) GPU optimizations for bitplane encoding and lossless encoding, the two key stages of progressive methods; (2) a pipeline optimization integrated into the refactoring and progressive retrieval workflows to speed up large-data processing; and (3) high-performance retrieval with guaranteed error control for common Quantities of Interest (QoIs). Experiments on five real-world datasets show that, compared with state-of-the-art solutions, HP-MDR achieves up to 6.6× the throughput for data refactoring and progressive retrieval, 10.4× the throughput for recomposing the required data representations under QoI error control, and 4.2× the end-to-end retrieval performance.
📝 Abstract
Scientific applications produce vast amounts of data, posing grand challenges for the underlying data management and analytics tasks. Progressive compression is a promising way to address this problem, as it allows on-demand data retrieval with significantly reduced data movement cost. However, most existing progressive methods are designed for CPUs, leaving the power of today's GPU-equipped heterogeneous computing systems untapped. In this work, we propose HP-MDR, a high-performance and portable data refactoring and progressive retrieval framework for GPUs. Our contributions are four-fold: (1) We carefully optimize bitplane encoding and lossless encoding, two key stages in progressive methods, to achieve high performance on GPUs; (2) We propose a pipeline optimization and incorporate it into the data refactoring and progressive retrieval workflows to further enhance performance for large-scale data processing; (3) We leverage our framework to enable high-performance data retrieval with guaranteed error control for common Quantities of Interest; (4) We evaluate HP-MDR and compare it with state-of-the-art methods using five real-world datasets. Experimental results demonstrate that HP-MDR delivers up to 6.6x throughput in data refactoring and progressive retrieval tasks. It also achieves 10.4x throughput for recomposing the required data representations under Quantity-of-Interest error control and 4.2x performance for the corresponding end-to-end data retrieval, when compared with state-of-the-art solutions.
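To make the idea of bitplane encoding with error-controlled progressive retrieval concrete, here is a minimal CPU-side sketch in plain Python. It is not HP-MDR's implementation: it omits the lossless-encoding stage, the pipelining, and all GPU parallelism, and the function names (`encode_bitplanes`, `decode_bitplanes`, `error_bound`, `planes_for_tolerance`) are hypothetical names chosen for illustration. It assumes a simple fixed-point representation with one shared exponent, so that reading only the first `keep` bitplanes gives a reconstruction whose error is below `2**(exp - keep)`.

```python
import math


def encode_bitplanes(values, num_planes=16):
    """Split values into sign bits and magnitude bitplanes (most significant first)."""
    max_abs = max(abs(v) for v in values)
    # Shared exponent chosen so that every |v| / 2**exp lies in [0, 1).
    exp = math.frexp(max_abs)[1] if max_abs > 0 else 0
    scale = 2.0 ** (num_planes - exp)
    mags = [int(abs(v) * scale) for v in values]  # truncated fixed-point magnitudes
    signs = [v < 0 for v in values]
    planes = [
        [(m >> (num_planes - 1 - p)) & 1 for m in mags]
        for p in range(num_planes)
    ]
    return exp, signs, planes


def decode_bitplanes(exp, signs, planes, num_planes, keep):
    """Reconstruct an approximation from only the first `keep` bitplanes."""
    mags = [0] * len(signs)
    for p in range(keep):
        shift = num_planes - 1 - p
        mags = [m | (bit << shift) for m, bit in zip(mags, planes[p])]
    scale = 2.0 ** (exp - num_planes)
    return [(-m if s else m) * scale for m, s in zip(mags, signs)]


def error_bound(exp, keep):
    """Upper bound on the absolute error after reading `keep` bitplanes."""
    return 2.0 ** (exp - keep)


def planes_for_tolerance(exp, tolerance, num_planes):
    """Smallest number of bitplanes whose error bound meets `tolerance`."""
    for keep in range(num_planes + 1):
        if error_bound(exp, keep) <= tolerance:
            return keep
    return num_planes  # tolerance is tighter than this encoding can guarantee


# Usage: retrieve only as many bitplanes as a requested tolerance needs.
data = [0.75, -3.2, 1.5, 0.001, -0.42]
exp, signs, planes = encode_bitplanes(data, num_planes=16)
keep = planes_for_tolerance(exp, tolerance=0.01, num_planes=16)
approx = decode_bitplanes(exp, signs, planes, 16, keep)
max_err = max(abs(a - b) for a, b in zip(data, approx))
print(f"read {keep} of 16 bitplanes, max error {max_err:.5f}")
```

Because the bitplanes are stored most significant first, a reader with a looser tolerance stops earlier and moves less data, which is the property that progressive retrieval relies on.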