Boosting Scientific Error-Bounded Lossy Compression through Optimized Synergistic Lossy-Lossless Orchestration

📅 2025-07-15
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the dual requirements of high compression ratios and low-latency, error-bounded compression for massive GPU-generated scientific data in HPC, this paper proposes cuSZ-Hi, the first GPU-native, error-bounded compression framework that coordinates lossy and lossless stages and is tailored for diverse scientific datasets. Its core contributions are: (1) a GPU-optimized adaptive parallel interpolation predictor that accurately captures complex local data features; (2) a tightly integrated lossless encoding pipeline that jointly optimizes quantization error control and entropy coding for synergistic gains; and (3) a domain-agnostic, fully open-source end-to-end architecture. Experiments demonstrate that, under identical error bounds, cuSZ-Hi achieves up to 249% higher compression ratios; under identical decompressed PSNR, it delivers up to 215% improvement in compression ratio, while matching or exceeding the throughput of state-of-the-art methods.
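To make "error-bounded" and "quantization error control" concrete, here is a minimal sketch of residual quantization with bin width twice the error bound. The function names and the scalar interface are assumptions made for illustration; this is not cuSZ-Hi's code, only the general idea that a prediction residual is rounded to an integer code so that the reconstructed value stays within the bound.

```python
def quantize(value, prediction, eb):
    """Round the prediction residual to an integer code (bin width 2 * eb)."""
    return round((value - prediction) / (2.0 * eb))


def dequantize(code, prediction, eb):
    """Rebuild the value from the same prediction and the stored code."""
    return prediction + 2.0 * eb * code


# Hypothetical sample: one value, one prediction, absolute error bound 0.01.
code = quantize(0.7364, 0.7, 0.01)
restored = dequantize(code, 0.7, 0.01)
# The residual is rounded to the nearest bin centre, so the reconstruction
# error is at most eb (up to floating-point rounding).
assert abs(0.7364 - restored) <= 0.01 * (1 + 1e-9)
```

A better predictor yields smaller residuals, hence codes clustered near zero, which is what makes the subsequent lossless stage effective.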

📝 Abstract
As high-performance computing architectures evolve, more scientific computing workflows are being deployed on advanced computing platforms such as GPUs. These workflows can produce raw data at extremely high throughput, creating an urgent need for high-ratio, low-latency error-bounded data compression. In this paper, we propose cuSZ-Hi, an optimized high-ratio GPU-based scientific error-bounded lossy compressor with a flexible, domain-agnostic, and fully open-source framework design. Our contributions are: 1) We optimize the parallelized interpolation-based data prediction scheme on GPUs, enabling the full functionality of interpolation-based scientific data prediction that adapts to diverse data characteristics; 2) We investigate lossless data encoding techniques, then craft and incorporate the best-fit lossless encoding pipelines to maximize the compression ratio of cuSZ-Hi; 3) We systematically evaluate cuSZ-Hi on benchmark datasets together with representative baselines. With throughput comparable to or better than that of existing high-ratio scientific error-bounded lossy compressors on GPUs, cuSZ-Hi achieves up to 249% compression ratio improvement over existing state-of-the-art scientific lossy compressors under the same error bound, and up to 215% compression ratio improvement under the same decompressed-data PSNR.
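The reported figures rest on two metrics, compression ratio and PSNR. The sketch below shows how they are commonly computed; it assumes the value-range-based PSNR definition widely used for scientific lossy compressors and reads "X% improvement" as a relative gain over the baseline ratio. Both readings are assumptions, since this page does not spell out the definitions, and the numbers in the example are made up.

```python
import math


def compression_ratio(original_bytes, compressed_bytes):
    """Original size divided by compressed size."""
    return original_bytes / compressed_bytes


def improvement_percent(cr_new, cr_baseline):
    """Relative gain of one compression ratio over another, in percent."""
    return (cr_new / cr_baseline - 1.0) * 100.0


def psnr(original, decompressed):
    """PSNR in dB, using the value range of the original data as the peak."""
    value_range = max(original) - min(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, decompressed)) / len(original)
    if mse == 0.0:
        return math.inf
    return 20.0 * math.log10(value_range) - 10.0 * math.log10(mse)


# Under this reading, a 249% improvement means a ratio about 3.49x the baseline,
# e.g. a hypothetical baseline ratio of 10 versus a new ratio of 34.9.
gain = improvement_percent(34.9, 10.0)
```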
Problem

Research questions and friction points this paper is trying to address.

Optimize GPU-based high-ratio error-bounded scientific data compression
Enhance compression ratio via adaptive interpolation-based prediction and lossless encoding
Evaluate performance against state-of-the-art compressors with significant ratio improvements
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimized GPU-based interpolation data prediction
Best-fit lossless encoding for high compression
Domain-agnostic open-source framework design
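The combination of interpolation-based prediction and lossless encoding listed above can be sketched on a 1D array as follows. This is a simplified CPU illustration under stated assumptions, not cuSZ-Hi's GPU implementation: it uses plain linear interpolation over power-of-two strides, keeps the first value as an exact anchor, and uses `zlib` purely as a stand-in for the paper's lossless pipeline. All function names are hypothetical, and the input is assumed to be non-empty with quantization codes that fit in 32-bit integers.

```python
import math
import struct
import zlib


def interp_order(n):
    """Yield (index, stride) pairs from the coarsest level to the finest."""
    stride = 1
    while stride * 2 < n:
        stride *= 2
    while stride >= 1:
        for i in range(stride, n, 2 * stride):
            yield i, stride
        stride //= 2


def predict(recon, i, stride, n):
    """Linear interpolation from already reconstructed neighbours."""
    left = recon[i - stride]
    if i + stride < n:
        return 0.5 * (left + recon[i + stride])
    return left  # no right neighbour: fall back to the left one


def compress(data, eb):
    n = len(data)
    recon = [0.0] * n
    recon[0] = data[0]  # anchor point stored exactly
    codes = []
    for i, stride in interp_order(n):
        pred = predict(recon, i, stride, n)
        q = round((data[i] - pred) / (2.0 * eb))  # error-bounded quantization
        recon[i] = pred + 2.0 * eb * q  # predict later points from lossy values
        codes.append(q)
    payload = struct.pack("<d", data[0]) + struct.pack(f"<{len(codes)}i", *codes)
    return n, zlib.compress(payload, 9)  # lossless stage (stand-in)


def decompress(n, blob, eb):
    payload = zlib.decompress(blob)
    anchor = struct.unpack_from("<d", payload, 0)[0]
    codes = struct.unpack_from(f"<{n - 1}i", payload, 8)
    recon = [0.0] * n
    recon[0] = anchor
    # Replaying the same traversal reproduces the compressor's predictions.
    for q, (i, stride) in zip(codes, interp_order(n)):
        recon[i] = predict(recon, i, stride, n) + 2.0 * eb * q
    return recon


# Hypothetical smooth test signal.
signal = [math.sin(0.05 * i) + 0.1 * math.cos(0.3 * i) for i in range(1000)]
n, blob = compress(signal, 1e-3)
restored = decompress(n, blob, 1e-3)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Because the compressor predicts from reconstructed rather than original values, the decompressor can replay identical predictions, and `max_error` stays within the error bound up to floating-point rounding. On smooth data most codes are small integers, which is why the lossless stage can shrink them substantially.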
Shixun Wu
University of California, Riverside
Jinwen Pan
Technical University of Munich
Jinyang Liu
University of Houston
Jiannan Tian
Assistant Professor, Oakland University
HPC/AI, large-scale data processing and analytics, HW-accelerated compression
Ziwei Qiu
University of Houston
Jiajun Huang
University of South Florida
Kai Zhao
Florida State University
Xin Liang
University of Kentucky
Sheng Di
Argonne National Laboratory, IEEE Senior Member
HPC, Data Compression, Resilience, Cloud/Grid Computing/P2P, Federated Learning
Zizhong Chen
University of California, Riverside
Franck Cappello
Argonne National Laboratory, IEEE Fellow
Parallel Processing, Parallel Computing, High Performance Computing, Fault Tolerance, Data Compression