Boosting Scientific Error-Bounded Lossy Compression through Optimized Synergistic Lossy-Lossless Orchestration

📅 2025-07-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To meet the dual demand for high compression ratios and low latency when compressing massive GPU-generated scientific data in HPC, this paper proposes cuSZ-Hi, a GPU-based error-bounded lossy compressor that combines lossy prediction with lossless encoding and targets diverse scientific datasets. Its core contributions are: (1) a GPU-optimized, parallelized interpolation-based predictor that adapts to diverse data characteristics; (2) a lossless encoding pipeline, chosen after a broad exploration of encoding techniques, that maximizes the compression ratio; and (3) a systematic evaluation against representative baselines, all within a flexible, domain-agnostic, fully open-source framework. Experiments show compression-ratio improvements of up to 249% under the same error bound and up to 215% at the same decompressed-data PSNR, with throughput comparable to or better than that of existing high-ratio GPU compressors.

📝 Abstract

As high-performance computing architectures evolve, more scientific computing workflows are being deployed on advanced computing platforms such as GPUs. These workflows can produce raw data at extremely high throughput, creating an urgent need for high-ratio, low-latency error-bounded data compression. In this paper, we propose cuSZ-Hi, an optimized high-ratio GPU-based scientific error-bounded lossy compressor with a flexible, domain-irrelevant, and fully open-source framework design. Our novel contributions are: 1) We maximally optimize the parallelized interpolation-based data prediction scheme on GPUs, enabling the full functionality of interpolation-based scientific data prediction, which adapts to diverse data characteristics; 2) We thoroughly explore lossless data encoding techniques, then craft and incorporate the best-fit lossless encoding pipelines to maximize the compression ratio of cuSZ-Hi; 3) We systematically evaluate cuSZ-Hi on benchmark datasets together with representative baselines. With throughput comparable to or better than that of existing high-ratio scientific error-bounded lossy compressors on GPUs, cuSZ-Hi achieves up to 249% compression ratio improvement over state-of-the-art scientific lossy compressors under the same error bound, and up to 215% compression ratio improvement at the same decompressed-data PSNR.
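To make the idea of interpolation-based, error-bounded prediction concrete, the following is a minimal serial 1D sketch in Python. It is not cuSZ-Hi's GPU kernel: the function names, the linear interpolation, the 1D layout, and the coarse-to-fine traversal are all assumptions chosen for illustration. Each point is predicted from already-reconstructed neighbours, and the residual is quantized in bins of width 2·eb so that the reconstruction stays within the error bound eb (up to floating-point rounding).

```python
def interpolation_order(n):
    """Yield (i, left, right) coarse-to-fine; neighbours are indices visited earlier.

    Hypothetical traversal for illustration, not the paper's scheme.
    """
    if n == 0:
        return
    yield 0, None, None                      # first point has no neighbours
    stride = 1
    while stride * 2 < n:                    # largest power of two below n
        stride *= 2
    while stride >= 1:
        for i in range(stride, n, 2 * stride):
            right = i + stride if i + stride < n else None
            yield i, i - stride, right
        stride //= 2


def predict(recon, left, right):
    """Linear interpolation from reconstructed neighbours (an assumed predictor)."""
    if left is None:
        return 0.0
    if right is None:
        return recon[left]                   # fall back to the left neighbour
    return 0.5 * (recon[left] + recon[right])


def compress(data, eb):
    """Return integer quantization codes and the reconstruction the decoder will see."""
    codes = [0] * len(data)
    recon = [0.0] * len(data)
    for i, left, right in interpolation_order(len(data)):
        pred = predict(recon, left, right)
        q = round((data[i] - pred) / (2.0 * eb))   # bin width is 2 * eb
        codes[i] = q
        recon[i] = pred + 2.0 * eb * q             # differs from data[i] by at most eb
    return codes, recon


def decompress(codes, eb):
    """Replay the same predictions using only the codes."""
    recon = [0.0] * len(codes)
    for i, left, right in interpolation_order(len(codes)):
        recon[i] = predict(recon, left, right) + 2.0 * eb * codes[i]
    return recon
```

Because the encoder predicts from reconstructed values rather than from the original data, the decoder can repeat exactly the same predictions. In this sketch the integer codes are what a lossless encoding stage would then compress.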

Problem

Research questions and friction points this paper is trying to address.

Optimize GPU-based high-ratio error-bounded scientific data compression

Enhance compression ratio via adaptive interpolation-based prediction and lossless encoding

Demonstrate significant compression-ratio improvements over state-of-the-art compressors
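The evaluation criteria mentioned in this summary (compression ratio and PSNR) can be sketched as follows. The helper names are hypothetical, the PSNR form shown is the value-range-based definition commonly used for scientific data, and reading "X% improvement" as multiplying the baseline ratio by (1 + X/100) is an assumption about how the reported figures are meant.

```python
import math


def compression_ratio(original_bytes, compressed_bytes):
    """Original size divided by compressed size."""
    return original_bytes / compressed_bytes


def psnr(original, decompressed):
    """PSNR in dB using the value range of the original data as the peak.

    Assumes non-constant original data (value range greater than zero).
    """
    value_range = max(original) - min(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, decompressed)) / len(original)
    if mse == 0:
        return math.inf                      # lossless reconstruction
    return 20 * math.log10(value_range) - 10 * math.log10(mse)


def improved_ratio(baseline_ratio, improvement_percent):
    """Ratio after a relative improvement (assumed interpretation of 'X% improvement')."""
    return baseline_ratio * (1 + improvement_percent / 100)
```

Under that reading, a hypothetical baseline ratio of 10 with a 249% improvement would correspond to a ratio of about 34.9.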

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimized GPU-based interpolation data prediction

Best-fit lossless encoding for high compression

Domain-irrelevant open-source framework design
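As a rough illustration of why the lossless stage matters, the sketch below estimates how cheaply a stream of quantization codes could be stored by an ideal entropy coder. The summary does not say which encoders cuSZ-Hi uses, so this is only a Shannon-entropy estimate under assumed 32-bit input values, with hypothetical function names.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(codes):
    """Shannon entropy of the code stream, in bits per symbol."""
    total = len(codes)
    counts = Counter(codes)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def estimated_ratio(codes, raw_bits_per_value=32):
    """Idealized ratio if every code cost exactly its entropy (assumes 32-bit inputs)."""
    h = entropy_bits_per_symbol(codes)
    return math.inf if h == 0 else raw_bits_per_value / h
```

A more accurate predictor concentrates the codes around zero, which lowers the entropy and raises the achievable ratio. This is one way to see why prediction and lossless encoding are tuned together.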

🔎 Similar Papers

A Survey on Error-Bounded Lossy Compression for Scientific Datasets