PSZ: Enhancing the SZ Scientific Lossy Compressor With Progressive Data Retrieval

📅 2025-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of simultaneously achieving high compression ratios and low operational overhead in progressive scientific data compression amid explosive data growth, this paper proposes the first interpolation-based lossy compression method supporting progressive retrieval. It extends high-precision interpolation modeling into a unified framework integrating multi-level bitplane coding, predictive coding, and error-characteristic-driven progressive bitstream organization. It further derives an optimal progressive retrieval strategy tailored to user-specified error bounds or target bitrates. Experimental results demonstrate that, compared with state-of-the-art progressive compressors, the method achieves up to 487% higher compression ratios and up to 698% faster decompression. Under identical error bounds, it reduces the volume of retrieved data by up to 83%; under identical bitrates, it lowers reconstruction error by up to 99%.

📝 Abstract
Compression is a crucial solution for data reduction in modern scientific applications due to the exponential growth of data from simulations, experiments, and observations. Compression with progressive retrieval capability allows users to access coarse approximations of data quickly and then incrementally refine these approximations to higher fidelity. Existing progressive compression solutions suffer from low reduction ratios or high operation costs, effectively undermining the approach's benefits. In this paper, we propose the first-ever interpolation-based progressive lossy compression solution that has both high reduction ratios and low operation costs. The interpolation-based algorithm has been verified as one of the best for scientific data reduction, but no prior effort has made it support progressive retrieval. Our contributions are three-fold: (1) We thoroughly analyze the error characteristics of the interpolation algorithm and propose our solution IPComp with multi-level bitplane and predictive coding. (2) We derive optimized strategies toward minimum data retrieval under different fidelity levels indicated by users through error bounds and bitrates. (3) We evaluate the proposed solution using six real-world datasets from four diverse domains. Experimental results demonstrate that our solution achieves up to 487% higher compression ratios and 698% faster speed than other state-of-the-art progressive compressors, reduces the data volume for retrieval by up to 83% compared to baselines under the same error bound, and reduces the error by up to 99% under the same bitrate.
Problem

Research questions and friction points this paper is trying to address.

Enhancing data compression efficiency
Reducing operation costs in compression
Supporting progressive data retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interpolation-based progressive compression
Multi-level bitplane predictive coding
Optimized data retrieval strategies
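The core idea behind bitplane-based progressive retrieval can be illustrated with a minimal sketch. This is an assumption-laden toy (hypothetical function names, plain uniform quantization, no interpolation prediction or bitstream optimization), not the paper's IPComp implementation: quantized values are split into bitplanes, and streaming planes from most to least significant lets each additional plane roughly halve the reconstruction error.

```python
import numpy as np

# Toy sketch of progressive bitplane retrieval (illustrative only, not IPComp):
# quantize data to fixed-width integers, split into bitplanes, then reconstruct
# using only the planes retrieved so far.

def to_bitplanes(data, num_bits=16):
    """Quantize to num_bits unsigned integers and split into MSB-first bitplanes."""
    lo, hi = data.min(), data.max()
    scale = (2**num_bits - 1) / (hi - lo)
    q = np.round((data - lo) * scale).astype(np.uint32)
    planes = [(q >> b) & 1 for b in range(num_bits - 1, -1, -1)]  # MSB first
    return planes, lo, scale

def from_bitplanes(planes, lo, scale, num_bits=16):
    """Reconstruct an approximation from the first len(planes) bitplanes."""
    q = np.zeros(planes[0].shape, dtype=np.uint32)
    for i, p in enumerate(planes):
        q |= p.astype(np.uint32) << (num_bits - 1 - i)
    return lo + q / scale

rng = np.random.default_rng(0)
data = rng.normal(size=1000)
planes, lo, scale = to_bitplanes(data)

# Progressive retrieval: the max error shrinks as more bitplanes arrive.
for k in (4, 8, 16):
    approx = from_bitplanes(planes[:k], lo, scale)
    print(k, np.abs(data - approx).max())
```

A real progressive compressor additionally predicts values (e.g. by interpolation) and bitplane-encodes only the residuals, which is what makes high compression ratios possible; this sketch shows only the retrieval mechanism.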
🔎 Similar Papers
Zhuoxun Yang, Florida State University
  data compression, high performance computing
Sheng Di, Argonne National Laboratory, IEEE Senior Member
  HPC, Data Compression, Resilience, Cloud/Grid Computing/P2P, Federated Learning
Ruoyu Li, Florida State University, Tallahassee, FL, USA
Ximiao Li, Florida State University, Tallahassee, FL, USA
Longtao Zhang, Florida State University
  High Performance Computing, Lossy Compression
Jiajun Huang, University of California, Riverside, Riverside, CA, USA
Jinyang Liu, University of Houston, Houston, TX, USA
Franck Cappello, Argonne National Laboratory, IEEE Fellow
  Parallel Processing, Parallel Computing, High Performance Computing, Fault Tolerance, Data Compression
Kai Zhao, Florida State University, Tallahassee, FL, USA