PSZ: Enhancing the SZ Scientific Lossy Compressor With Progressive Data Retrieval

📅 2025-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of simultaneously achieving high compression ratios and low operational overhead in progressive scientific data compression amid explosive data growth, this paper proposes the first interpolation-based lossy compression method supporting progressive retrieval. It extends high-precision interpolation modeling into a unified framework integrating multi-level bitplane coding, predictive coding, and error-characteristic-driven progressive bitstream organization. It further derives an optimal progressive retrieval strategy tailored to user-specified error bounds or target bitrates. Experimental results demonstrate that, compared with state-of-the-art progressive compressors, the method achieves up to 487% higher compression ratios and up to 698% faster decompression. Under identical error bounds, it reduces the volume of retrieved data by up to 83%; under identical bitrates, it lowers reconstruction error by up to 99%.

📝 Abstract
Compression is a crucial solution for data reduction in modern scientific applications due to the exponential growth of data from simulations, experiments, and observations. Compression with progressive retrieval capability allows users to access coarse approximations of data quickly and then incrementally refine these approximations to higher fidelity. Existing progressive compression solutions suffer from low reduction ratios or high operation costs, effectively undermining the approach's benefits. In this paper, we propose the first-ever interpolation-based progressive lossy compression solution that has both high reduction ratios and low operation costs. The interpolation-based algorithm has been verified as one of the best for scientific data reduction, but no prior effort has made it support progressive retrieval. Our contributions are three-fold: (1) We thoroughly analyze the error characteristics of the interpolation algorithm and propose our solution IPComp with multi-level bitplane and predictive coding. (2) We derive optimized strategies toward minimum data retrieval under different fidelity levels indicated by users through error bounds and bitrates. (3) We evaluate the proposed solution using six real-world datasets from four diverse domains. Experimental results demonstrate that our solution achieves up to 487% higher compression ratios and 698% faster speed than other state-of-the-art progressive compressors, reduces the data volume for retrieval by up to 83% compared to baselines under the same error bound, and reduces the error by up to 99% under the same bitrate.
Problem

Research questions and friction points this paper is trying to address.

Enhancing data compression efficiency
Reducing operation costs in compression
Supporting progressive data retrieval
Innovation

Methods, ideas, or system contributions that make the work stand out.

Interpolation-based progressive compression
Multi-level bitplane predictive coding
Optimized data retrieval strategies
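The core idea behind bitplane-based progressive retrieval can be illustrated with a minimal sketch. This is an assumption-laden toy (hypothetical function names, plain uniform quantization, no interpolation prediction or bitstream optimization), not the paper's IPComp implementation: quantized values are split into bitplanes, and streaming planes from most to least significant lets each additional plane roughly halve the reconstruction error.

```python
import numpy as np

# Toy sketch of progressive bitplane retrieval (illustrative only, not IPComp):
# quantize data to fixed-width integers, split into bitplanes, then reconstruct
# using only the planes retrieved so far.

def to_bitplanes(data, num_bits=16):
    """Quantize to num_bits unsigned integers and split into MSB-first bitplanes."""
    lo, hi = data.min(), data.max()
    scale = (2**num_bits - 1) / (hi - lo)
    q = np.round((data - lo) * scale).astype(np.uint32)
    planes = [(q >> b) & 1 for b in range(num_bits - 1, -1, -1)]  # MSB first
    return planes, lo, scale

def from_bitplanes(planes, lo, scale, num_bits=16):
    """Reconstruct an approximation from the first len(planes) bitplanes."""
    q = np.zeros(planes[0].shape, dtype=np.uint32)
    for i, p in enumerate(planes):
        q |= p.astype(np.uint32) << (num_bits - 1 - i)
    return lo + q / scale

rng = np.random.default_rng(0)
data = rng.normal(size=1000)
planes, lo, scale = to_bitplanes(data)

# Progressive retrieval: the max error shrinks as more bitplanes arrive.
for k in (4, 8, 16):
    approx = from_bitplanes(planes[:k], lo, scale)
    print(k, np.abs(data - approx).max())
```

A real progressive compressor additionally predicts values (e.g. by interpolation) and bitplane-encodes only the residuals, which is what makes high compression ratios possible; this sketch shows only the retrieval mechanism.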
🔎 Similar Papers
Zhuoxun Yang, Florida State University
  data compression, high performance computing
Sheng Di, Argonne National Laboratory, IEEE Senior Member
  HPC, Data Compression, Resilience, Cloud/Grid Computing/P2P, Federated Learning
Ruoyu Li, Florida State University, Tallahassee, FL, USA
Ximiao Li, Florida State University, Tallahassee, FL, USA
Longtao Zhang, Florida State University
  High Performance Computing, Lossy Compression
Jiajun Huang, University of California, Riverside, Riverside, CA, USA
Jinyang Liu, University of Houston, Houston, TX, USA
Franck Cappello, Argonne National Laboratory, IEEE Fellow
  Parallel Processing, Parallel Computing, High Performance Computing, Fault Tolerance, Data Compression
Kai Zhao, Florida State University, Tallahassee, FL, USA