A Survey on Error-Bounded Lossy Compression for Scientific Datasets

📅 2024-04-03
🏛️ arXiv.org
📈 Citations: 6
Influential: 0
🤖 AI Summary
Error-bounded lossy compression for scientific data has lacked a systematic survey and a unified classification framework. Method: This work proposes a taxonomy that groups existing approaches into six classic compression models; surveys 10 commonly used compression components/modules; and reviews 46 state-of-the-art compressors, summarizing their pros and cons and the compression techniques on which each is built. It also discusses how customized compressors are designed for specific scientific applications and use cases. Contribution/Results: The survey is intended as a reference for the scientific application, high-performance computing, lossy compression, and big data communities.

📝 Abstract
Error-bounded lossy compression has been effective in significantly reducing the data storage/transfer burden while preserving the fidelity of the reconstructed data very well. Over the years, many error-bounded lossy compressors have been developed for a wide range of parallel and distributed use cases. They are designed with distinct compression models and principles, so each of them has particular pros and cons. In this paper we provide a comprehensive survey of emerging error-bounded lossy compression techniques. The key contributions are fourfold. (1) We summarize a novel taxonomy of lossy compression into 6 classic models. (2) We provide a comprehensive survey of 10 commonly used compression components/modules. (3) We summarize the pros and cons of 46 state-of-the-art lossy compressors and present how state-of-the-art compressors are designed based on different compression techniques. (4) We discuss how customized compressors are designed for specific scientific applications and use cases. We believe this survey is useful to multiple communities including scientific applications, high-performance computing, lossy compression, and big data.
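To make the term "error-bounded" concrete, here is a minimal sketch of one common way an absolute error bound can be enforced: uniform scalar quantization with a step of twice the bound, followed by a general-purpose lossless coder. It is an illustration only and is not taken from the survey or from any of the compressors it covers. The function names `compress`, `decompress`, and `max_abs_error` are hypothetical, and `zlib` stands in for the entropy-coding stage.

```python
import struct
import zlib


def compress(values, abs_err):
    """Quantize each value to the nearest multiple of 2*abs_err, then deflate.

    Illustrative assumption: a plain uniform quantizer with no prediction
    step and no outlier handling.
    """
    if abs_err <= 0:
        raise ValueError("abs_err must be positive")
    step = 2.0 * abs_err
    # Rounding to the nearest multiple of `step` moves a value by at most
    # step / 2, which equals abs_err (up to floating-point rounding).
    codes = [round(v / step) for v in values]
    payload = struct.pack(f"<{len(codes)}q", *codes)  # signed 64-bit integers
    return zlib.compress(payload)


def decompress(blob, abs_err):
    """Invert compress(): inflate, then scale the integer codes back."""
    step = 2.0 * abs_err
    payload = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(payload) // 8}q", payload)
    return [c * step for c in codes]


def max_abs_error(original, reconstructed):
    """Largest pointwise absolute difference between two sequences."""
    return max(abs(a - b) for a, b in zip(original, reconstructed))
```

A caller picks the bound, compresses, and can then check that `max_abs_error(values, decompress(compress(values, eb), eb))` does not exceed `eb` beyond floating-point rounding. The pointwise guarantee, rather than an average-case quality target, is what distinguishes this family of compressors.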
Problem

Research questions and friction points this paper is trying to address.

Survey emerging error-bounded lossy compression techniques
Summarize the pros and cons of 46 state-of-the-art lossy compressors
Explain how customized compressors are designed for specific scientific applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Taxonomy of lossy compression into 6 classic models
Comprehensive survey of 10 commonly used compression components/modules
Discussion of customized compressors for specific applications