Enabling Homomorphic Analytical Operations on Compressed Scientific Data with Multi-stage Decompression

📅 2026-03-26
🤖 AI Summary
This work addresses the high latency introduced by traditional decompression in scientific data analysis, which undermines the storage and transmission benefits of compression. To overcome this limitation, the authors propose a multi-stage, error-bounded decompression and homomorphic analysis framework. By abstracting a generic compression pipeline, the framework enables hierarchical partial decompression and introduces homomorphic operation algorithms tailored to three representative scientific analysis tasks, allowing computations to be performed directly on intermediate compressed representations without full decompression. Implemented atop four mainstream compressors and evaluated across five real-world datasets, the approach consistently reduces data access latency and significantly improves analytical efficiency across diverse workloads.

📝 Abstract
Error-controlled lossy compressors have been widely used in scientific applications to reduce the unprecedented size of scientific data while keeping data distortion within a user-specified threshold. While they significantly mitigate the pressure for data storage and transmission, they prolong the time to access the data because decompression is required to transform the binary compressed data into meaningful floating-point numbers. This incurs noticeable overhead for common analytical operations on scientific data that extract or derive useful information, because the time cost of the operations could be much lower than that of decompression. In this work, we design an error-controlled lossy compression and analytical framework that features multi-stage decompression and homomorphic analytical operation algorithms on intermediate decompressed data for reduced data access latency. Our contributions are threefold. (1) We abstract a generic compression pipeline with partial decompression to multiple intermediate data representations and implement four instances based on state-of-the-art high-throughput scientific data compressors. (2) We carefully design homomorphic algorithms to enable direct operations on intermediate decompressed data for three types of analytical operations on scientific data. (3) We evaluate our approach using five real-world scientific datasets. Experimental evaluations demonstrate that our method achieves significant speedups when performing analytical operations on compressed scientific data across all three targeted analytical operation types.
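
To make the idea of operating on intermediate decompressed data concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes a toy compressor whose only lossy stage is uniform scalar quantization with bin width 2 × error bound, and it treats the integer bin indices as the intermediate representation. The function names (`quantize`, `dequantize`, `mean_full`, `mean_homomorphic`) are hypothetical and chosen for illustration. Under these assumptions the mean can be computed from the integer codes with a single scaling step, rather than reconstructing every floating-point value first.

```python
def quantize(values, error_bound):
    """Toy lossy stage (assumed): map each float to an integer bin index.

    With bin width 2 * error_bound, each reconstructed value differs from
    its original by at most error_bound.
    """
    step = 2.0 * error_bound
    return [round(v / step) for v in values]


def dequantize(codes, error_bound):
    """Final decompression stage: turn integer codes back into floats."""
    step = 2.0 * error_bound
    return [c * step for c in codes]


def mean_full(codes, error_bound):
    """Baseline: fully decompress, then average the floats."""
    recon = dequantize(codes, error_bound)
    return sum(recon) / len(recon)


def mean_homomorphic(codes, error_bound):
    """Average the integer codes directly and rescale once.

    Because dequantization is a linear map, this matches the mean of the
    reconstructed data without materializing the reconstructed array.
    """
    return (2.0 * error_bound) * sum(codes) / len(codes)


if __name__ == "__main__":
    data = [0.13, 1.27, -2.5, 3.14159, 0.0]
    eb = 0.01
    codes = quantize(data, eb)
    print(mean_full(codes, eb), mean_homomorphic(codes, eb))
```

In this sketch the two means agree up to floating-point rounding, and both stay within the error bound of the mean of the original data. Real compressors add prediction and entropy-coding stages, so which operations can be applied at which intermediate stage depends on the specific pipeline.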
Problem

Research questions and friction points this paper is trying to address.

lossy compression
scientific data
data decompression
analytical operations
access latency
Innovation

Methods, ideas, or system contributions that make the work stand out.

homomorphic analytical operations
multi-stage decompression
error-controlled lossy compression
compressed data analytics
intermediate data representation