🤖 AI Summary
Addressing the challenge of simultaneously achieving high compression throughput and controllable reconstruction error for irregular, large-scale particle-based simulations and point-cloud data on modern GPUs, this paper proposes a hardware-aware, error-bounded lossy compression framework. Our method introduces an innovative four-stage parallel pipelined architecture that jointly optimizes kernel scheduling, memory access patterns, and Streaming Multiprocessor (SM) occupancy. It integrates GPU-accelerated parallel entropy coding, adaptive per-block error control, and fine-grained memory layout optimization. Evaluated on six real-world scientific datasets, our framework outperforms five state-of-the-art GPU compressors: it achieves up to 8× higher end-to-end throughput while delivering superior compression ratios and reconstruction fidelity—approaching the theoretical hardware throughput limit of contemporary GPUs.
📝 Abstract
Particle-based simulations and point-cloud applications generate massive, irregular datasets that challenge storage, I/O, and real-time analytics. Traditional compression techniques struggle with irregular particle distributions and GPU architectural constraints, often resulting in limited throughput and suboptimal compression ratios. In this paper, we present GPZ, a high-performance, error-bounded lossy compressor designed specifically for large-scale particle data on modern GPUs. GPZ employs a novel four-stage parallel pipeline that synergistically balances high compression efficiency with the architectural demands of massively parallel hardware. We introduce a suite of targeted optimizations for computation, memory access, and GPU occupancy that enables GPZ to achieve near-hardware-limit throughput. We conduct an extensive evaluation on three distinct GPU architectures (workstation, data center, and edge) using six large-scale, real-world scientific datasets from five distinct domains. The results demonstrate that GPZ consistently and significantly outperforms five state-of-the-art GPU compressors, delivering up to 8x higher end-to-end throughput while simultaneously achieving superior compression ratios and data quality.