AI Summary
This study addresses a critical performance bottleneck in GPU-accelerated data processing caused by the use of CPU-oriented default configurations in Parquet files, which fail to exploit the parallelism offered by modern GPUs. The authors systematically analyze how Parquet configuration parameters impact GPU scan performance and propose a novel approach that optimizes data layout and I/O strategies through GPU-aware configuration tuning, without modifying the Parquet format specification. Their work demonstrates for the first time that the bottleneck stems from suboptimal configuration choices rather than inherent limitations of the format itself. By tailoring configurations to GPU architectural characteristics, the method achieves an effective read bandwidth of up to 125 GB/s, substantially alleviating the GPU scan bottleneck while remaining fully compliant with the existing Parquet standard.
Abstract
Parquet is the de facto columnar file format in modern analytical systems, yet its configuration guidelines have largely been shaped by CPU-centric execution models. As GPU-accelerated data processing becomes increasingly prevalent, Parquet files generated with CPU-oriented defaults can severely underutilize GPU parallelism, turning GPU scans into a performance bottleneck.
In this work, we systematically study how Parquet configurations affect GPU scan performance. We show that Parquet's poor GPU performance is not inherent to the format itself but rather a consequence of suboptimal configuration choices. By applying GPU-aware configurations, we increase effective read bandwidth up to 125 GB/s without modifying the Parquet specification.
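To give a rough sense of why writer configuration can matter for GPU parallelism, the sketch below counts how many row groups, column chunks, and pages a file would contain under a given configuration. It is a back-of-the-envelope model only: the function name `scan_units`, the equal-width-column assumption, and the example sizes (128 MiB row groups, 1 MiB versus 64 KiB pages) are illustrative assumptions, not the parameters or values studied in the paper.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def scan_units(total_bytes: int, num_columns: int,
               row_group_bytes: int, page_bytes: int) -> dict:
    """Estimate the number of independently addressable units in a file.

    Hypothetical model: all columns have equal width, and sizes are
    uncompressed. Real files vary with encoding and compression.
    """
    row_groups = ceil_div(total_bytes, row_group_bytes)
    # Bytes held by one column inside one (full) row group.
    chunk_bytes = min(total_bytes, row_group_bytes) // num_columns
    pages_per_chunk = max(1, ceil_div(chunk_bytes, page_bytes))
    column_chunks = row_groups * num_columns
    return {
        "row_groups": row_groups,
        "column_chunks": column_chunks,
        "pages": column_chunks * pages_per_chunk,
    }


MIB = 1024 * 1024
KIB = 1024

# Illustrative 1 GiB table with 16 columns (assumed values).
large_pages = scan_units(1024 * MIB, 16, 128 * MIB, 1 * MIB)
small_pages = scan_units(1024 * MIB, 16, 128 * MIB, 64 * KIB)
print(large_pages)
print(small_pages)
```

Under this simplified model, shrinking the page size multiplies the number of pages that could be decoded in parallel without changing the file format, which is the kind of configuration-only lever the abstract describes. Which settings actually help on a given GPU is an empirical question that this sketch does not answer.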