Data-Driven Analysis to Understand GPU Hardware Resource Usage of Optimizations

📅 2024-08-19
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address low per-GPU utilization and poor hardware return on investment in heterogeneous multi-GPU systems, this paper proposes a data-driven analysis framework that relates optimization strategies to GPU hardware resource usage patterns. The method combines hardware performance counter profiling, multi-objective correlation modeling, and scientific proxy application benchmarks to build a set of metrics characterizing application-device interactions. Unlike prior work, which focuses primarily on performance gains, the approach examines how optimizations affect hardware resource usage and device utilization. Applying the identified optimizations to a proxy application improves execution time by up to 29.6%, device utilization by up to 5.3%, and power consumption by up to 26.5%. These results point toward more resource-efficient use of heterogeneous accelerator systems.

📝 Abstract
With heterogeneous systems, the number of GPUs per chip increases to provide computational capabilities for solving science at a nanoscopic scale. However, low utilization of single GPUs undermines the case for investing more money in expensive accelerators. While related work develops optimizations for improving application performance, none studies how these optimizations impact hardware resource usage or the average GPU utilization. This paper takes a data-driven analysis approach to addressing this gap by (1) characterizing how hardware resource usage affects device utilization, execution time, or both, (2) presenting a multi-objective metric to identify important application-device interactions that can be optimized to improve device utilization and application performance jointly, (3) studying hardware resource usage behaviors of several optimizations for a benchmark application, and finally (4) identifying optimization opportunities for several scientific proxy applications based on their hardware resource usage behaviors. Furthermore, we demonstrate the applicability of our methodology by applying the identified optimizations to a proxy application, which improves the execution time, device utilization, and power consumption by up to 29.6%, 5.3%, and 26.5%, respectively.
Problem

Research questions and friction points this paper is trying to address.

Analyzes GPU hardware resource usage impact on utilization and performance.
Develops multi-objective metric to optimize application-device interactions.
Identifies optimization opportunities for scientific applications based on resource usage.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-driven analysis of GPU hardware resource usage
Multi-objective metric for application-device optimization
Identifies optimization opportunities for scientific applications
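
The abstract does not spell out how the multi-objective metric is defined, so the following is only a minimal sketch of the general idea: rank hardware counters by how strongly they correlate with both execution time and device utilization. The scoring rule (product of absolute Pearson correlations), the counter names, and all measurements are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant series has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0

def joint_score(counter, exec_time, utilization):
    """Hypothetical score: high only when a counter tracks both objectives."""
    return abs(pearson(counter, exec_time)) * abs(pearson(counter, utilization))

def rank_counters(samples, exec_time, utilization):
    """Return counter names sorted from highest to lowest joint score."""
    scores = {name: joint_score(vals, exec_time, utilization)
              for name, vals in samples.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Made-up measurements from five hypothetical runs (not data from the paper).
exec_time = [10.0, 9.0, 8.0, 7.0, 6.0]        # seconds
utilization = [40.0, 45.0, 50.0, 55.0, 60.0]  # percent
samples = {
    # Hypothetical counter names, chosen only for the example.
    "dram_stall_cycles": [5.0, 4.5, 4.0, 3.5, 3.0],  # moves with both objectives
    "l2_hit_rate": [0.5, 0.9, 0.4, 0.8, 0.6],        # largely unrelated
}
ranking = rank_counters(samples, exec_time, utilization)
print(ranking)
```

Using a product rather than a sum means a counter that correlates with only one objective scores low, which matches the stated goal of improving utilization and performance jointly; a real study would likely need more runs and a more robust statistic.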
Tanzima Z. Islam
Department of Computer Science, Texas State University, San Marcos, TX 78666
Aniruddha Marathe
Lawrence Livermore National Laboratory
Power-Aware HPC, Run-time systems, Cloud computing
Holland Schutte
Mohammad Zaeed