Data-Driven Analysis to Understand GPU Hardware Resource Usage of Optimizations

📅 2024-08-19

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address low per-GPU utilization and the resulting poor return on hardware investment in heterogeneous systems, this paper takes a data-driven approach to relating optimizations to GPU hardware resource usage. The method characterizes how hardware resource usage affects device utilization and execution time, defines a multi-objective metric that identifies the application-device interactions worth optimizing, and studies the resource usage behavior of optimizations on a benchmark application and several scientific proxy applications. Unlike prior work, which focuses primarily on performance gains, it examines how optimizations affect hardware resource usage and average GPU utilization. Applying the identified optimizations to a proxy application improves execution time, device utilization, and power consumption by up to 29.6%, 5.3%, and 26.5%, respectively.

📝 Abstract

With heterogeneous systems, the number of GPUs per chip increases to provide computational capabilities for solving science at a nanoscopic scale. However, low utilization for single GPUs defies the need to invest more money for expensive accelerators. While related work develops optimizations for improving application performance, none studies how these optimizations impact hardware resource usage or the average GPU utilization. This paper takes a data-driven analysis approach in addressing this gap by (1) characterizing how hardware resource usage affects device utilization, execution time, or both, (2) presenting a multi-objective metric to identify important application-device interactions that can be optimized to improve device utilization and application performance jointly, (3) studying hardware resource usage behaviors of several optimizations for a benchmark application, and finally (4) identifying optimization opportunities for several scientific proxy applications based on their hardware resource usage behaviors. Furthermore, we demonstrate the applicability of our methodology by applying the identified optimizations to a proxy application, which improves the execution time, device utilization, and power consumption by up to 29.6%, 5.3%, and 26.5%, respectively.

Problem

Research questions and friction points this paper is trying to address.

Analyzes GPU hardware resource usage impact on utilization and performance.

Develops a multi-objective metric to identify application-device interactions that can be optimized to improve utilization and performance jointly.

Identifies optimization opportunities for scientific applications based on resource usage.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-driven analysis of GPU hardware resource usage

Multi-objective metric for application-device optimization

Identifies optimization opportunities for scientific applications
