🤖 AI Summary
Characterizing the energy efficiency of exascale scientific applications remains challenging because of hardware heterogeneity and limited cross-platform benchmarking. Method: This study quantifies the GPU energy consumption of QMCPACK (a particle-based quantum Monte Carlo solver) and AMReX-Castro (a grid-based adaptive mesh astrophysics solver) on NVIDIA A100/H100 and AMD MI250X GPUs, using power, temperature, utilization, and energy telemetry queried through NVML and rocm_smi_lib, double- and mixed-precision (FP64/FP32) benchmarks, and application-specific energy metrics. Contribution/Results: It presents a cross-vendor, application-level energy comparison for exascale workloads. Key findings include mixed-precision GPU energy savings of 6–25% for QMCPACK and 45% for AMReX-Castro; gaps in AMD's monitoring tooling on Frontier GPUs that still need to be understood; and little variability in NVML measurements across query resolutions from 1 ms to 1 s. These results offer empirically grounded insight into energy vs. performance trade-offs for the codesign of hardware and software in the post-Moore era.
📝 Abstract
We characterize the GPU energy usage of two widely adopted exascale-ready applications that represent two classes of solvers, particle and mesh: (i) QMCPACK, a quantum Monte Carlo package, and (ii) AMReX-Castro, an adaptive mesh astrophysical code. We analyze power, temperature, utilization, and energy traces from double-precision and mixed (double/single)-precision benchmarks on NVIDIA's A100 and H100 and AMD's MI250X GPUs, using queries in NVML and rocm_smi_lib, respectively. We explore application-specific metrics to provide insights on energy vs. performance trade-offs. Our results suggest that mixed-precision energy savings range from 6% to 25% on QMCPACK and amount to 45% on AMReX-Castro. Gaps also remain in the AMD tooling on Frontier GPUs that need to be understood, whereas query resolutions on NVML show little variability between 1 ms and 1 s. Overall, application-level knowledge is crucial to define energy-cost/science-benefit opportunities for the codesign of future supercomputer architectures in the post-Moore era.
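As a rough illustration of how an energy figure and a mixed-precision savings percentage can be derived from a sampled power trace, the sketch below integrates power over time with the trapezoidal rule and compares two runs. It is a minimal sketch, not the authors' pipeline: the function names `energy_joules` and `relative_savings` are hypothetical, the traces are synthetic placeholders rather than measured data, and in practice the samples would come from a telemetry source such as NVML or rocm_smi_lib.

```python
from typing import Sequence


def energy_joules(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power samples (W) over timestamps (s) with the trapezoidal rule."""
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def relative_savings(baseline_j: float, candidate_j: float) -> float:
    """Fraction of energy saved by the candidate run relative to the baseline."""
    if baseline_j <= 0:
        raise ValueError("baseline energy must be positive")
    return (baseline_j - candidate_j) / baseline_j


# Synthetic traces (assumed values for illustration only, not measurements).
fp64_t = [0.0, 0.5, 1.0, 1.5, 2.0]
fp64_p = [100.0, 300.0, 300.0, 300.0, 100.0]
mixed_t = [0.0, 0.5, 1.0, 1.5]
mixed_p = [100.0, 300.0, 300.0, 100.0]

e_fp64 = energy_joules(fp64_t, fp64_p)
e_mixed = energy_joules(mixed_t, mixed_p)
savings = relative_savings(e_fp64, e_mixed)
print(f"FP64: {e_fp64:.1f} J, mixed: {e_mixed:.1f} J, savings: {savings:.0%}")
```

Because the integration uses the actual timestamps rather than a fixed step, the same helper can be applied to traces collected at different query intervals (for example 1 ms and 1 s) to check how much the resulting energy estimate changes with resolution.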